Cataloging Futures explored this resounding question last week: “What happens to cataloging when the majority of the library’s books (and other resources) already come with metadata?”
I recently had the opportunity to consider automatic metadata extraction and embedded metadata, and I feel compelled to voice the same skepticism about interoperability and consistency that I have about metadata creation in general. While a smart computer like IBM’s Watson could take some of the subjectivity out of applying metadata, is anyone ever going to agree on a consistent format, scheme and standards?
If the entire world of objects could be cataloged using only Dublin Core, for instance, automation would be perfectly fine. Coupled with XML and some extended controlled vocabulary standards, a system could be developed that would standardize the world. But has that ever happened, even in analog? No.
Someone, a person, still needs to do the thinking behind the automation of creation or extraction. Decisions will be required, and people will have to make them. And those people will still disagree and tinker and innovate and discuss metadata and cataloging.
Yesterday I spent a good chunk of time trying to figure out the best model for my new company’s floating libraries for eBook lending. In a sense, we are lucky as a for-profit corporation because we can explore cross-promotional partnerships and other such avenues in addition to traditional non-profit licensing and DRM situations.
However, I was still flabbergasted by some really devastating news brought to light by both Librarian by Day and Go To Hellman.
To me, two big things stand out in the OverDrive update:
- OverDrive will communicate a licensing change from a publisher that, while still operating under the one-copy/one-user model, will include a checkout limit for each eBook licensed.
- OverDrive wants library partners to cooperate to honor geographic and territorial rights for digital book lending, as well as to review and audit policies regarding an eBook borrower’s relationship to the library.
So, simply put, you can pay for a book that can only be checked out a certain number of times and only by patrons with a particular relationship to the library.
All of this stems from changes to HarperCollins policies, and the Hellman blog refers to this as the “Pretend It’s Print” model, which is an easy way to put it. Pretend your library owns a print copy of an item that is popular. Well, it gets worn out, doesn’t it?
Wear and tear prompts the library to purchase a new item to replace the worn copy. Thus, the library spends more money on a print copy anyway, and the necessity of replacement yields even more profit for the publisher. Paper is not eternal, after all.
Perhaps Watson the IBM wunder-computer can figure out the solution for us all?
What are you going to talk about this weekend? Here are five conversation starters:
- Who knew there were so many cool vintage bookmobiles?
- Want to act like a creative director? Simply apply design-thinking to solve library issues.
- Do you believe in serendipity when browsing or searching collections?
- Did you know you can tour some collections from the National Gallery of Art online as well as perform research on collection materials?
- What are some barriers to digital ingest?
Andromeda Yelton has covered one of my favorite topics on her blog. In Controlled Vocabulary vs. Tagging: Three Fallacies, she raises some valid and important points. For one thing, she readily admits that when information professionals debate the taxonomy/folksonomy issue, they are mostly concerned with how academic researchers will locate material.
Also, in coming up with hybrid solutions that incorporate control and freedom, librarians are often hampered by the pie-in-the-sky dream of a single search (probably a single Google-like search). This idea of unified perfection, that if things are arranged just so we will realize optimal, nirvana-like findability, just does not work in practical application with real-world systems and those pesky imperfect humans creating metadata and tags.
Finally, library and information professionals seem to like certainty. Andromeda addresses this idea of guarantees with this astute comment: “I think librarians often assume that, because they are based in rules, controlled vocabulary offers guarantees about metadata completeness or correctness, and tagging cannot.”
There are no guarantees in life, even with controlled vocabulary. In my humble opinion, we should stop fantasizing and start realizing the richness of combining the two methods: mining user-generated data while continuing to employ standards for a baseline of control. See more at InfoCamp Berkeley on March 5th in my session called You Say Tomato, I Say Aardvark: Taxonomy/Folksonomy Throwdown.
Don’t worry. The John and Mable Ringling Museum of Art online collections contain much more than just posters featuring terrifying demonic clowns. Although there are some fascinating artifacts of circus history, the Ringlings were collectors of a plethora of art treasures. In fact, the John and Mable Ringling Museum of Art has an internationally recognized collection of more than 10,000 objects, dating from ancient to contemporary times.
The metadata is minimal, but it effectively credits the artist and provides a date and some context. The collection site is well organized by category as well. And, if you happen to find yourself in Sarasota, Florida, you can visit in person to see it all.
I want to rave about another local library system – the King County Library System. KCLS has a really cool program right now called “Take Time to Read.” With clever incentives like Starbucks gift cards and other prizes, this is a reading program for adults to indulge in the luxury of reading. Do you like luxury? I do.
Also, big framed book covers are going to start popping up in King County for the springtime series of Book Cover Walking Tours. Pretty inventive.
KCLS has other nifty everyday luxuries as well like the Library Elf – a clever reminder system that tracks due dates and stuff.
Am I the only one who noticed the amazing parallels between Watson, the IBM computer on Jeopardy!, and the 1957 movie classic Desk Set?
In the film, Katharine Hepburn’s character, Bunny Watson (coincidence?), runs the reference department at a fictitious television broadcasting company that today would be called NBC. Spencer Tracy’s character, Richard Sumner, invents a giant computer called EMERAC that can answer reference questions. That’s right – over fifty years ago this monster machine with tape reels and punch cards allegedly had the ability to parse questions into answerable search queries.
Anyway, the other notable events in the film include Bunny Watson exclaiming that the computer gives her the feeling that “maybe, just maybe people are becoming a bit outmoded” as well as the happy ending where Sumner reveals that EMERAC “was never meant to replace you girls, merely to free up your time for more valuable research.”
While Watson performed amazingly well on Jeopardy!, there were still inhuman-style glitches. An optimist (or an IBM executive) might say that the point of this type of technology is to assist humans with memory-related tasks. A pessimist might say that reference librarians should be a little nervous.
Here are two interesting articles on the subject. One, from the NY Times, explains what Watson is capable of and in doing so describes what is essentially a reference interview. The other takes the optimist viewpoint (sort of) and imagines how a resource like Watson could assist overtaxed librarians.
And, if you haven’t seen Desk Set, you really should. It is a Play Now feature on Netflix.