
Thursday July 13, 2006
Future - and Old Books, part 4
Here is a case where networked digital books likely would have been of great assistance to an individual. My friend and colleague Keith Morgan sent me an article a few days ago called The Poet of Dialectics that analyzes Marx's Das Kapital as a work of literature rather than, as is usually the case, a piece of economic theory. See http://books.guardian.co.uk/review/story/0,,1814909,00.html
The article claims that
As a student Marx was infatuated by Tristram Shandy, and 30 years later
he found a subject which allowed him to mimic the loose and disjointed
style pioneered by Sterne. Like Tristram Shandy, Das Kapital is full of
paradoxes and hypotheses, abstruse explanations and whimsical
tomfoolery, fractured narratives and curious oddities. How else could
he do justice to the mysterious and often topsy-turvy logic of
capitalism?
As I've never read Das Kapital, I was very surprised to find that Marx had read and was quoting from and incorporating huge varieties of classical and other literary sources. Here's a sampling (the paragraphs are a bit out of order from the original article to make this flow better):
At university, Marx "adopted the habit of making extracts from all the
books I read" - a habit he never lost. A reading list from this period
shows the precocious scope of his intellectual explorations. While
writing a paper on the philosophy of law he made a detailed study of
Winckelmann's History of Art, started to teach himself English and
Italian, translated Tacitus's Germania and Aristotle's Rhetoric, read
Francis Bacon and "spent a good deal of time on Reimarus, to whose book
on the artistic instincts of animals I applied my mind with delight".
This is the same eclectic, omnivorous and often tangential style of
research which gave Das Kapital its extraordinary breadth of reference.
..."They are my slaves," he [Marx] would sometimes say, gesturing at the books on
his shelves, "and they must serve me as I will." The task of this
unpaid workforce was to provide raw materials which could be shaped for
his own purposes. "His conversation does not run in one groove, but is
as varied as are the volumes upon his library shelves," wrote an
interviewer from the Chicago Tribune who visited Marx in 1878. In 1976
SS Prawer wrote a 450-page book devoted to Marx's literary references.
The first volume of Das Kapital yielded quotations from the Bible,
Shakespeare, Goethe, Milton, Voltaire, Homer, Balzac, Dante, Schiller,
Sophocles, Plato, Thucydides, Xenophon, Defoe, Cervantes, Dryden,
Heine, Virgil, Juvenal, Horace, Thomas More, Samuel Butler - as well as
allusions to horror tales, English romantic novels, popular ballads,
songs and jingles, melodrama and farce, myths and proverbs.
...Like Frenhofer, Marx was a modernist avant la lettre. His famous
account of dislocation in the Communist Manifesto - "all that is solid
melts into air" - prefigures the hollow men and the unreal city
depicted by TS Eliot, or Yeats's "Things fall apart; the centre cannot
hold". By the time he wrote Das Kapital, he was pushing out beyond
conventional prose into radical literary collage - juxtaposing voices
and quotations from mythology and literature, from factory inspectors'
reports and fairy tales, in the manner of Ezra Pound's Cantos or
Eliot's The Waste Land. Das Kapital is as discordant as Schoenberg, as
nightmarish as Kafka.
...To prove that money is a radical leveller, Marx quotes a speech from
Timon of Athens on money as the "common whore of mankind", followed by
another from Sophocles's Antigone ("Money! Money's the curse of man,
none greater! / That's what wrecks cities, banishes men from home, /
Tempts and deludes the most well-meaning soul, / Pointing out the way
to infamy and shame . . ."). Economists with anachronistic models and
categories are likened to Don Quixote, who "paid the penalty for
wrongly imagining that knight-errantry was equally compatible with all
economic forms of society".
No wonder it took him 10 years or more to write Das Kapital. Imagine the copying and the prodigious memory needed to pull all of those varied sources together. Work like this could be made easier with full text searching, digital content, and indexing. It takes a rare mind to do anything meaningful with all that content. But if we expose as many sources as possible to as many people as possible, and give them at least the chance to read and think and make something new out of any or all of it, someone will do something that changes the world, in big ways or small. That seems to be a part of the golden dream of networked books. That's the part I fully believe in and hope to see happen.
Thanks Keith for passing that article on.
Posted by WARREN, SCOTT
| Jul 13 2006, 02:06:42 PM EDT

Wednesday July 12, 2006
Future books, part 3
Indulge me while I flog the
networked books horse some more. I'll warn you up front that this is a
relatively long post (were you expecting something short?). Jeff Jarvis of Buzzmachine a few months back had a lot to
say about the future social possibilities of the book, but was unnecessarily
critical of books as they now exist: http://www.buzzmachine.com/index.php/2006/05/19/the-book-is-dead-long-live-the-book/
Here's what he said:
The problems with books are many: They are frozen in time without the
means of being updated and corrected. They have no link to related
knowledge, debates, and sources. They create, at best, a one-way
relationship with a reader. They try to teach readers but don't teach
authors. They tend to be too damned long because they have to be long
enough to be books. As David Weinberger taught me, they limit how
knowledge can be found because they have to sit on a shelf under one
address; there's only way way to get to it. They are expensive to
produce. They depend on scarce shelf space. They depend on blockbuster
economics. They can't afford to serve the real mass of niches. They are
subject to gatekeepers' whims. They aren't searchable. They aren't
linkable. They have no metadata. They carry no conversation. They are
thrown out when there's no space for them anymore. Print is where words go to die.
Wow! No metadata huh? Aside from indices I guess. They carry no conversations? They have no link to related knowledge, debates, and sources? Aside from footnotes and credits I guess.
A good response to Jarvis came
from K. G. Schneider, whom I admire quite a lot:
"Print is where words go to
die": that depends on the genre. A textbook you might be pressured into writing
for your fall class? That could be short-lived, or even (like the first
technology title I penned) DOA. But "Pride and Prejudice" isn't dead, and it
fully participates in a long conversation, continuing all the way to "The Jane
Austen Book Club" and no doubt beyond.
It may well be that novels and
creative nonfiction move from dead trees to living bytes. Like many librarians,
I don't have a container fetish, so that's fine - maybe even better, what with
shelf space and old-growth forests and whatnot (though I do like writing in my
own books, and would expect an electronic book to be as easy to annotate).
Also, we in LibraryLand let David Weinberger think he invented this idea because
he's such a nice guy, but we *already know* how frustrating it is - and how
limiting - that a book can only be in one physical location at a time. (I manage
a digital library where infinite points of access are part of the satisfying
experience.)
I also anticipate that new media
will birth new genres, some more participatory and interactive than others. I
adore recipes on Epicurious because food preparation is a great example of a
running conversation, and I find my own cookbooks far too silent as a result.
But sometimes - as with the
storyteller around the fire, or the children's librarian with the hand puppets,
or a writer such as Jane Austen - we want the author to tell the tale. (Consider
how grimly awful most fanfic is.) Let each genre find its natural homes, as
future formats allow, and let new genres spring forth from the fertile fields
of human creativity. [emphasis mine].
I must say that I now find myself
mostly dismayed by Jarvis and Kelly. I want to keep an open mind to new
developments, but frankly their attitudes towards present day books and the
implications of dismissal that seem to be there for those of us who read books
turn me off so much. When Jarvis says things
like: "they create, at best, a one-way relationship with a reader," "They try
to teach readers but don't teach authors," and "They tend to be too damned long
because they have to be long enough to be books," I just wonder, what's he thinking? Here's what I thought about those three particular statements that Jarvis made.
So what exactly is wrong with a
one-way relationship with a reader? Last night I finished Virginia Woolf's The Years. What relationships exactly should I be expected to form here? This
wonderful novel is a nuanced and beautifully written exploration of what
constitutes memory and family life set in a particular time and milieu. I am
forming an ongoing relationship with it based on my own thoughts, past experiences,
and readings of other Woolf works. What is it that I am missing? In the
networked book future, I suppose I would be chopping out segments of the novel
and perhaps "doing" something with them.
But I already am, just not
digitally. And there's the crux for Kelly and Jarvis and company. If it isn't
done digitally, they seem to become really, really upset to the point of being
petulant. I'm thinking and having emotional reactions to what I'm reading, and
the online remixing, sharing, and annotating that are so highly touted as the
added value that will become normative parts of books of this envisioned future
seem to me to offer a paltry return compared to whatever thoughts and feelings
plain print books can already engender within me. I'd rather actually read or
view than annotate and compare lists. Maybe these bonus activities will be more
applicable to professional reading and reading for work where segments may
matter more than a whole continuous and contiguous work, but there's a lot of
reading going on for just pleasure too where a self-contained narrative is just
fine, thank you.
As for the second point about
teaching the author, Virginia Woolf is dead. What am I supposed to teach her?
If she were alive, would I need to interact with her to have a relationship
with her work? I've heard several authors talk and frankly often find my
interaction with them via the page to be far richer than whatever supposed
interactions people like Kevin Kelly insist that I must soon have.
Were I a fiction author, I'm not
at all sure I would want to be in constant dialog with readers who supposedly have
something to teach me ('Hey Dan Brown, listen up, you are a really lousy writer.'). Not
because I'm necessarily better or smarter than they are, but I simply don't
have the time or desire to be in constant revision, discussion, or explanation. Sometimes
works should just stand as they are. Updike says this pretty well in his
editorial and I think he's right on the mark there.
Experiments like McKenzie Wark's
GAM3R 7H30RY which are all about participation are fine (though frankly that
effort appears to me to be more like a pimped-out wiki with an open-post blog
attached to it than a book), but I doubt most authors want that much ongoing
revision. If they do, then networked books will be their medium. However,
participation and feedback already occur. Most books are edited, many books
have acknowledgments to friends and colleagues who parsed some part of the
draft. Just how wide and open does the circle of participation have to be? Maybe there's room for different standards?
As for his comment that books are
too long, that says more to me about Jeff Jarvis than it does about books. His
brave new world seems to reward those who do not want to (or can't) consume
longer works, but want to work with short segments and remix or tag or do
things to them other than just reading and thinking about them. That's ok, go
at it, but it seems bizarre to say that books are just too plain long, period.
You'll get me to take you somewhat seriously by not issuing blanket edicts like
that. I usually associate that attitude with reviews of classics in Amazon written
by self-righteously outraged high school students frustrated at having to read
anything longer and more sophisticated than the latest text message they've
received.
It's neat that there are people
who want to and like to remix and annotate and create new types of expression via
social interactions with text. Some incredible things will likely be done
someday and I applaud those who will bring their creativity to bear. It's just
that many of us, I'm guessing, are still content to read as we presently do. It works - really well.
Posted by WARREN, SCOTT
| Jul 12 2006, 02:16:05 PM EDT

Tuesday July 11, 2006
Future books, part 2.
I think what is happening with the networked books debate is
that Kelly and Jeff Jarvis and the other luminaries of the future book crowd
have heretofore mostly been preaching to the choir. Their exhortations and
descriptions of what could happen - too often unfortunately and naively phrased
as what will happen - have fallen on receptive ears and screens via
Wired and If: Book and venues like that. So when the conversation went beyond
those safe and already converted crowds to the population at large, things all
of a sudden got messy. Not everyone by a long shot agreed with the script being
written. And then to be very publicly rebuked and soundly dressed down by a figure
like John Updike, who carries far more cultural and intellectual capital among
a wider swath than Kelly or Jarvis can lay claim to, well, it probably stung like the
dickens.
Here are some quotes from discussions taking place on If:
book. Sperberg's comments are what led me to believe that they had never really
had any real dissension before.
For Updike and all those unable to cross into the
new Canaan of electronicity, the apotheosis of the artist fits into the
tradition of history as a history of heroes...
But it doesn't seem fruitful to talk about Updike's
writing or rank in the Top 100 Writers list. Instead, let me repeat that his
remarks clearly demonstrate a complete lack of shared values, language and
experience with those who are interested
in moving to the book we will all read in the future. [Emphasis mine - the future
is already worked out and decided upon by Roger Sperberg. 'We will all read...' Nice to know.]
To paraphrase something I wrote elsewhere the books in the Library of the Future will be [there's that will be again] more like Paul
Ford's Ftrain than like anything in Updike's
oeuvre. Everything he writes, however brilliant it is in comparison to
contemporary work, will appear to the future as flat and two-dimensional as all
the art before Giotto and Duccio. Updike doesn't know how to access those other
dimensions (me neither - but at least I'm aware of them) and he will always be
on the one side of a very clear demarcation in the history of writing.
Posted by: Roger Sperberg at June 7, 2006
06:17 PM
That's pretty strong determinist thinking there. It could be
true. But it isn't guaranteed by a long shot. If there's one thing I try to
avoid doing, it's predicting the future. We were all supposed to be taking
trips to the moon in our private rockets for vacations by now too and living in
those dreadful Modernist concrete monstrosities designed by Le Corbusier.
Now compare that to the following thoughtful piece, also
posted on If: Book. Eddie Tejeda, whoever he is, is clearly thinking about books. Sperberg by
comparison has an agenda that has been disrupted.
I really enjoyed Updike's essay. I don't think he
is either denying what is happening to the book (the "book" as we
know it) and I do not think he is on a crusade to try and save the book. I
think he is simply acknowledging the changes to the book and I think he has a
honest concern of what might lost in the transition of moving ideas to the web,
especially from someone who's life has been about books.
I don't think he is trying to hold back what
appears to be progress the way we share ideas. The benefits of the web are
enormous! and it's hard to imagine ever trying to revert it...
But, like Updike, who doesn't acknowledge what is
gained, I think it's important to also acknowledge what might be lost. I often
say that I read the news, facts and interesting ideas on the web all day and I
am rarely satisfied! Thats my life. That is what I do. I read stuff on the web.
Usually interesting stuff. But when I pick up one book, my life changes. Almost
every time! When I finish a (good) book it almost always has a profound effect
of me. I think about the ideas in the book a lot! And the thoughts never fade. Books change the way I think. The internet
fills me up with facts. [emphasis mine]
In the web I can read about the Ottoman Empires, I
find out who acted in what movie, and I can find out details on the collapse of
the Argentinean economy in seconds, and now I often say I have a hard time
imagining not having the internet to answer many of my questions. I joke: Before
the internet, what did people do when someone said an ambiguous or incorrect
statement? Unless you bothered going to the library every time someone said a
strange "fact", how would you know if it's true? Did you just accept
it? Who bothered doing "research"? That world now seems distant to
me.
But I wonder, as it appears Updike does, wether
that profound moment you have after reading book is lost. Will it be replaced
with technology? maybe... until then..I think it's fair to lament what might be
lost.
Posted by: Eddie A. Tejeda at June 27, 2006
08:49 PM
Thanks Eddie for helping me think a bit too. I think we may gain lots of good new things with networked books and lots of them we probably haven't yet anticipated, but it doesn't mean we have to, or want to, throw away or give up what's already good about books now. It doesn't have to be an absolutist one or the other kind of situation. That just doesn't make sense.
Posted by WARREN, SCOTT
| Jul 11 2006, 10:10:51 AM EDT

Monday July 10, 2006
Snippets
Text snippets are different from abstracts and summaries because they are algorithmically extracted from the source text, rather than editorially created to function as a summary or teaser. For example, compare the news headline treatments of Google News with The New York Times online. In its headline blurbs, Google News uses the beginning of the source news article, up to a prescribed number of words or characters, as the snippet. The New York Times blurb is hand authored, and functions as a traditional abstract. The Google News approach arguably employs the most common snippet heuristic, one also used in RSS feeds, blog comments, product reviews, etc. The presumption here, I think, is that the beginning of the text is the most useful part of the text to use in the snippet.
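To make the "beginning of the text" heuristic concrete, here is a minimal Python sketch. The function name lead_snippet and the default word limit are my own illustrative choices, not anything Google News documents.

```python
def lead_snippet(text, max_words=30):
    """Return the first max_words words of text as a teaser.

    Hypothetical helper: the name and the 30-word default are
    assumptions made for illustration only.
    """
    words = text.split()
    if len(words) <= max_words:
        # Short texts are returned whole, with whitespace normalized.
        return " ".join(words)
    # Longer texts are cut at the word limit and marked as truncated.
    return " ".join(words[:max_words]) + "..."


print(lead_snippet("one two three four five", max_words=3))
```

Cutting on word boundaries rather than characters avoids chopping a word in half, at the cost of a less predictable snippet length.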
Search engines often employ a different method for generating snippets in search results. Google results typically contain auto-generated snippets derived by extracting and combining sentence fragments from the indexed webpage that contain the keyword(s) searched for by the user. This turns out to be a useful method for generating teaser text because it literally puts the keyword in context. A similar method of generating snippets is used in Google Book Search.
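A rough sketch of the keyword-in-context idea, in Python. This is only my guess at the general shape of such a method, not a description of how Google actually builds its snippets; the function name, the window size, and the fragment limit are all assumptions.

```python
def keyword_snippet(text, keyword, window=5, max_fragments=2):
    """Join short fragments of text that surround each keyword hit.

    Illustrative only: real search engines use far more elaborate
    ranking and fragment selection than this.
    """
    words = text.split()
    target = keyword.lower()
    fragments = []
    last_end = 0  # index just past the previous fragment
    for i, word in enumerate(words):
        if i < last_end:
            continue  # this hit is already inside the previous fragment
        # Compare with surrounding punctuation stripped, case-insensitively.
        if word.strip(".,;:!?\"'()").lower() == target:
            start = max(i - window, last_end)  # never overlap the last fragment
            end = min(i + window + 1, len(words))
            fragments.append(" ".join(words[start:end]))
            last_end = end
            if len(fragments) == max_fragments:
                break
    return " ... ".join(fragments)


sample = "The library bought a tablet. Nobody expected the tablet to attract strangers."
print(keyword_snippet(sample, "tablet", window=2))
```

If the keyword never appears, the function returns an empty string, and a caller would presumably fall back to a beginning-of-text snippet.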
Recently I learned of The Final Word, a self-described media experiment, that presents New York Times headlines by conjoining the headline with the last paragraph of the Times article. In other words, the "punchline" is used as a teaser for the article. In some cases the last paragraph functions as a true summary. In other cases the last paragraph consists only of a pithy quote. It's unclear to me how useful this is for scanning headlines, but it does make me think that snippet generation is more of an art than a science.
I can imagine a variety of algorithms and heuristics for generating snippets that are more or less useful for specific audiences, or specific types of content. A simple example is competitive intelligence. Corporations have an interest in what their competitors are up to, and are especially interested in news where their own corporation is mentioned, even in cases when they are not the focus of the article. In this context it would be useful to summarize the article by conjoining sentences containing the company names (self and competitors), perhaps highlighting article headlines that contain both. For reviews I wonder if adjectives could play a useful role in snippet generation.
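The competitive intelligence idea could be sketched along these lines in Python. The company names are made up, the sentence splitter is deliberately naive, and matching is by plain substring, so treat this as a thought experiment rather than a working product.

```python
import re


def company_snippet(article, companies):
    """Keep only the sentences that mention any of the given companies."""
    # Naive splitter: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    names = [c.lower() for c in companies]
    hits = [s for s in sentences if any(n in s.lower() for n in names)]
    return " ".join(hits)


def headline_mentions_all(headline, companies):
    """True when every company appears in the headline (worth highlighting)."""
    lowered = headline.lower()
    return all(c.lower() in lowered for c in companies)


# "Acme" and "Globex" are hypothetical company names.
article = "Acme opened a plant. Rain fell all week. Analysts say Globex may respond."
print(company_snippet(article, ["Acme", "Globex"]))
print(headline_mentions_all("Acme sues Globex", ["Acme", "Globex"]))
```

Substring matching will misfire on names that are also ordinary words, so a real version would need proper entity recognition.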
It also seems to me that there is a big difference between creating summaries and creating teaser text.
Can you think of other methods for generating snippets? Are snippets evil?
Posted by Tito Sierra
| Jul 10 2006, 01:28:25 PM EDT
Future books
I've been reading a lot recently about books, what they are,
and what they can and perhaps might become.
There's a huge amount of editorial content ranging from the
possibilities that digitizing content a la Google might open up to the more arcane
experiments in form and definition of the book itself such as McKenzie Wark's GAM3R
7H30RY, http://www.futureofthebook.org/gamertheory/
(There's a nice summary at http://www.laweekly.com/art+books/books/writing-in-public/13910/).
Two recent editorials in the NYT garnered a lot of
attention. Kevin Kelly's essay, "Scan this Book" presented what seemed to me to
be a highly utopian view of digitized books where everything that was possible -
and hence in his view desirable - rested upon social networking. It seemed that
there was time for everything except perhaps actually reading content straight
through and reflecting on it. John
Updike's speech at the Book Expo, http://bookexpocast.com/?p=12,
was reprinted as "The End of Authorship" and offered a stinging and somewhat overly
vitriolic rebuke that focused mostly on the high literature end of books.
The best summary of the two I've seen comes from Ben Vershbow where he says,
I say it again, it's a shame that Kelly, the
uncritical commercialist, and Updike, the nostaligic elitist, have been the
ones framing the public debate. For most of us, Google is neither the eclipse
nor dawn of authorship, but just a single feature of a shifting landscape. Search
is merely a tool, a means: the books themselves are the end. Yet, neither
Google Book Search, which is simply an apparatus for extracting new profits off
of the transmission and search of books, nor the present-day publishing
industry, dominated as it is by mega-conglomerates with their penchant for
blockbusters (our culture haunted by vast legions of the out-of-print), serves
those ends very well. And yet these are the competing futures of the book:
lonely forts and sparkling clouds. Or so we're told.
Posted by ben vershbow on June 27,
2006 01:47 AM at http://www.futureofthebook.org/blog/archives/2006/06/the_least_interesting_conversa.html
If:Book is a good place to start reading if you are
interested in this sort of thing. It has some thoughtful takes on defining and
thinking about books. For example, http://www.futureofthebook.org/blog/archives/2006/06/what_is_a_book.html
See also, http://www.libraryjournal.com/article/CA6332156.html.
Posted by WARREN, SCOTT
| Jul 10 2006, 09:46:38 AM EDT

Thursday June 22, 2006
The Approachable Tablet PC
Last month, I started using an IBM Lenovo Tablet PC at work. Our Head of I.T., another Tablet owner, warned me that Tablets are like magnets. He said that people - strangers - would now stop me everywhere I went, and he was right. I am now stopped consistently in the airport, approached on the street, and interrupted in coffee shops by people interested in the Tablet or, it sometimes seems, just interested in chatting and using the Tablet as an ice-breaker. When I attended the Educause Southeast Regional Conference in Atlanta this week, I had a particularly interesting Tablet-related experience.
I took all my notes for the Conference on the Lenovo. I love the flexibility of stylus-based, handwriting recognition text entry. Now I can draw diagrams in my notes, write in the margins of electronic documents, basically doing anything I can do with pen and paper, and have a digital file as the end product.
So, I attended a large-group discussion on Tuesday and the session was generating lots of input from the participants. Suddenly, the facilitator stopped the conversation mid-stream and asked if someone would volunteer to take notes and capture all the ideas bouncing around.
I am not much of a note-taker, and I confess to looking down at my feet when this request was made. The room was silent for a long time and when I looked up, the entire room was looking at me and smiling. We all started laughing, and the facilitator said 'Would you mind?'
This situation was strange to me because so many other people in the room were taking notes on pen and paper or on their laptops. I didn't know a soul in the room, while many of the other attendees seemed to know each other well. I was participating vocally like everyone else. I could only think that the Tablet made me the unanimous note-taker choice.
I thought maybe people saw me scribbling with my stylus, so it was obvious that I was taking notes already. But, like I said, there were plenty of people writing on pads of paper. Then, I thought, maybe I was singled out because my notes were digital. But, what about all those laptop users in the room? Aren't typewritten notes going to be more legible anyway? Maybe participants assumed that the laptop users were really just websurfing, checking email, or doing other work? Do people just identify the tablet as a specialized 'note-taking' tool - more so than a legal pad and ballpoint pen?
This phenomenon is interesting to me because librarians often deal with 'approachability' issues at the reference or information desk. The nature of library work now requires some sort of computer at these service desks. However, patrons feel uncomfortable approaching librarians who are working on or sitting at a computer. Maybe librarians need to carry Tablets at these service points instead?
Posted by Joe Williams
| Jun 22 2006, 10:32:50 AM EDT

Friday June 16, 2006
Digital, flexible paper
Take a look at http://www.plasticlogic.com/lifeisflexible.php
Pretty neat stuff. What caught my attention are the designs. Good technology alone isn't the solution to e-books. You've got to have usable designs and the bright winners of this contest clearly have good ideas and have thought about the human being actually using the device and situations in which flexible digital paper could be useful.
Posted by WARREN, SCOTT
| Jun 16 2006, 02:41:58 PM EDT
She frequently had recourse to digital aid
Happy Bloomsday, everyone. In honor of the occasion, I'd like to refer you to a catechistic passage about Leopold Bloom and his wife Molly from Episode 17 ("Ithaca") of Joyce's Ulysses, available in full online at Project Gutenberg. Kudos to PG, as usual, for the "plain vanilla" text that serves so many purposes so well. There's also a version broken down by episode that was copied from PG, I think, by a resourceful and helpful professor at U Penn.
Which domestic problem as much as, if not more than, any other frequently engaged his mind?
What to do with our wives.
What had been his hypothetical singular solutions?
Parlour games (dominos, halma, tiddledywinks, spilikins, cup and ball, nap, spoil five, bezique, twentyfive, beggar my neighbour, draughts, chess or backgammon): embroidery, darning or knitting for the policeaided clothing society: musical duets, mandoline and guitar, piano and flute, guitar and piano: legal scrivenery or envelope addressing: biweekly visits to variety entertainments: commercial activity as pleasantly commanding and pleasingly obeyed mistress proprietress in a cool dairy shop or warm cigar divan: the clandestine satisfaction of erotic irritation in masculine brothels, state inspected and medically controlled: social visits, at regular infrequent prevented intervals and with regular frequent preventive superintendence, to and from female acquaintances of recognised respectability in the vicinity: courses of evening instruction specially designed to render liberal instruction agreeable.
What instances of deficient mental development in his wife inclined him in favour of the lastmentioned (ninth) solution?
In disoccupied moments she had more than once covered a sheet of paper with signs and hieroglyphics which she stated were Greek and Irish and Hebrew characters. She had interrogated constantly at varying intervals as to the correct method of writing the capital initial of the name of a city in Canada, Quebec. She understood little of political complications, internal, or balance of power, external. In calculating the addenda of bills she frequently had recourse to digital aid. After completion of laconic epistolary compositions she abandoned the implement of calligraphy in the encaustic pigment, exposed to the corrosive action of copperas, green vitriol and nutgall. Unusual polysyllables of foreign origin she interpreted phonetically or by false analogy or by both: metempsychosis (met him pike hoses), ALIAS (a mendacious person mentioned in sacred scripture).
What compensated in the false balance of her intelligence for these and such deficiencies of judgment regarding persons, places and things?
The false apparent parallelism of all perpendicular arms of all balances, proved true by construction. The counterbalance of her proficiency of judgment regarding one person, proved true by experiment.
Posted by Amanda French
| Jun 16 2006, 12:46:50 PM EDT
More on the nora project
Yesterday, I was fortunate enough to attend a live demonstration of the nora project at JCDL 2006. As you may recall from Amanda's previous post, the nora project aims to develop tools for detecting patterns in humanities collections. The demo I attended was part of a presentation provocatively titled "Exploring Erotics in Emily Dickinson's Correspondence with Text Mining and Visual Interfaces". The data source for this particular demo was a collection of about 300 letters written by Emily Dickinson to her sister-in-law. The tool is being used to help scholars in the interpretation of literary work. You can actually launch the demo application (click on "Nora Visualization") from the nora project website. Below is a screenshot I took this morning from the downloadable demo tool (with bogus ratings inserted by me for illustration purposes).

In a nutshell, the tool allows you to browse a collection of Emily Dickinson poems, and rate each poem on a 1-5 scale according to some predefined criterion. The criterion in this case was the erotic nature of the poem (red is "hot", black is "not hot"). The user ratings provide a baseline for the text mining algorithm to do its work of classifying the remaining poems as "hot" or "not hot" using a Naive Bayes algorithm. The predicted "hot" poems are marked in purple. The tool highlights words within the collection that were algorithmically associated with "hotness" and "non hotness", and provides scatterplots for detecting patterns over time.
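For readers unfamiliar with Naive Bayes, here is a toy Python sketch of the general technique: a multinomial model over words with add-one smoothing. It is not the nora project's code, I do not know which variant nora uses, and the training texts below are invented strings rather than Dickinson's words.

```python
import math
from collections import Counter


def train(labeled_docs):
    """Count words per label from (text, label) pairs rated by a user."""
    word_counts = {}       # label -> Counter of word frequencies
    doc_counts = Counter() # label -> number of rated documents
    vocab = set()
    for text, label in labeled_docs:
        words = text.lower().split()
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the highest log-probability for text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(doc_counts):
        # Prior: share of rated documents carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing so unseen pairs never give log(0).
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented example texts standing in for user-rated documents.
rated = [
    ("sweet lips and wild longing", "hot"),
    ("thy breath upon my cheek", "hot"),
    ("please send the garden seeds", "not hot"),
    ("the weather is cold today", "not hot"),
]
model = train(rated)
print(classify(model, "wild sweet lips"))
print(classify(model, "cold garden"))
```

The per-word probabilities computed inside classify are also what would let a tool highlight which words push a text toward "hot" or "not hot".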
The great thing about this tool is that it supports open-ended interpretive analysis, not bound to a specific collection or topic dimension. In the future I expect to see text mining tools like this embedded as services in a variety of digital libraries and repositories.
Posted by Tito Sierra
| Jun 16 2006, 12:00:38 PM EDT

Thursday June 15, 2006
Librarians and Search Industry
Librarians are sometimes cast as being separate or somehow removed from the IT and Search & Retrieval fields. So, when I was reading this month's Search Engine Report, I was happy to see that Search Engine Watch hires - and really seems to value - librarians ("you know, those human search engines that have helped people for thousands of years").
The Report references one librarian's search-related blog. For a large, international list of other library-related blogs, check out http://www.libdex.com/weblogs.html.
Posted by Joe Williams
| Jun 15 2006, 10:35:30 AM EDT
Big
Here is a search engine that takes simple search to a whole new level.
Posted by Tito Sierra
| Jun 15 2006, 09:28:07 AM EDT

Wednesday June 14, 2006
Digital time capsules: Zittrain at JCDL2006
Yesterday I attended a fascinating presentation by Jonathan Zittrain at the JCDL 2006 conference. His topic was "Open Information: Redaction, Restriction, and Removal". One problem he posed is how we should deal with retracted or edited information published in the open information environment. Sometimes information is published that is controversial (e.g. the Danish newspaper cartoons controversy) or incredibly sensitive (e.g. scholarly articles on how to contaminate the milk supply). Sometimes there is a compelling public interest in redacting or retracting this information because of its sensitive nature at the present time. For content published in digital form, edits can occur silently, and digital archives can be purged from databases and filesystems. But there can also be a compelling long-term interest in preserving controversial content for historical and cultural research.
How do we deal with these competing interests to censor and archive sensitive materials? One idea Zittrain raised is that of an archive encryption key. Rather than destroy censored materials from the public information space, one could encrypt them so that they can only be decrypted at some point in the future. This would function as a digital time capsule, allowing scholarly access to sensitive materials at a later, presumably less sensitive, date.
The idea is not without its problems (how to encrypt on a time basis? how long to encrypt?), but is interesting nonetheless.
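One known approach to the "how to encrypt on a time basis?" question is the time-lock puzzle described by Rivest, Shamir and Wagner: the archivist hides a key behind a computation that needs a chosen number of sequential squarings to undo. The sketch below is only an illustration of that idea in Python, not something the talk proposed as far as this post records. The function names, the toy primes, and the choice to mask the secret by modular addition are assumptions; a real system would need large secret random primes and careful calibration of `t` against expected hardware speeds.

```python
# Two well-known Mersenne primes, used here only as toy values.
TOY_P = 2**31 - 1
TOY_Q = 2**61 - 1

def make_puzzle(secret, p, q, t, a=3):
    """Lock `secret` (an integer smaller than p*q) behind t squarings.

    The creator knows p and q, so it can take a shortcut:
    a^(2^t) mod n equals a^(2^t mod phi(n)) mod n when gcd(a, n) == 1.
    """
    n = p * q
    phi = (p - 1) * (q - 1)
    shortcut_exponent = pow(2, t, phi)
    mask = pow(a, shortcut_exponent, n)
    return n, a, t, (secret + mask) % n

def solve_puzzle(n, a, t, masked):
    """Recover the secret without p and q: t squarings, one after another."""
    value = a % n
    for _ in range(t):
        value = (value * value) % n
    return (masked - value) % n

# Hypothetical archive key, locked behind 100,000 squarings.
n, a, t, masked = make_puzzle(123456789, TOY_P, TOY_Q, t=100_000)
print(solve_puzzle(n, a, t, masked))
```

The "how long to encrypt?" question shows up here as the choice of `t`: it sets an amount of work, not a calendar date, so the unlock time is only as predictable as future computing speed.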
Posted by Tito Sierra
| Jun 14 2006, 11:03:30 AM EDT

Monday June 12, 2006
More on Authority & Wikipedia
The topic is not new, but I appreciated this Jaron Lanier article and this editorial by Robert McHenry, which both criticize Wikipedia in terms of editorial authority and voice. Lanier lashes out at the "hive mentality" that he says drives Wikipedia and meta-aggregator websites. Ironically, I found Lanier's article through the Arts & Letters Daily aggregator site - a publication of the Chronicle of Higher Education.
I agree with much of what Lanier says in terms of Wikipedia authority and bias concerns, but I don't agree that multitudes of people are consciously flocking to Wikipedia because they trust and seek out hive-generated information sources. I think Wikipedia's traffic is mostly just an acknowledgement that there is too much information out there and people are trying to simplify their search process. The same with Google. It is much easier to have one place to search for things, one place to look up quick-answer questions. And as I surfed around Wikipedia, I also had to wonder how many of the entries were created entirely through Google searches...
The 'simplification of searching' is one service issue that reference and instruction librarians see daily. From a patron's perspective, why should they have to search in X database for articles on religion and Y database for articles on engineering or psychology books? The trade-off many patrons make, of course, is to use a much simpler search tool like Google and accept (often fewer) results with questionable authority. That is very different from choosing Wikipedia because of its community-authored nature.
[Disclaimer: I do have a library bias, but NCSU's new online catalog by Endeca and Google Scholar searching services take giant steps toward simplifying the research process for our patrons.]
Posted by Joe Williams
| Jun 12 2006, 06:34:03 PM EDT

Thursday June 08, 2006
ngc4lib and the localness of catalogs
A few days ago Eric Morgan spun off a new list, ngc4lib, from web4lib. The abbreviation
stands for Next Generation Catalogs for Libraries, and the list focuses on what a catalog is. Subscribe at
LISTSERV-AT-LISTSERV.ND-DOT-EDU. Already some robust discussion has ensued about
defining a catalog and whether such a category as a primary user exists. Some
of the members of the Horseless Library (Tito and myself) have subscribed. I
should add that Eric Morgan once worked here at NCSU, though before my
time, so I've never met him.
Some of the questions being debated are whether a primary user exists for a
given catalog and what makes a catalog unique from other search tools. What is
getting lost in the discussion a bit is the word local. Much is being made of
comparisons to Amazon and other completely public and open tools with some
commentators stating that there is no such thing as a primary user. I disagree.
The easy thing is to state that there are primary local user communities
attached to libraries, be they faculty, staff, and students in an academic
setting like ours or the residents of a given town for a public library.
If user groups outside of those primary groups benefit from a catalog, that is
nice, but it is certainly not essential. What I think commentators who say
there is no primary user are really arguing for is a type of interface design
that promotes ease of widespread usage - no one needs to be taught how to use
Amazon or shop Wal-Mart.
However, our situation in academic settings is a bit more complex. I argued
yesterday that a catalog can be defined not just by its searching ability, but
by an economic dimension as well. Catalogs delineate what is locally owned or
rented or paid for and hence what some sort of privileged user group actually
has access to once the discovery part of a catalog is done. It's important to
realize - and I forgot to say this on the list - that having access does not
fully equate to immediate availability. Nonetheless, I tend to think of
a catalog as proof of ownership, which equals some measure of access rights
(without payment) and provides services too, as compared to a tool like Amazon,
which simply provides proof of publication while also providing some services.
Things may get messier still as catalogs begin incorporating functionality that
Amazon and other Web 2.0 enterprises already embody. In earlier posts, we
discussed reviews and Tito said
"Relating this to an earlier discussion on the ordering of reviews, if our local library catalog included both NCSU submitted reviews, and reviews from a shared pool of user contributed content from other universities, would it make sense to bias the display of the NCSU submitted reviews over the shared reviews? One can imagine a system that would gracefully degrade from local to global display. Such a bias would be easy to build in, but would it be desirable?
This idea has potential relevance for other types of user contributed content such as book lists, tags, annotations, etc. How important is the local institution in these contexts?"
So I have two questions: Just how local should a catalog be?
And just how important is the localness of a catalog as opposed to the
universality of an Amazon or WorldCat?
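To make the "gracefully degrade from local to global display" idea in Tito's quote a bit more concrete, here is a small Python sketch of one way such a bias could be built in. It is only an illustration: the `Review` fields, the `helpful_votes` ranking signal, and the display `limit` are hypothetical, not features of any existing catalog.

```python
from dataclasses import dataclass

@dataclass
class Review:
    text: str
    institution: str    # hypothetical: where the reviewer is affiliated
    helpful_votes: int  # hypothetical ranking signal

def order_reviews(reviews, local_institution="NCSU", limit=5):
    """Show local reviews first, then fill remaining slots from the shared pool.

    The sort key puts local reviews ahead of shared ones (False sorts
    before True) and orders each group by descending helpful votes.
    """
    ranked = sorted(
        reviews,
        key=lambda r: (r.institution != local_institution, -r.helpful_votes),
    )
    return ranked[:limit]

reviews = [
    Review("Thorough and readable.", "Duke", 10),
    Review("Useful for ENG 101.", "NCSU", 2),
    Review("Dated but solid.", "UNC", 7),
    Review("On reserve at our library.", "NCSU", 5),
]
for review in order_reviews(reviews, limit=3):
    print(review.institution, review.helpful_votes, review.text)
```

With plenty of local reviews the display is entirely local. With none, it falls back to the shared pool ordered by votes. Whether that bias is desirable is exactly the question above; the sketch only shows that it is cheap to implement.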
Posted by WARREN, SCOTT
| Jun 08 2006, 05:30:57 PM EDT

Friday June 02, 2006
Microsoft research in search awards announced
You may have previously heard about the "Accelerating Search in Academic Research Awards" offered by Microsoft to further research in the search field. Well, here are the first 12 winners of these awards.
Posted by Tito Sierra
| Jun 02 2006, 04:45:10 PM EDT

Horseless Library image by Herman Berkhoff