Keeping a copy to wikisource-l. Yann
-------- Original Message --------
Subject: Re: [Foundation-l] Open Library, Wikisource, and cleaning and translating OCR of Classics
Date: Thu, 13 Aug 2009 01:48:37 -0400
DGG, I appreciate your points. Would we be so motivated by this thread if it weren't a complex problem?
The fact that all of this is quite new, and that there are so many unknowns and gray areas, actually makes me consider it more likely that a body of wikimedians, experienced with their own form of large-scale authority file coordination, are in a position to say something meaningful about how to achieve something similar for tens of millions of metadata records.
> OL rather than Wikimedia has the advantage that more of the people there understand the problems.
In some areas that is certainly so. In others, Wikimedia communities have useful recent experience. I hope that those who understand these problems on both sides recognize the importance of sharing what they know openly -- and showing others how to understand them as well. We will not succeed as a global community if we say that this class of problems can only be solved by the limited group of people with an MLS and a few years of focused training. (How would you name the sort of training you mean here, btw?)
SJ
On Thu, Aug 13, 2009 at 12:57 AM, David Goodman <dgoodmanny@gmail.com> wrote:
Yann & Sam
The problem is extraordinarily complex. A database of all "books" (and other media) ever published is beyond the joint capabilities of everyone interested. There are intermediate entities between "books" and "works", and important subordinate entities, such as "article", "chapter", and those like "poem" which could be at any of several levels. This is not a job for amateurs, unless they are prepared to first learn the actual standards of bibliographic description for different types of material, and to at least recognize the inter-relationships, and the many undefined areas. At research libraries, one allows a few years of training for a newcomer with just an MLS degree to work with a small subset of this. I have thirty years of experience in related areas of librarianship, and I know only enough to be aware of the problems. For an introduction to the current state of this, see http://www.rdaonline.org/constituencyreview/Phase1Chp17_11_2_08.pdf.
Merging the many thousands of partially correct and incorrect sources of available data is difficult: it typically requires the manual resolution of each of the tens of millions of instances.
OL rather than Wikimedia has the advantage that more of the people there understand the problems.
David Goodman, Ph.D., M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG