Delirium wrote:
Lars Aronsson wrote:
So I found and bought "The New Student's Reference Work", a little encyclopedia in five volumes, published in Chicago in 1914. As it was published before 1923, it is now in the public domain. Since this non-Scandinavian work doesn't fit in Project Runeberg, I put it in Wikisource. First I scanned images (300 dpi JPEG) of all 2791 pages and uploaded them to Wikimedia Commons, where you will find them in the category http://commons.wikimedia.org/wiki/Category:LA2-NSRW
Then, for each book page, I created a wiki page on Wikisource displaying the scanned image and containing the raw OCR text. If you want to help in proofreading, use two separate browser windows to open the enlarged image and edit the wiki text.
I don't mean this to sound particularly harsh, but I'm wondering why we're doing this on Wikisource, when Distributed Proofreaders for Project Gutenberg already has a well-debugged workflow for taking texts from images to OCR to proofread to a final version. Is there an advantage to starting our own project that does the same thing they already do pretty well?
I view this event as what Wikisource is all about. Project Runeberg has been doing this sort of thing all along: a stable JPEG image of the original page to ensure historical accuracy, and an editable OCR-initiated product that can easily be wikified, footnoted or translated as circumstances require. This goes well beyond the capabilities of Project Gutenberg.
Ec