[Foundation-l] Open Library, Wikisource, and cleaning and translating OCR of Classics

Samuel Klein meta.sj at gmail.com
Tue Aug 11 05:00:14 UTC 2009


Lars,

I think we agree on what needs to happen.  The only thing I am not
sure of is where you would like to see the work take place.   I have
raised versions of this issue with the Open Library list, which I copy
again here (along with the people I know who work on that fine project
- hello, Peter and Rebecca).  This is why I listed it below as a good
group to collaborate with.

However, the project I have in mind for OCR cleaning and translation needs to
 - accept public comments and annotation about the substance or use of
a work (the wiki covering their millions of metadata entries is very
low traffic and used mainly to address metadata issues in their
records)
 - handle OCR as editable content, or translations of same
 - provide a universal ID for a work, with which comments and
translations can be associated (see
https://blueprints.launchpad.net/openlibrary/+spec/global-work-ids)
 - handle citations, with the possibility of developing something like WikiCite
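
As a rough illustration only, the four requirements above could be
modelled along these lines.  This is a minimal sketch in Python, not a
description of how Open Library or Wikisource store anything: the names
Work and Catalog, the field names, and the "work:1" style of ID are all
assumptions made up for the example.

```python
# Hypothetical sketch: none of these names come from Open Library,
# Wikisource, or the global-work-ids blueprint.
from dataclasses import dataclass, field


@dataclass
class Work:
    work_id: str                  # universal ID for the work (format assumed)
    title: str
    ocr_text: str = ""            # OCR kept as editable content
    comments: list = field(default_factory=list)      # (author, text) pairs
    translations: dict = field(default_factory=dict)  # language code -> text
    citations: list = field(default_factory=list)     # IDs of works cited


class Catalog:
    """In-memory store in which everything is keyed by work ID."""

    def __init__(self):
        self._works = {}

    def add(self, work):
        if work.work_id in self._works:
            raise ValueError(f"duplicate work ID: {work.work_id}")
        self._works[work.work_id] = work

    def get(self, work_id):
        return self._works[work_id]

    def comment(self, work_id, author, text):
        self.get(work_id).comments.append((author, text))

    def translate(self, work_id, lang, text):
        self.get(work_id).translations[lang] = text

    def cite(self, citing_id, cited_id):
        self.get(cited_id)  # raises KeyError if the cited work is unknown
        self.get(citing_id).citations.append(cited_id)
```

The only design point the sketch is meant to show is that comments,
translations, and citations all hang off the same work ID, so they stay
attached to the work wherever its scans or OCR end up being hosted.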

Let's take a practical example.  A classics professor I know (Greg
Crane, copied here) has scans of primary source materials, some with
approximate or hand-polished OCR, waiting to be uploaded and converted
into a useful online resource for editors, translators, and
classicists around the world.

Where should he and his students post that material?

Wherever they end up, the primary article about each work would
surely link out to the OL and WS pages for that work (where they
exist).


> (Plus you would have to motivate why a copy of OpenLibrary should
> go into the English Wikisource and not the German or French one.)

I don't understand what you mean -- English source materials and
metadata go on en:ws, German on de:ws, &c.  How is this different from
what happens today?

SJ


On Mon, Aug 3, 2009 at 1:18 PM, Lars Aronsson <lars at aronsson.se> wrote:
> Samuel Klein wrote (in two messages):
>
>> >> *A wiki for book metadata, with an entry for every published
>> >> work, statistics about its use and siblings, and discussion
>> >> about its usefulness as a citation (a collaboration with
>> >> OpenLibrary, merging WikiCite ideas)
>
>> I could see this happening on Wikisource.
>
> Why could you not see this happening within the existing
> OpenLibrary? Is there anything wrong with that project? It sounds
> to me as if you would just copy (fork) all their book data, but for
> what gain?
>
> (Plus you would have to motivate why a copy of OpenLibrary should
> go into the English Wikisource and not the German or French one.)
>
>
> --
>  Lars Aronsson (lars at aronsson.se)
>  Aronsson Datateknik - http://aronsson.se
>


