[Foundation-l] [ol-discuss] Open Library, Wikisource, and cleaning and translating OCR of Classics

John Vandenberg jayvdb at gmail.com
Tue Aug 11 10:36:15 UTC 2009


On Tue, Aug 11, 2009 at 6:21 PM, Magnus Manske
<magnusmanske at googlemail.com> wrote:
> I see CC-NC...
>
> http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3atext%3a2003.02.0004
>
> Too bad.

Well, they can't copyright what is already in the public domain.

There is little about the XML in TEI format that can be called
"creative", and any non-factual markup can be easily stripped out.

I remember now ... it was in March/April 2008 that I was looking at
this, for the Pindar odes. A DjVu with page scans is on archive.org.

http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.04.0101
http://www.archive.org/details/olympianpythiano00pinduoft

The Perseus e-text doesn't appear to include the 125 pages which have
the complete Greek texts.
(btw, here is our unverified original source:
http://el.wikisource.org/wiki/Ολυμπιόνικοι )

However, the commentary is all there, with pagination in the TEI, so it
is easy to marry the text with the images.

(warning: 850 KB XML file, followed by a medium-resolution image)
http://www.perseus.tufts.edu/hopper/xmlchunk?doc=Perseus%3Atext%3A1999.04.0101%3Atext%3Dcomm%3Abook%3DO.
http://www.archive.org/stream/olympianpythiano00pinduoft#page/124/mode/2up
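
To illustrate what "marrying" the text with the images could look like,
here is a minimal Python sketch. It assumes the TEI marks page breaks
with empty <pb n="..."/> milestone elements, and that the number in the
archive.org "#page/N" URL matches that n value; I have not verified
either against the actual Perseus file. The names text_by_page and
scan_url are made up for this example.

```python
import xml.etree.ElementTree as ET

# URL pattern copied from the archive.org link above; whether the page
# number in it equals the TEI <pb n="..."/> value is an assumption.
SCAN_URL = ("http://www.archive.org/stream/"
            "olympianpythiano00pinduoft#page/{page}/mode/2up")


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://...}pb' -> 'pb'."""
    return tag.rsplit("}", 1)[-1]


def text_by_page(tei_xml):
    """Group text under the most recent <pb n="..."/> that precedes it."""
    pages = {}
    current = None  # label of the page we are currently on

    def add(text):
        # Text before the first page break has no page and is skipped.
        if current is not None and text and text.strip():
            pages.setdefault(current, []).append(text.strip())

    def walk(elem):
        nonlocal current
        if local_name(elem.tag) == "pb":
            current = elem.get("n")
            pages.setdefault(current, [])
        else:
            add(elem.text)
        for child in elem:
            walk(child)
            add(child.tail)  # text that follows the child inside elem

    walk(ET.fromstring(tei_xml))
    return {n: " ".join(chunks) for n, chunks in pages.items()}


def scan_url(page):
    """Build the page-scan URL for one page label."""
    return SCAN_URL.format(page=page)


# Tiny made-up sample, not taken from the Perseus file.
sample = (
    '<text><body><pb n="124"/><p>First note.</p>'
    '<p>Second <pb n="125"/>note continues.</p></body></text>'
)

for page, text in text_by_page(sample).items():
    print(page, scan_url(page), text)
```

Each page label then carries both its transcribed text and a link to the
scan, which is roughly what a side-by-side proofreading page needs.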

--
John Vandenberg
