[Foundation-l] Wikisource and reCAPTCHA

Samuel J Klein sj at wikimedia.org
Wed Jun 30 10:42:27 UTC 2010

On Wed, Jun 30, 2010 at 6:13 AM, John Vandenberg <jayvdb at gmail.com> wrote:
> irrespective of whether it is verified, OCR
> quality, or if it is vandalism.  However, wikisource keeps the images
> and the text unified from day 0 to eternity.

Some works become verified, and reach high OCR quality.

> PGDP has a very strict and arduous workflow...  The
> result is quality, however only the text is sent downstream.

Why not send images and text downstream?

> Wikisource and PGDP don't interoperate.  We *could*, but when I looked
> at importing a PGDP project into Wikisource, I put it in the too hard basket.

That's what I mean by 'coordinate'.  "hard" here seems like a one-time
hardship followed by a permanent useful coordination.

> Wikisource is trying to become a credible competitor to PGDP.

Perhaps we have competing interfaces / workflows, but I expect we
would be glad to share 99.99%-verified high-quality
texts-unified-with-images if it were easy for both projects to
identify that combination of quality and comprehensive data... and
would be glad to share metadata so that a WS editor could quickly
check to see if there's a PGDP effort covering an edition of the text
she is proofing; and vice-versa.

I want us to get better, faster, less held up by the idea of
coordinating with other projects, because there are much larger
projects out there worthy of coordinating with.  The annotators who
work on the Perseus Project come to mind... but that's truly a harder
problem than this one.

> If the Wikisource project succeeds in
> demonstrating the wiki way is a viable approach, the result is
> different people choosing to work in different workflows/projects, and
> more reliable etexts being produced.


