Re: [Foundation-l] Wikisource and reCAPTCHA

30 Jun 2010

On Wed, Jun 30, 2010 at 12:42 PM, Samuel J Klein <sj(a)wikimedia.org> wrote:

> ...
> > PGDP has a very strict and arduous workflow... The result is
> > quality, however only the text is sent downstream.
>
> Why not send images and text downstream?

Because PGDP produces for Project Gutenberg, which publishes text and
html versions, not scans.

> ...
> Perhaps we have competing interfaces / workflows, but I expect we
> would be glad to share 99.99%-verified high-quality
> texts-unified-with-images if it were easy for both projects to
> identify that combination of quality and comprehensive data... and
> would be glad to share metadata so that a WS editor could quickly
> check to see if there's a PGDP effort covering an edition of the text
> she is proofing; and vice-versa.

For the PGDP side, it's possible to check at PGDP itself (one needs a
login for that, but it's as free and unencumbered as one on
Wikimedia). There is also a useful superset at
http://www.dprice48.freeserve.co.uk/GutIP.html (warning: this is a
7 megabyte HTML file). It lists, sorted by author (books by more than
one author appear multiple times), all books that have a clearance
for Project Gutenberg.

For cooperation, one idea could be to get the PGDP material either
after the P3 stage or after the F2 stage. As long as a project is
still active, it isn't hard at all to get both the text and the scan
pages.

-- 
André Engels, andreengels(a)gmail.com
