[Foundation-l] Wikisource and reCAPTCHA

Andre Engels andreengels at gmail.com
Wed Jun 30 12:18:13 UTC 2010

On Wed, Jun 30, 2010 at 1:24 PM, John Vandenberg <jayvdb at gmail.com> wrote:

> Good question! ;-)
> Storage is one issue.
> It would be interesting to estimate the storage requirements of
> Wikisource if we had produced the PGDP etexts.

I think it is the main reason; however, a back-of-the-envelope
calculation (20,000 books, 300 pages per book, 100k per page; the first
is quite a good estimate, the other two could be a factor 2 off) tells
me that the total storage requirements would be measured in hundreds of
gigabytes - which means that one or two state-of-the-art hard disks
should be enough to contain it.
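
For anyone who wants to check the arithmetic, here is a rough sketch of
that calculation. It assumes "100k" means 100 kB and uses decimal
gigabytes (10**9 bytes); the function and variable names are just for
illustration, and the low/high cases apply the "factor 2 off" to both
the page count and the page size at once.

```python
# Back-of-the-envelope storage estimate (assumed figures, decimal units).
BOOKS = 20_000            # number of books (the fairly reliable figure)
PAGES_PER_BOOK = 300      # could be off by a factor of 2
BYTES_PER_PAGE = 100_000  # "100k per page", read here as 100 kB


def total_gb(books, pages_per_book, bytes_per_page):
    """Return total storage in gigabytes, with 1 GB = 10**9 bytes."""
    return books * pages_per_book * bytes_per_page / 10**9


# Central estimate, then both uncertain inputs halved and doubled.
central = total_gb(BOOKS, PAGES_PER_BOOK, BYTES_PER_PAGE)
low = total_gb(BOOKS, PAGES_PER_BOOK / 2, BYTES_PER_PAGE / 2)
high = total_gb(BOOKS, PAGES_PER_BOOK * 2, BYTES_PER_PAGE * 2)

print(f"central: {central:.0f} GB, low: {low:.0f} GB, high: {high:.0f} GB")
```

With these assumptions the central figure comes out at 600 GB, and the
range runs from 150 GB to 2400 GB if both uncertain inputs are off in
the same direction.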

> They don't have an 'export' function, and I doubt they are going to
> build one so that they can interoperate with us.
> My 'import' function was a scraper; not something that can be used in
> a large scale without their permission.

On the other hand, if you _do_ get permission, there might well be a
more elegant ftp-based method.

> The wikisource workflow is a *symptom* of it being a "wiki", with all
> that entails.  There is a lot more than merely the workflow which
> distinguishes the two projects.

Certainly. I think the deeper-lying difference is one of attitude,
which, as you write, is for WS a symptom of being a wiki. As a wiki, WS
uses such attitudes/principles as "make it easy for people to
contribute", "publish early, publish often", "let people do what they
want, as long as it's a step, however small, forward". PGDP, on the
other hand, derives its attitudes/principles from a wish to create
high-quality end products. As such it uses "check and doublecheck",
"limit the number of projects we work on", "quality control" and "division of

André Engels, andreengels at gmail.com
