[Foundation-l] Wikisource and reCAPTCHA

Andre Engels andreengels at gmail.com
Wed Jun 30 12:18:13 UTC 2010


On Wed, Jun 30, 2010 at 1:24 PM, John Vandenberg <jayvdb at gmail.com> wrote:

> Good question! ;-)
> Storage is one issue.
> It would be interesting to estimate the storage requirements of
> Wikisource if we had produced the PGDP etexts.

I think it is the main reason; however, a back-of-the-envelope
calculation (20,000 books, 300 pages per book, 100k per page; the
first is quite a good estimate, the other two could be off by a factor
of 2) tells me that the total storage requirement would be in the
hundreds of gigabytes (about 600 GB with these figures) - which means
that one or two state-of-the-art hard disks should be enough to
contain it.
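
For anyone who wants to redo the arithmetic, here is a minimal sketch
of the estimate. It reads "100k" as 100,000 bytes and applies the
factor-of-2 uncertainty to both the page count and the page size at
once; both readings are assumptions, and the function and variable
names are made up for illustration.

```python
# Back-of-the-envelope storage estimate using the figures quoted in this mail.
BOOKS = 20_000             # considered a fairly good estimate
PAGES_PER_BOOK = 300       # could be off by a factor of 2
BYTES_PER_PAGE = 100_000   # "100k" read as 100,000 bytes (assumption)


def storage_gb(books, pages_per_book, bytes_per_page):
    """Return the total storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return books * pages_per_book * bytes_per_page / 1e9


# Central estimate, plus the two extremes where both uncertain
# figures are simultaneously halved or doubled.
central = storage_gb(BOOKS, PAGES_PER_BOOK, BYTES_PER_PAGE)
low = storage_gb(BOOKS, PAGES_PER_BOOK / 2, BYTES_PER_PAGE / 2)
high = storage_gb(BOOKS, PAGES_PER_BOOK * 2, BYTES_PER_PAGE * 2)

print(f"low {low:.0f} GB, central {central:.0f} GB, high {high:.0f} GB")
```

Even the high end, where both uncertain figures are doubled together,
stays in the low terabytes.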

> They don't have an 'export' function, and I doubt they are going to
> build one so that they can interoperate with us.
>
> My 'import' function was a scraper; not something that can be used on
> a large scale without their permission.

On the other hand, if you _do_ get permission, there might well be a
more elegant ftp-based method.

> The wikisource workflow is a *symptom* of it being a "wiki", with all
> that entails.  There is a lot more than merely the workflow which
> distinguishes the two projects.

Certainly. I think the deeper-lying difference is one of attitude,
which, as you write, is for WS a symptom of being a wiki. As a wiki, WS
uses such attitudes/principles as "make it easy for people to
contribute", "publish early, publish often", "let people do what they
want, as long as it's a step forward, however small". PGDP, on the
other hand, derives its attitudes/principles from a wish to create
high-quality end products. As such it uses "check and doublecheck",
"limit the number of projects we work on", "quality control" and
"division of tasks".


-- 
André Engels, andreengels at gmail.com


