Neil Harris usenet at tonal.clara.co.uk
Fri Aug 9 15:06:06 UTC 2002

Andre Engels wrote:

>My opinion is that automatic uploading of such texts is a bad thing, but for
>slightly different reasons. My problem is that automatic uploading brings 'the
>good, the bad and the ugly'. Some EBD articles form a good basis for a
>Wikipedia article, others are of little use (I don't think we want the EBD
>entry on Egypt, nor do we want an article from a 1911 encyclopedia on
>Ljubljana, stating that it is a city in the Austro-Hungarian Empire). Someone
>who wants to upload material from a source like the EBD should him/herself
>make the selection as to which entries are interesting for Wikipedia, not
>brush that responsibility onto the community after the fact or upload
>everything just because it is there.
>Andre Engels

OK, there seems to be a consensus in the most recent comments that this 
is now a bad thing, so I've stopped uploading the articles. I think 
there are some interesting lessons to be learned.

* auto-uploading is a mixed blessing
* some of the articles were rather interesting
* some were dross
* automatic content filtering was only partially successful
* uploading rapidly is bad: Wikipedia's NPOV processes can only 
assimilate new content at a certain rate
* it seems to have stimulated a fair bit of editing of non-uploaded 
articles: creating a sort of "theme day" for Wikipedia
* auto-Wikification is more difficult than it first seems

This was a pilot for possible larger projects. Given the response, I 
think I need to take a different approach: I didn't anticipate the load 
on the community's goodwill.

The PD NIH disease factsheets are the ideal target for this sort of 
work: Easton's was a sort of dry run for the idea.

How about, as someone else suggested before, a new namespace: 
something like "PDresource:" that auto-uploaders can use to get 
PD info into a Wikified format without polluting the main namespace, and 
that human contributors can then use as source material for the 'pedia?

That way, the effort can be split: the natural thing to do with 
auto-uploading _is_ a massive data dump all at once, and this would 
benefit the human contributors by getting a lot of text into a suitable 
format, ready for human selection.
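
To make the namespace idea concrete, here is a minimal sketch of how an 
auto-uploader might map article titles into the staging namespace and 
back. The function names are hypothetical and "PDresource" is only the 
name floated above; nothing here reflects how the Wikipedia software 
actually handles namespaces.

```python
# Hypothetical staging namespace, taken from the proposal above.
NAMESPACE = "PDresource"


def staging_title(title, namespace=NAMESPACE):
    """Return the title an auto-uploader would write to (never the main namespace)."""
    title = title.strip()
    if not title:
        raise ValueError("empty title")
    prefix = namespace + ":"
    # Leave titles that are already staged untouched, so re-runs are harmless.
    if title.startswith(prefix):
        return title
    return prefix + title


def main_title(staged, namespace=NAMESPACE):
    """Return the main-namespace title a human editor would promote the page to."""
    prefix = namespace + ":"
    if staged.startswith(prefix):
        return staged[len(prefix):]
    return staged


if __name__ == "__main__":
    for name in ["Ljubljana", "Egypt"]:
        print(staging_title(name), "->", main_title(staging_title(name)))
```

The bulk upload only ever touches prefixed titles; moving text to the 
unprefixed title stays a deliberate, per-article human decision.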


