[Wikimedia-l] Which Wikipedias have had large scale bot creation of articles this year?

faewik at gmail.com
Wed Nov 27 14:24:37 UTC 2013


On 27 November 2013 13:43, Anders Wennersten <mail at anderswennersten.se> wrote:
...
> And even if this is only relevant for far less than 1% of all generated
> articles, it becomes around a hundred in total. Many of these cases are quite
> complicated to fix (area of lakes, depths) and there is a debate about who
> should fix these: the bot owner (who has generated correctly from sources) or
> community people (who have problems finding relevant base data); or should
> these be deleted or rewritten from scratch?

Small error rates are a real challenge. My experience on Commons for
large bot work has been long discussions around quality complaints
where the level of error was *well below 1%*. The default stance on
the English Wikipedia and Commons is that if you make the mess, then
you need to clean it up.

On the whole, I don't think this is a bad policy. It does, however, make
bot jobs like this a puzzle to get right. In the case of my Geograph
categorization work, we managed to reduce error levels from a known
level below 0.5% to less than 0.15% (the numbers became so small that
they were hard to measure or estimate, so this figure is conservatively
pessimistic).
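
To illustrate why very small error rates are hard to pin down, here is a
minimal sketch of one common way to put a pessimistic upper bound on an
error rate from a manual spot check. It is not necessarily how the
Geograph figure was derived; the function name, the sample sizes and the
choice of a one-sided 95% Wilson score bound are all assumptions made
for illustration.

```python
import math


def wilson_upper_bound(errors: int, sample_size: int, z: float = 1.645) -> float:
    """Pessimistic upper bound on the true error rate.

    Hypothetical helper: uses the Wilson score interval, with z = 1.645
    giving an approximate one-sided 95% bound. `errors` is the number of
    bad items found in a random spot check of `sample_size` items.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= errors <= sample_size:
        raise ValueError("errors must be between 0 and sample_size")

    n = sample_size
    p = errors / n  # observed error rate in the sample
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return min(1.0, (centre + margin) / denom)


if __name__ == "__main__":
    # Illustrative sample sizes only, not real audit data.
    for found, checked in [(0, 500), (0, 2000), (3, 2000)]:
        bound = wilson_upper_bound(found, checked)
        print(f"{found} errors in {checked} checked: upper bound {bound:.2%}")
```

Under these assumptions, even a spot check of 2000 items that finds no
errors at all only supports a bound of roughly 0.14%, which is why a
quoted figure at this level is best read as a ceiling rather than a
measurement.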

Giving the community easy ways of reporting failures and seeing them
get quickly corrected is a good option. Even better is to run large
uploads as a project team where the three elements of content experts,
bot experts and enthusiastic volunteer editors/re-users are all
represented. It may take longer, but the bot writer is far more likely
to get praised for good work, and any occasional problem is simply
absorbed as part of the project rather than put on the bot writer's
shoulders.

This is how I structured the Airliners project
<https://commons.wikimedia.org/wiki/Commons:Batch_uploading/Airliners>.
There are plenty of other interesting examples on the batch upload
project page.

Fae
-- 
faewik at gmail.com http://j.mp/faewm


