[Foundation-l] How much of Wikipedia is vandalized? 0.4% of Articles

Nathan nawrich at gmail.com
Thu Aug 20 17:14:58 UTC 2009


On Thu, Aug 20, 2009 at 12:59 PM, Gregory Kohs <thekohser at gmail.com> wrote:

> While the time and effort that went into Robert Rohde's analysis are
> certainly extensive, the outcomes are based on so many flawed assumptions
> about the nature of vandalism and vandalism reversion that one publicizes
> the key "finding" of a 0.4% vandalism rate at one's peril.
>
>
> http://en.wikipedia.org/w/index.php?title=John_McCain&diff=169808394&oldid=169720853
> 11 hours
> Reverted with no tags.
>

The best part about that little exchange is:
http://en.wikipedia.org/w/index.php?title=John_McCain&diff=next&oldid=169906715

wherein a revert restored the vandalism, followed by another revert once
the editor noticed his error.

I don't think Robert drew any firm conclusions on the meaning of his data;
he notes all the caveats that others have since emphasized, and admits to
likely underestimating vandalism. I read the 0.4% as representing the
approximate number of articles containing vandalism in an English Wikipedia
snapshot; that is quite different from the amount of time specific articles
stay in a "vandalized" state. Given the difficulty of accurately analyzing
this sort of data, no firm conclusions can be drawn; but certainly it's more
informative than a Wikipedia Review analysis of a relatively small group of
articles in a specific topic area.

Nathan


