[Foundation-l] How much of Wikipedia is vandalized? 0.4% of Articles
Mark Wagner
carnildo at gmail.com
Fri Aug 21 01:30:53 UTC 2009
On Thu, Aug 20, 2009 at 14:10, Anthony<wikimail at inbox.org> wrote:
> On Thu, Aug 20, 2009 at 1:55 PM, Nathan <nawrich at gmail.com> wrote:
>>
>> My point (which might still be incorrect, of course) was that an analysis
>> based on 30,000 randomly selected pages was more informative about the
>> English Wikipedia than 100 articles about serving United States Senators.
>
>
> Any automated method of finding vandalism is doomed to failure. I'd say its
> informativeness was precisely zero.
>
> Greg's analysis, on the other hand, was informative, but it was targeted at
> a much different question than Robert's.
>
> "if one chooses a random page from Wikipedia right now, what is the
> probability of receiving a vandalized revision?" The best way to answer that
> question would be with a manually processed random sample taken from a
> pre-chosen moment in time. As few as 1000 revisions would probably be
> sufficient, if I know anything about statistics, but I'll let someone with
> more knowledge of statistics verify or refute that. The results will depend
> heavily on one's definition of "vandalism", though.

I did this informally in 2005 during my "hundred article" surveys.
Of the 503 pages I looked at, only one was clearly vandalized the
first time I looked at it, a rate of roughly 0.2%. At that rate a
sample of a thousand would turn up only about two vandalized pages,
so I'd say a thousand samples is probably too small to get any sort
of precision on the vandalism rate.
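
To put a rough number on that imprecision, here is a minimal sketch
that computes a 95% Wilson score interval for the observed 1-in-503
count, and for a hypothetical 2-in-1000 sample at about the same rate.
The function name, the choice of the Wilson interval, and the
2-in-1000 figure are assumptions made for illustration; they are not
part of the original survey.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    hits: number of vandalized pages observed
    n:    number of pages sampled
    z:    normal quantile (1.96 is roughly a 95% interval)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = hits / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Observed in the 2005 survey: 1 clearly vandalized page out of 503.
low_503, high_503 = wilson_interval(1, 503)

# Hypothetical: 2 vandalized pages in a sample of 1000 (same ~0.2% rate).
low_1000, high_1000 = wilson_interval(2, 1000)

print(f"1/503:  {low_503:.4%} to {high_503:.4%}")
print(f"2/1000: {low_1000:.4%} to {high_1000:.4%}")
```

In both cases the upper bound comes out more than ten times the lower
bound, so a sample of this size pins the rate down only to within
about an order of magnitude.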
--
Mark Wagner
[[User:Carnildo]]