[Foundation-l] How much of Wikipedia is vandalized? 0.4% of Articles

Jimmy Wales jwales at wikia-inc.com
Thu Aug 20 16:46:10 UTC 2009


Gregory Maxwell wrote:
> If you were using "is gay" as a measure of vandalism
> over time you might conclude that vandalism is decreasing when in
> reality "cluebot" is performing the same kind of analysis for its
> automatic vandalism suppression and the vandals have responded by
> vandalizing in forms that can't be automatically identified, such as
> by changing dates to incorrect values.

And if that's true, that's on net a bad thing.  Most "is gay" vandalism 
(not all) is just stupid and embarrassing, and it will be obvious to the 
reader as vandalism. Lots of people get how Wikipedia works and are 
reasonably tolerant of seeing that sort of thing from time to time.

But people expect that we should get the dates right, and they are right 
to ask that of us.

I understand that you're just making up a hypothetical, not saying that 
this is what is actually happening.  I'm just agreeing with this line of 
thinking that says, in essence, "when we think about measuring 
vandalism, which is already hard enough, we also have to think about how 
damaging different kinds of vandalism actually are".


Greg, I think your email sounded a little negative at the start, but not 
so much further down.  I think you would join me heartily in being super 
grateful for people doing this kind of analysis.  Yes, some of it will 
be primitive and will suffer from many difficulties.  But 
data-driven decision-making is a great thing, particularly when we are 
cognizant of the limitations of the data we're using.

I just didn't want anyone to get the idea (and I'm sure I'm reading you 
right) that you were opposed to people doing research. :-)

--Jimbo


