2009/8/20 Gregory Maxwell gmaxwell@gmail.com:
Going back to your simple study now: the analysis of vandalism duration and its impact on readers rests on an assumption about readership which we know to be invalid. You're assuming a uniform distribution of readership: that readers are equally likely to read any given article. But we know that actual readership follows a power-law (long-tail) distribution. Because traffic levels aren't taken into account, we can't draw conclusions about how much vandalism readers are actually exposed to.
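To make the point concrete, here's a minimal sketch (all numbers invented for illustration) comparing the two assumptions: if article traffic follows a Zipf-like power law, the share of *articles* carrying vandalism and the share of *page views* hitting vandalism can diverge substantially.

```python
import random

random.seed(42)

# Hypothetical setup: 10,000 articles whose page views follow a
# Zipf-like (power-law) distribution; 1% of articles are vandalized
# at any given moment. All figures here are made up for the sketch.
N = 10_000
traffic = [1_000_000 / (rank ** 1.1) for rank in range(1, N + 1)]
vandalized = set(random.sample(range(N), N // 100))

# Uniform-readership assumption: every article is equally likely to
# be read, so the share of views hitting vandalism equals the share
# of articles that are vandalized.
uniform_share = len(vandalized) / N

# Traffic-weighted reality: weight each article by its page views.
vandal_views = sum(traffic[i] for i in vandalized)
weighted_share = vandal_views / sum(traffic)

print(f"uniform assumption: {uniform_share:.4f}")
print(f"traffic-weighted:   {weighted_share:.4f}")
```

Which direction the traffic-weighted figure moves depends entirely on whether vandalism lands disproportionately on high- or low-traffic pages, which is exactly the empirical question the uniform assumption papers over.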
We're also assuming a uniform distribution of vandalism, as it were. There are a number of different types of vandalism: obscene defacement, malicious alteration of factual content, meaningless test edits of a character or two, schoolkids leaving messages for each other...
...and it all has a different impact on the reader.
This has two implications:
a) It seems safe to assume that replacing the entire article with "john is gay" will be spotted and reverted faster, on average, than an edit providing a plausible-sounding but entirely fictional history for a small town in Kansas. So any change in the pattern of the *content* of vandalism is going to lead to changes in duration, and thus in the overall prevalence of visible vandalism, even if the number of vandal edits is constant.
b) We can easily compare the effect of vandalism being left on differently trafficked pages for various lengths of time - roughly speaking, time * traffic = number of readers affected. If some vandalism is worse than others, we could also calculate some kind of intensity metric - one hundred people viewing enormous genital piercing images on [[Kitten]] is probably worse than ten thousand people viewing "asdfdfggfh" at the end of a paragraph in the same article.
I'm not sure how we'd go ahead with the second one, but it's an interesting thing to think about.
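One way the intensity metric in (b) might look, as a rough sketch: readers affected is duration * traffic, and each incident gets a severity weight. The weights, the 200 views/hour figure, and the incident durations below are all invented purely to reproduce the [[Kitten]] comparison from the text.

```python
# Assumed severity weights, not derived from any study: 1.0 = worst.
SEVERITY = {
    "obscene_image": 1.0,        # full-page obscene defacement
    "factual_alteration": 0.7,   # plausible-sounding false content
    "gibberish": 0.005,          # "asdfdfggfh" at the end of a paragraph
}

def intensity(duration_hours, views_per_hour, kind):
    """Readers affected (time * traffic), weighted by severity."""
    readers_affected = duration_hours * views_per_hour
    return readers_affected * SEVERITY[kind]

# The two [[Kitten]] cases, assuming 200 views/hour:
# 0.5 h * 200 = 100 readers see the obscene image;
# 50 h * 200 = 10,000 readers see the gibberish.
obscene = intensity(0.5, 200, "obscene_image")   # 100 * 1.0   = 100.0
gibberish = intensity(50, 200, "gibberish")      # 10000 * 0.005 = 50.0
print(obscene, gibberish)
```

With these made-up weights the short-lived obscene image scores twice the gibberish despite reaching a hundredth as many readers, which matches the intuition in (b) - but the weights themselves are exactly the part we'd have no principled way to choose, which is presumably why it's hard to see how to go ahead with it.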