[Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

Fri Aug 28 00:55:07 UTC 2009

On Thu, Aug 27, 2009 at 8:41 PM, Thomas Dalton <thomas.dalton at gmail.com> wrote:

> 2009/8/28 Anthony <wikimail at inbox.org>:
> > On Thu, Aug 27, 2009 at 8:36 PM, Thomas Dalton <thomas.dalton at gmail.com> wrote:
> >
> >> 2009/8/28 Anthony <wikimail at inbox.org>:
> >> >> He means what would you measure in order to draw conclusions about the
> >> >> severity of vandalism.
> >> >>
> >> >
> >> > Umm...you would count the number of instances of vandalism?
> >>
> >> That's not practical.
> >
> >
> > I never said it was practical, I just said that counting revisions and
> > calling that "counting vandalism" is incorrect.
>
> And you were asked to suggest a better approach. Nobody claimed it was
> perfect.
>

I suggested a better approach last time we had this thread: statistical
sampling.
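
As a rough illustration of what statistical sampling could look like here, the sketch below draws a random sample of page views, has each one judged by hand, and reports the observed vandalism rate with a 95% Wilson score interval. The names `page_views`, `is_vandalized`, and `estimate_vandalism_rate` are hypothetical and not taken from the study; in practice `is_vandalized` would stand in for a human reviewer's judgment.

```python
import math
import random


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("sample must not be empty")
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def estimate_vandalism_rate(page_views, is_vandalized, sample_size, seed=0):
    """Estimate the share of page views that show a vandalized revision.

    page_views    -- hypothetical list of (page, timestamp)-style records
    is_vandalized -- hypothetical callable; stands in for manual review
    """
    rng = random.Random(seed)
    # Sample without replacement so no view is reviewed twice.
    sample = rng.sample(page_views, sample_size)
    hits = sum(1 for view in sample if is_vandalized(view))
    low, high = wilson_interval(hits, sample_size)
    return hits / sample_size, low, high
```

The width of the interval depends on the sample size, not on the size of the wiki, so a few thousand hand-checked views would already separate a rate near 0.5% from one near 3%.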

And I'm saying much more than that this method is imperfect.  I'm saying
it's fundamentally flawed when it comes to measuring vandalism.  It measures
something quite different from vandalism.

When it comes to answering the question of "how likely is one to encounter
vandalism", I am no more informed after reading this thread than before.  It
could be 0.5% and I wouldn't be surprised.  It could be 3% and I wouldn't be
surprised.  The methods used in this study both undercount and overcount
vandalism, possibly quite significantly.  Not all reverts are reverts of
vandalism.  I wouldn't be surprised if only 50% of them are.  And not all
vandalism is reverted.  As "revert" is defined by this method, I wouldn't be
surprised if 75% of vandalism is not detected.  This study doesn't measure
vandalism.
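
To make the two opposing errors concrete, here is a small sketch of the arithmetic, using only the hypothetical percentages above (50% of reverts being vandalism, 75% of vandalism going undetected). These are illustrative guesses, not measurements, and the function name and the figure of 1000 reverts are made up for the example.

```python
def implied_vandalism(reverts, share_of_reverts_that_are_vandalism,
                      share_of_vandalism_reverted):
    """Number of vandalism instances implied by a raw revert count.

    Both shares are assumptions supplied by the caller, not measured values.
    """
    if not 0 < share_of_vandalism_reverted <= 1:
        raise ValueError("share_of_vandalism_reverted must be in (0, 1]")
    if not 0 <= share_of_reverts_that_are_vandalism <= 1:
        raise ValueError("share_of_reverts_that_are_vandalism must be in [0, 1]")
    # Overcount correction: drop reverts that were not reverting vandalism.
    vandalism_reverts = reverts * share_of_reverts_that_are_vandalism
    # Undercount correction: scale up for vandalism that was never reverted.
    return vandalism_reverts / share_of_vandalism_reverted


# Hypothetical count of 1000 reverts under different assumptions.
both_errors = implied_vandalism(1000, 0.5, 0.25)     # 2000.0
only_overcount = implied_vandalism(1000, 0.5, 1.0)   # 500.0
only_undercount = implied_vandalism(1000, 1.0, 0.25) # 4000.0
```

Depending on which assumption holds, the same revert count corresponds to anything from half to four times as much vandalism, which is why the raw count alone does not pin the rate down.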