[Toolserver-l] Estimating the incidence of vandalism on a popular Wikipedia article

Tony Sidaway f.crdfa at gmail.com
Sun Dec 18 18:10:32 UTC 2005


The program presented here is simple and should be easy to translate
into Perl, C or any other reasonable computer language.

A basic knowledge of Lisp list processing operators (car, cdr, etc.) is
all that is required to understand this program, which is written in
the Guile dialect of Scheme.

The program analyzes the revision table and uses vandalism reverts as
a proxy for vandalism.  This relies on the assumption that vandalism
is quickly detected and corrected by reverting to an earlier version,
and also assumes that administrators do not abuse the rollback
facility to perform non-vandalism reverts.  It further assumes that
editors do not incorrectly label edit warring as vandalism.  These
assumptions are broadly valid for the article chosen, [[en:George W.
Bush]], but may not be true for all popular articles.
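
As a rough illustration of the revert-as-proxy idea, here is a minimal
sketch in Python rather than Guile. It is not the program linked below.
It assumes each revision is available as a (timestamp, comment) pair,
with the timestamp in the 14-digit YYYYMMDDHHMMSS form that MediaWiki
stores. The match patterns, the sample data and the function names are
all made up for illustration; a real run would need patterns tuned to
the edit summaries actually seen on the article.

```python
import re

# Hypothetical patterns for edit summaries that signal a vandalism revert.
# These are illustrative guesses, not the patterns used by the real program.
REVERT_PATTERNS = [
    # Summary in the style left by the administrator rollback facility.
    re.compile(r"^reverted edits by .+ to last version by ", re.IGNORECASE),
    # Common shorthand for "revert vandalism".
    re.compile(r"\brvv\b", re.IGNORECASE),
    # "rv"/"revert"/"reverted"/"reverting" followed later by "vandal...".
    re.compile(r"\b(rv|revert(ed|ing)?)\b.*\bvandal", re.IGNORECASE),
]


def is_vandalism_revert(comment, patterns=REVERT_PATTERNS):
    """Return True if the edit summary matches any revert pattern."""
    return any(p.search(comment) for p in patterns)


def reverts_per_day(revisions, patterns=REVERT_PATTERNS):
    """Count matching reverts per day.

    revisions: iterable of (timestamp, comment) pairs, where timestamp
    is a 'YYYYMMDDHHMMSS' string.  Returns {'YYYYMMDD': count}.
    """
    counts = {}
    for timestamp, comment in revisions:
        if is_vandalism_revert(comment or "", patterns):
            day = timestamp[:8]  # keep only the date part
            counts[day] = counts.get(day, 0) + 1
    return counts


# Made-up sample revisions, purely for demonstration.
sample = [
    ("20051217230102", "Reverted edits by 10.0.0.1 to last version by Example"),
    ("20051217231500", "added citation"),
    ("20051218000210", "rvv"),
    ("20051218010000", "rv vandalism by anon"),
    ("20051218020000", "revert; see talk page"),
]

print(reverts_per_day(sample))
```

The last sample revision is deliberately not counted: a revert that
does not mention vandalism is treated as an ordinary content dispute.
Passing a different list as the patterns argument is one way the same
counting could be tried on a wiki in another language.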

The program is configured to work in English, but it may be possible
to apply the same methods in other languages by changing the match
patterns for vandalism reverts.

Multibyte characters may break this program.  Sorry, it isn't my area
of expertise.

http://en.wikipedia.org/wiki/User:Tony_Sidaway/Dubya_vandalism


