[Foundation-l] How much of Wikipedia is vandalized? 0.4% of Articles

Erik Zachte erikzachte at infodisiac.com
Thu Aug 20 17:23:58 UTC 2009


There is another way to detect 100% reverts. It won't catch manual reverts
that are not 100% accurate, but most vandal patrollers will use undo and
the like.

 

For every revision, calculate the MD5 checksum of its content. Then you can
easily look back, say, 100 revisions to see whether this checksum occurred
earlier. It is efficient and unambiguous.
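
A minimal sketch of the checksum lookback in Python, assuming the revision
texts of one page are already available in chronological order. The function
name find_identity_reverts and the window parameter are illustrative, not
part of any existing tool. Skipping a revision that is identical to its
immediate predecessor is an added assumption, so that null edits are not
counted as reverts.

```python
import hashlib
from collections import deque


def find_identity_reverts(revisions, window=100):
    """Return (index, restored_index) pairs for exact (100%) reverts.

    `revisions` is an iterable of revision texts for a single page,
    oldest first. A revision counts as a revert when its MD5 checksum
    equals that of one of the previous `window` revisions.
    """
    # Holds (index, digest) for at most `window` preceding revisions.
    recent = deque(maxlen=window)
    reverts = []
    for i, text in enumerate(revisions):
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        # Assumption: an edit that leaves the content unchanged relative
        # to the immediately preceding revision is not a revert.
        unchanged = bool(recent) and recent[-1][1] == digest
        if not unchanged:
            # Scan from newest to oldest and report the closest match.
            for j, earlier in reversed(recent):
                if earlier == digest:
                    reverts.append((i, j))
                    break
        recent.append((i, digest))
    return reverts


if __name__ == "__main__":
    history = ["good", "good plus", "VANDALISM", "good plus", "more text"]
    print(find_identity_reverts(history))
```

In the sample history, the fourth revision restores the exact content of the
second, so it is reported as a revert. A partial manual cleanup would not
match any earlier checksum and would go undetected, as noted above.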

 

This will work for any Wikipedia for which a full archive dump is available.


 

Erik Zachte

 


