This extension is very important for training machine-learning vandalism-detection bots. Recently published systems use only hundreds of examples of vandalism in training, which is not nearly enough to distinguish between the varieties found in Wikipedia or to generalize to new, unseen forms of vandalism. A large set of human-created rules could be run against all previous edits to create a massive vandalism dataset. If one includes both positive and negative examples of vandalism in training, practically the entire text of Wikipedia's history could be used in the training set, possibly creating a remarkable bot.
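A rough sketch of what I mean by running rules over past edits to label them. Everything here is made up for illustration: the `Edit` type, the two rules, and `label_edits` are hypothetical and are not the AbuseFilter rule syntax. Note also that label 0 only means "no rule matched", not "confirmed good edit".

```python
import re
from dataclasses import dataclass


@dataclass
class Edit:
    """Hypothetical stand-in for one revision: page text before and after."""
    old_text: str
    new_text: str


# Assumed threshold: a run of 20+ capital letters counts as "shouting".
SHOUT = re.compile(r"[A-Z]{20,}")


def blanks_page(edit):
    # The page had content and the edit removed all of it.
    return len(edit.old_text) > 0 and len(edit.new_text.strip()) == 0


def adds_shouting(edit):
    # A long run of capitals appears that was not in the old text.
    return bool(SHOUT.search(edit.new_text)) and not SHOUT.search(edit.old_text)


# Illustrative rule set; a real one would be far larger.
RULES = {"blanking": blanks_page, "shouting": adds_shouting}


def label_edits(edits):
    """Return (edit, label, matched_rule_names) for each edit.

    label is 1 if any rule matched and 0 otherwise.
    """
    labelled = []
    for edit in edits:
        matched = [name for name, rule in RULES.items() if rule(edit)]
        labelled.append((edit, 1 if matched else 0, matched))
    return labelled
```

The matched rule names are kept next to each label so that a noisy rule can be dropped later without relabelling everything.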
On Wed, Mar 18, 2009 at 6:34 AM, Andrew Garrett <agarrett@wikimedia.org> wrote:
I am pleased to announce that the Abuse Filter [1] has been activated on English Wikipedia!
The Abuse Filter is an extension to the MediaWiki [2] software that powers Wikipedia, allowing automatic "filters" or "rules" to be run against every edit and to take actions if any of those rules are triggered. It is designed to combat vandalism that is simple and pattern-based, from blanking pages to complicated evasive page-move vandalism.
We've already seen some pretty cool uses for the Abuse Filter. While there are filters for the obvious personal attacks [3], many of our filters are there just to identify common newbie mistakes such as page-blanking [4], give the users a friendly warning [5], and ask them if they really want to submit their edits.
The best part is that these friendly "soft" warning messages seem to work in passively changing user behaviour. Just the suggestion that we frown on page-blanking was enough to stop 56 of the 78 matches [6] of that filter when I checked. If you look closely, you'll even find that many of the users took our advice and redirected the page or did something else more constructive instead.
I'm very pleased to see my work being used so well on English Wikipedia, and I'm looking forward to seeing some quality filters in the near future! While some of the harsher actions, such as blocking, are disabled on Wikimedia at the moment, we're hoping that the filters developed will be good enough that we can think about activating them in the future.
If anybody has any questions or concerns about the Abuse Filter, feel free to file a bug [7], contact me on IRC (werdna on irc.freenode.net), post on my user talk page, or send me an email at agarrett at wikimedia.org.
[1] http://www.mediawiki.org/wiki/Extension:AbuseFilter
[2] http://www.mediawiki.org
[3] http://en.wikipedia.org/wiki/Special:AbuseFilter/9
[4] http://en.wikipedia.org/wiki/Special:AbuseFilter/3
[5] http://en.wikipedia.org/wiki/MediaWiki:Abusefilter-warning-blanking
[6] http://en.wikipedia.org/w/index.php?title=Special:AbuseLog&wpSearchFilte...
[7] http://bugzilla.wikimedia.org
-- Andrew Garrett
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l