This extension is very important for training machine learning
vandalism detection bots. Recently published systems use only hundreds
of examples of vandalism in training, which is not nearly enough to
distinguish between the variety found in Wikipedia or to generalize to
new, unseen forms of vandalism. A large set of human-created rules
could be run against all previous edits to create a massive vandalism
dataset. If one includes both positive and negative examples of
vandalism in training, practically the entire text of Wikipedia's
history can be used in the training set, possibly creating a
remarkable bot.
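
As a rough illustration of the labeling idea, here is a minimal Python
sketch. The Edit record, the two rules, and the sample edits are all
hypothetical; none of them come from the Abuse Filter or from any
published system, and real rules would need to be far more numerous
and careful.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Edit:
    """Hypothetical record of one revision: page text before and after."""
    old_text: str
    new_text: str


# A rule takes an edit and returns True if the edit looks like vandalism.
Rule = Callable[[Edit], bool]


def blanked_page(edit: Edit) -> bool:
    """Illustrative rule: a non-empty page was replaced with nothing."""
    return bool(edit.old_text.strip()) and not edit.new_text.strip()


def shouting(edit: Edit) -> bool:
    """Illustrative rule: the new text is long and entirely upper case."""
    text = edit.new_text.strip()
    return len(text) > 20 and text.isupper()


def label_history(edits: List[Edit], rules: List[Rule]) -> List[Tuple[Edit, int]]:
    """Label each historical edit 1 if any rule fires, otherwise 0."""
    return [(e, int(any(rule(e) for rule in rules))) for e in edits]


# Made-up sample history, for illustration only.
history = [
    Edit("Paris is the capital of France.", ""),
    Edit("Paris is the capital of France.",
         "Paris is the capital and largest city of France."),
    Edit("Some article text.", "THIS ARTICLE IS TOTALLY WRONG AND STUPID"),
]
labels = [label for _, label in label_history(history, [blanked_page, shouting])]
print(labels)
```

Edits that no rule flags become the negative examples, so every
revision in the history contributes a label. Labels produced this way
are only as good as the rules, so a trained bot would inherit their
blind spots.
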
On Wed, Mar 18, 2009 at 6:34 AM, Andrew Garrett <agarrett(a)wikimedia.org> wrote:
I am pleased to announce that the Abuse Filter [1] has been activated
on English Wikipedia!
The Abuse Filter is an extension to the MediaWiki [2] software that
powers Wikipedia, allowing automatic "filters" or "rules" to be run
against every edit, and to take actions if any of those rules are
triggered. It is designed to combat vandalism which is simple and
pattern-based, from blanking pages to complicated evasive page-move
vandalism.
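
To make the rule-and-action model concrete, a rough sketch in Python
follows. It is not the extension's actual rule language or code: the
Filter class, the run_filters helper, the filter names, and the action
strings are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Edit:
    """Hypothetical edit: page text before and after the change."""
    old_text: str
    new_text: str


@dataclass
class Filter:
    """Hypothetical filter: a condition plus the action to take on a match."""
    name: str
    matches: Callable[[Edit], bool]
    action: str  # illustrative action label, e.g. "warn" or "tag"


def run_filters(edit: Edit, filters: List[Filter]) -> List[str]:
    """Check the edit against every filter; return 'name:action' per match."""
    return [f"{f.name}:{f.action}" for f in filters if f.matches(edit)]


filters = [
    # Fires when a non-empty page is replaced with nothing.
    Filter("blanking",
           lambda e: e.old_text.strip() != "" and e.new_text.strip() == "",
           "warn"),
    # Fires when the page shrinks to under a tenth of its old length.
    Filter("huge-removal",
           lambda e: len(e.new_text) < len(e.old_text) // 10,
           "tag"),
]

triggered = run_filters(Edit("A long established article body.", ""), filters)
print(triggered)
```

Every filter is evaluated for every edit, and one edit can trigger
several filters at once, each with its own action.
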
We've already seen some pretty cool uses for the Abuse Filter. While
there are filters for the obvious personal attacks [3], many of our
filters are there just to identify common newbie mistakes such as
page-blanking [4], give the users a friendly warning [5] and ask them
if they really want to submit their edits.
The best part is that these friendly "soft" warning messages seem to
work in passively changing user behaviour. Just the suggestion that we
frown on page-blanking was enough to stop 56 of the 78 matches [6] of
that filter when I checked. If you look closely, you'll even find that
many of the users took our advice and redirected the page or did
something else more constructive instead.
I'm very pleased at my work being used so well on English Wikipedia,
and I'm looking forward to seeing some quality filters in the near
future! While some of the harsher actions, such as blocking, are
currently disabled on Wikimedia, we're hoping that the filters
developed will be good enough that we can think about activating them
in the future.
If anybody has any questions or concerns about the Abuse Filter, feel
free to file a bug [7], contact me on IRC (werdna on
irc.freenode.net), post on my user talk page, or send me an email at
agarrett at wikimedia.org.
[1] http://www.mediawiki.org/wiki/Extension:AbuseFilter
[2] http://www.mediawiki.org
[3] http://en.wikipedia.org/wiki/Special:AbuseFilter/9
[4] http://en.wikipedia.org/wiki/Special:AbuseFilter/3
[5] http://en.wikipedia.org/wiki/MediaWiki:Abusefilter-warning-blanking
[6] http://en.wikipedia.org/w/index.php?title=Special:AbuseLog&wpSearchFilt…
[7] http://bugzilla.wikimedia.org
--
Andrew Garrett
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l