I am pleased to announce that the Abuse Filter [1] has been activated on English Wikipedia!
The Abuse Filter is an extension to the MediaWiki [2] software that powers Wikipedia. It allows automatic "filters" or "rules" to be run against every edit, and actions to be taken if any of those rules are triggered. It is designed to combat simple, pattern-based vandalism, from blanking pages to complicated evasive page-move vandalism.
We've already seen some pretty cool uses for the Abuse Filter. While there are filters for the obvious personal attacks [3], many of our filters are there just to identify common newbie mistakes such as page-blanking [4], give the users a friendly warning [5], and ask them if they really want to submit their edits.
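To give a feel for the rule language, a blanking check might look roughly like this (a sketch only, with made-up thresholds rather than the actual text of filter 3; the friendly warning is attached as an action through the filter interface, not written in the rule itself):

  /* sketch: a new or anonymous user removes nearly all of a sizeable page */
  !("autoconfirmed" in user_groups)
  & old_size > 500
  & new_size < 50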
The best part is that these friendly "soft" warning messages seem to work in passively changing user behaviour. Just the suggestion that we frown on page-blanking was enough to stop 56 of the 78 matches [6] of that filter when I checked. If you look closely, you'll even find that many of the users took our advice and redirected the page or did something else more constructive instead.
I'm very pleased to see my work being used so well on English Wikipedia, and I'm looking forward to seeing some quality filters in the near future! While some of the harsher actions, such as blocking, are disabled on Wikimedia at the moment, we're hoping that the filters developed will be good enough that we can think about activating them in the future.
If anybody has any questions or concerns about the Abuse Filter, feel free to file a bug [7], contact me on IRC (werdna on irc.freenode.net), post on my user talk page, or send me an email at agarrett at wikimedia.org
[1] http://www.mediawiki.org/wiki/Extension:AbuseFilter
[2] http://www.mediawiki.org
[3] http://en.wikipedia.org/wiki/Special:AbuseFilter/9
[4] http://en.wikipedia.org/wiki/Special:AbuseFilter/3
[5] http://en.wikipedia.org/wiki/MediaWiki:Abusefilter-warning-blanking
[6] http://en.wikipedia.org/w/index.php?title=Special:AbuseLog&wpSearchFilte...
[7] http://bugzilla.wikimedia.org
On 3/18/09 5:34 AM, Andrew Garrett wrote:
I am pleased to announce that the Abuse Filter [1] has been activated on English Wikipedia!
I've temporarily disabled it as we're seeing some performance problems saving edits at peak time today. Need to make sure there's functional per-filter profiling before re-enabling so we can confirm if one of the 55 active filters (!) is particularly bad or if we need to do overall optimization.
-- brion
On Wed, Mar 18, 2009 at 12:43 PM, Brion Vibber brion@wikimedia.org wrote:
On 3/18/09 5:34 AM, Andrew Garrett wrote:
I am pleased to announce that the Abuse Filter [1] has been activated on English Wikipedia!
I've temporarily disabled it as we're seeing some performance problems saving edits at peak time today. Need to make sure there's functional per-filter profiling before re-enabling so we can confirm if one of the 55 active filters (!) is particularly bad or if we need to do overall optimization.
For a 45-minute window, one specific filter was timing out the server every time someone tried to save a large page like WP:AN/I.
We found and disabled that one, but more detailed load stats would definitely be useful.
-Robert Rohde
Brion Vibber wrote:
On 3/18/09 5:34 AM, Andrew Garrett wrote:
I am pleased to announce that the Abuse Filter [1] has been activated on English Wikipedia!
I've temporarily disabled it as we're seeing some performance problems saving edits at peak time today. Need to make sure there's functional per-filter profiling before re-enabling so we can confirm if one of the 55 active filters (!) is particularly bad or if we need to do overall optimization.
Done, took less than five minutes. Re-enabled.
We're still profiling at ~700ms CPU time per page save, with no particular rule dominant. Disabling 20 of them would help.
-- Tim Starling
On Wed, Mar 18, 2009 at 12:59 PM, Tim Starling tstarling@wikimedia.org wrote:
Brion Vibber wrote:
On 3/18/09 5:34 AM, Andrew Garrett wrote:
I am pleased to announce that the Abuse Filter [1] has been activated on English Wikipedia!
I've temporarily disabled it as we're seeing some performance problems saving edits at peak time today. Need to make sure there's functional per-filter profiling before re-enabling so we can confirm if one of the 55 active filters (!) is particularly bad or if we need to do overall optimization.
Done, took less than five minutes. Re-enabled.
We're still profiling at ~700ms CPU time per page save, with no particular rule dominant. Disabling 20 of them would help.
For Andrew or anyone else that knows, can we assume that the filter is smart enough that if the first part of an AND clause fails then the other parts don't run (or similarly if the first part of an OR succeeds)? If so, we can probably optimize rules by doing easy checks first before complex ones.
-Robert Rohde
Robert Rohde wrote:
For Andrew or anyone else that knows, can we assume that the filter is smart enough that if the first part of an AND clause fails then the other parts don't run (or similarly if the first part of an OR succeeds)? If so, we can probably optimize rules by doing easy checks first before complex ones.
No, everything will be evaluated.
Note that the problem with rule 48 was that added_links triggers a complete parse of the pre-edit page text. It could be replaced by a check against the externallinks table. No amount of clever shortcut evaluation would have made it fast.
-- Tim Starling
Tim Starling wrote:
Robert Rohde wrote:
For Andrew or anyone else that knows, can we assume that the filter is smart enough that if the first part of an AND clause fails then the other parts don't run (or similarly if the first part of an OR succeeds)? If so, we can probably optimize rules by doing easy checks first before complex ones.
No, everything will be evaluated.
Note that the problem with rule 48 was that added_links triggers a complete parse of the pre-edit page text. It could be replaced by a check against the externallinks table. No amount of clever shortcut evaluation would have made it fast.
-- Tim Starling
With branch optimization, placing the !("autoconfirmed" in USER_GROUPS) and namespace checks at the beginning would avoid evaluating added_links at all (and thus the parse).
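Roughly, something like this (the added_links clause here is only a placeholder, not the real condition of filter 48, and the namespace value is invented):

  /* cheap checks first; the expensive added_links test only runs if they pass */
  !("autoconfirmed" in USER_GROUPS)
  & article_namespace == 0
  & added_links contains "example.com"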
Another option could be to automatically optimize based on the cost of each rule.
PS: Why isn't there a link to Special:AbuseFilter/history/$id on the filter view?
Tim Starling wrote:
Robert Rohde wrote:
For Andrew or anyone else that knows, can we assume that the filter is smart enough that if the first part of an AND clause fails then the other parts don't run (or similarly if the first part of an OR succeeds)? If so, we can probably optimize rules by doing easy checks first before complex ones.
No, everything will be evaluated.
I've written and deployed branch optimisation code, which reduced run-time by about one third.
Note that the problem with rule 48 was that added_links triggers a complete parse of the pre-edit page text. It could be replaced by a check against the externallinks table. No amount of clever shortcut evaluation would have made it fast.
I've fixed this to use the DB instead for that particular context.
On Thu, Mar 19, 2009 at 11:54 AM, Platonides Platonides@gmail.com wrote:
PS: Why isn't there a link to Special:AbuseFilter/history/$id on the filter view?
There is.
I've disabled a filter or two which were taking well in excess of 150ms to run and seemed to be targeted at specific vandals, without any hits. The culprit seemed to be running about 20 regexes to determine if an IP is in a particular range, where one call to ip_in_range would suffice. Of course, this is also a documentation issue, which I'm working on.
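For comparison, the cheap form is a single call along these lines (the range is a placeholder; user_name holds the IP for anonymous editors):

  /* one range test instead of ~20 per-address patterns */
  ip_in_range( user_name, "192.0.2.0/24" )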
To help a bit more with performance, I've also added a profiler within the interface itself. Hopefully this will encourage self-policing with regard to filter performance.
On Wed, Mar 18, 2009 at 8:00 PM, Andrew Garrett andrew@epstone.net wrote: <snip>
I've disabled a filter or two which were taking well in excess of 150ms to run and seemed to be targeted at specific vandals, without any hits. The culprit seemed to be running about 20 regexes to determine if an IP is in a particular range, where one call to ip_in_range would suffice. Of course, this is also a documentation issue, which I'm working on.
<snip>
ip_in_range, rmwhitespace, rmspecials, ? :, if then else end, contains,
and probably some others appear in SVN but not in the dropdown list that I assume most people are using to locate options.
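For instance, a condition using a couple of the undocumented ones might read something like this (the pattern is invented purely for illustration):

  /* strip whitespace and special characters from the added text, then do a substring match */
  rmspecials( rmwhitespace( added_lines ) ) contains "madeupspamphrase"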
-Robert Rohde
On Mar 18, 2009, at 20:00, Andrew Garrett andrew@epstone.net wrote:
To help a bit more with performance, I've also added a profiler within the interface itself. Hopefully this will encourage self-policing with regard to filter performance.
Awesome!
Maybe we could use that for templates too ... ;)
-- Brion
-- Andrew Garrett
On Wed, Mar 18, 2009 at 8:00 PM, Andrew Garrett andrew@epstone.net wrote: <snip>
To help a bit more with performance, I've also added a profiler within the interface itself. Hopefully this will encourage self-policing with regard to filter performance.
Based on personal observations, the self-profiling is quite noisy. Sometimes a filter will report one value (say 5 ms) only to come back 5 minutes later and see the same filter report a value 20 times larger, and a few minutes after that it jumps back down.
Assuming that this behavior is a result of variations in the filter workload (and not some sort of profiling bug), it would be useful if you could increase the profiling window to better average over those fluctuations. Right now it is hard to tell which rules are slow or not because the numbers aren't very stable.
-Robert Rohde
Robert Rohde wrote:
On Wed, Mar 18, 2009 at 8:00 PM, Andrew Garrett andrew@epstone.net wrote:
<snip> To help a bit more with performance, I've also added a profiler within the interface itself. Hopefully this will encourage self-policing with regard to filter performance.
Based on personal observations, the self-profiling is quite noisy. Sometimes a filter will report one value (say 5 ms) only to come back 5 minutes later and see the same filter report a value 20 times larger, and a few minutes after that it jumps back down.
Assuming that this behavior is a result of variations in the filter workload (and not some sort of profiling bug), it would be useful if you could increase the profiling window to better average over those fluctuations. Right now it is hard to tell which rules are slow or not because the numbers aren't very stable.
Yes, one filter (filter 32) I've been watching was taking 90-120ms for what seemed like simple checks (action, editcount, difference in bytes), so I moved the editcount check last, in case it had to pull that from the DB. The time dropped to ~3ms, but a couple of hours later, with no changes to the order, it's up to 20ms.
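For reference, the shape I mean is roughly this (not the actual filter 32; thresholds invented), with the possibly-DB-backed check last:

  action == "edit"
  & edit_delta < -1000       /* difference in bytes: cheap */
  & user_editcount < 10      /* may need a DB lookup, so it goes last */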
Related to this: It would be nice if there was a chart or something comparing how expensive certain variables and functions are.
On 3/19/09 12:21 PM, Alex wrote:
Yes, one filter (filter 32) I've been watching was taking 90-120ms for what seemed like simple checks (action, editcount, difference in bytes), so I moved the editcount check last, in case it had to pull that from the DB. The time dropped to ~3ms, but a couple of hours later, with no changes to the order, it's up to 20ms.
Well, a couple notes here:
The runtime of a filter will depend on what it's filtering -- large pages or pages with lots of links are more likely to take longer.
It probably makes sense to give some min/max/mean/average times or something... and a plot over time might be very helpful as well to help filter out (or show up!) outliers.
-- brion
Andrew Garrett wrote:
On Thu, Mar 19, 2009 at 11:54 AM, Platonides Platonides@gmail.com wrote:
PS: Why isn't there a link to Special:AbuseFilter/history/$id on the filter view?
There is.
Oops. I was looking for it on the top bar, not at the bottom. I stand corrected.
On 3/18/09 12:59 PM, Tim Starling wrote:
Brion Vibber wrote:
On 3/18/09 5:34 AM, Andrew Garrett wrote:
I am pleased to announce that the Abuse Filter [1] has been activated on English Wikipedia!
I've temporarily disabled it as we're seeing some performance problems saving edits at peak time today. Need to make sure there's functional per-filter profiling before re-enabling so we can confirm if one of the 55 active filters (!) is particularly bad or if we need to do overall optimization.
Done, took less than five minutes. Re-enabled.
We're still profiling at ~700ms CPU time per page save, with no particular rule dominant. Disabling 20 of them would help.
Not bad for a first production pass on the madness that is enwiki! :D
-- brion
This extension is very important for training machine learning vandalism detection bots. Recently published systems use only hundreds of examples of vandalism in training - not nearly enough to distinguish between the variety found in Wikipedia or generalize to new, unseen forms of vandalism. A large set of human created rules could be run against all previous edits in order to create a massive vandalism dataset. If one includes positive and negative examples of vandalism in training, practically the entire text of the history of Wikipedia can be used in the training set, possibly creating a remarkable bot.
On Wed, Mar 18, 2009 at 6:34 AM, Andrew Garrett agarrett@wikimedia.org wrote:
I am pleased to announce that the Abuse Filter [1] has been activated on English Wikipedia!
The Abuse Filter is an extension to the MediaWiki [2] software that powers Wikipedia. It allows automatic "filters" or "rules" to be run against every edit, and actions to be taken if any of those rules are triggered. It is designed to combat simple, pattern-based vandalism, from blanking pages to complicated evasive page-move vandalism.
We've already seen some pretty cool uses for the Abuse Filter. While there are filters for the obvious personal attacks [3], many of our filters are there just to identify common newbie mistakes such as page-blanking [4], give the users a friendly warning [5], and ask them if they really want to submit their edits.
The best part is that these friendly "soft" warning messages seem to work in passively changing user behaviour. Just the suggestion that we frown on page-blanking was enough to stop 56 of the 78 matches [6] of that filter when I checked. If you look closely, you'll even find that many of the users took our advice and redirected the page or did something else more constructive instead.
I'm very pleased to see my work being used so well on English Wikipedia, and I'm looking forward to seeing some quality filters in the near future! While some of the harsher actions, such as blocking, are disabled on Wikimedia at the moment, we're hoping that the filters developed will be good enough that we can think about activating them in the future.
If anybody has any questions or concerns about the Abuse Filter, feel free to file a bug [7], contact me on IRC (werdna on irc.freenode.net), post on my user talk page, or send me an email at agarrett at wikimedia.org
[1] http://www.mediawiki.org/wiki/Extension:AbuseFilter
[2] http://www.mediawiki.org
[3] http://en.wikipedia.org/wiki/Special:AbuseFilter/9
[4] http://en.wikipedia.org/wiki/Special:AbuseFilter/3
[5] http://en.wikipedia.org/wiki/MediaWiki:Abusefilter-warning-blanking
[6] http://en.wikipedia.org/w/index.php?title=Special:AbuseLog&wpSearchFilte...
[7] http://bugzilla.wikimedia.org
-- Andrew Garrett
Brian wrote:
This extension is very important for training machine learning vandalism detection bots. Recently published systems use only hundreds of examples of vandalism in training - not nearly enough to distinguish between the variety found in Wikipedia or generalize to new, unseen forms of vandalism. A large set of human created rules could be run against all previous edits in order to create a massive vandalism dataset.
As a machine-learning person, this seems like a somewhat problematic idea--- generating training examples *from a rule set* and then learning on them is just a very roundabout way of reconstructing that rule set. What you really want is a large dataset of human-labeled examples of vandalism / non-vandalism that *can't* currently be distinguished reliably by rules, so you can throw a machine-learning algorithm at the problem of trying to come up with some.
-Mark
On Thu, Mar 19, 2009 at 1:03 PM, Delirium delirium@hackish.org wrote:
Brian wrote:
This extension is very important for training machine learning vandalism detection bots. Recently published systems use only hundreds of examples of vandalism in training - not nearly enough to distinguish between the variety found in Wikipedia or generalize to new, unseen forms of vandalism. A large set of human created rules could be run against all previous edits in order to create a massive vandalism dataset.
As a machine-learning person, this seems like a somewhat problematic idea--- generating training examples *from a rule set* and then learning on them is just a very roundabout way of reconstructing that rule set. What you really want is a large dataset of human-labeled examples of vandalism / non-vandalism that *can't* currently be distinguished reliably by rules, so you can throw a machine-learning algorithm at the problem of trying to come up with some.
Since there's already a database, this sounds like it could be done by flagging edits as "vandalism" and then reading the existing database information to extract details like the IP, a diff of the change, etc. That way, humans define what "vandalism" is, and the machine can learn the meaning.
This may need a button or something, so users can report it and the database can flag the edit.
On 3/19/09 5:15 AM, Tei wrote:
Since there's already a database, this sounds like it could be done by flagging edits as "vandalism" and then reading the existing database information to extract details like the IP, a diff of the change, etc. That way, humans define what "vandalism" is, and the machine can learn the meaning.
This may need a button or something, so users can report it and the database can flag the edit.
*nod*
Part of the infrastructure for AbuseFilter was adding a tag marker system for edits and log entries, so filters can tag an event as potentially needing more review.
(This is different from, say, Flagged Revisions, which attempts to mark up a version of a page as having a certain overall state -- it's a *page* thing; here, individual actions can be tagged based only on their own internal changes, so similar *events* happening anywhere can be called up in a search for human review.)
It would definitely be useful to allow readers to provide similar feedback, much as many photo and video sharing sites allow visitors to flag something as 'inappropriate', which puts it into a queue for admins to look at more closely.
So far we don't have a manual tagging interface (and the tag-filtering views are disabled pending some query fixes), but the infrastructure is laid in. :)
-- brion
Cobi (owner of ClueBot) and his roommate Crispy have already been working hard to make this specific dataset, but they've been hurt by a lack of contributors. The page is here: http://en.wikipedia.org/wiki/User:Crispy1989#New_Dataset_Contribution_Interface
X!
On Mar 19, 2009, at 8:15 AM, Tei wrote:
On Thu, Mar 19, 2009 at 1:03 PM, Delirium delirium@hackish.org wrote:
Brian wrote:
This extension is very important for training machine learning vandalism detection bots. Recently published systems use only hundreds of examples of vandalism in training - not nearly enough to distinguish between the variety found in Wikipedia or generalize to new, unseen forms of vandalism. A large set of human created rules could be run against all previous edits in order to create a massive vandalism dataset.
As a machine-learning person, this seems like a somewhat problematic idea--- generating training examples *from a rule set* and then learning on them is just a very roundabout way of reconstructing that rule set. What you really want is a large dataset of human-labeled examples of vandalism / non-vandalism that *can't* currently be distinguished reliably by rules, so you can throw a machine-learning algorithm at the problem of trying to come up with some.
Since there's already a database, this sounds like it could be done by flagging edits as "vandalism" and then reading the existing database information to extract details like the IP, a diff of the change, etc. That way, humans define what "vandalism" is, and the machine can learn the meaning.
This may need a button or something, so users can report it and the database can flag the edit.
--
ℱin del ℳensaje.
I presented a talk at Wikimania 2007 that espoused the virtues of combining human measures of content with automatically determined measures in order to generalize to unseen instances. Unfortunately all those Wikimania talks seem to have been lost. It was related to this article on predicting the quality ratings provided by the Wikipedia Editorial Team:
Rassbach, L., Pincock, T., Mingus, B. (2007). "Exploring the Feasibility of Automatically Rating Online Article Quality". http://upload.wikimedia.org/wikipedia/wikimania2007/d/d3/RassbachPincockMing...
Delirium, you do make it sound as if merely having the tagged dataset solves the entire problem. But there are really multiple problems. One is learning to classify what you have been told is in the dataset (e.g., that all instances of this rule in the edit history *really are* vandalism). The other is learning new reasons that an edit is vandalism, based on all the other occurrences of vandalism and non-vandalism and a sophisticated pre-parse of all the content that breaks it down into natural language features. Finally, you then wish to use this system to bootstrap a vandalism detection system that can generalize to entirely new instances of vandalism.
The primary way of doing this is to use positive and *negative* examples of vandalism in conjunction with their features. A good set of example features is an article or an edit's conformance with the Wikipedia Manual of Style. I never implemented the entire MoS, but I did do quite a bit of it and it is quite indicative of quality.
Generally speaking, it is not true that you can only draw conclusions about what is immediately available in your dataset. It is true that, with the exception of people, machine learning systems struggle with generalization.
On Thu, Mar 19, 2009 at 6:03 AM, Delirium delirium@hackish.org wrote:
Brian wrote:
This extension is very important for training machine learning vandalism detection bots. Recently published systems use only hundreds of examples of vandalism in training - not nearly enough to distinguish between the variety found in Wikipedia or generalize to new, unseen forms of vandalism. A large set of human created rules could be run against all previous edits in order to create a massive vandalism dataset.
As a machine-learning person, this seems like a somewhat problematic idea--- generating training examples *from a rule set* and then learning on them is just a very roundabout way of reconstructing that rule set. What you really want is a large dataset of human-labeled examples of vandalism / non-vandalism that *can't* currently be distinguished reliably by rules, so you can throw a machine-learning algorithm at the problem of trying to come up with some.
-Mark
I just wanted to be really clear about what I mean as a specific counter-example to this just being an example of "reconstructing that rule set." Suppose you use the AbuseFilter rules on the entire history of the wiki in order to generate a dataset of positive and negative examples of vandalism edits. You should then *throw the rules away* and attempt to discover features that separate the vandalism into classes correctly, more or less in the blind.
The key, then, is feature discovery, and a machine system has the potential to do this in a more effective way than a human by virtue of its ability to read the entire encyclopedia.
On Thu, Mar 19, 2009 at 2:30 PM, Brian Brian.Mingus@colorado.edu wrote:
I presented a talk at Wikimania 2007 that espoused the virtues of combining human measures of content with automatically determined measures in order to generalize to unseen instances. Unfortunately all those Wikimania talks seem to have been lost. It was related to this article on predicting the quality ratings provided by the Wikipedia Editorial Team:
Rassbach, L., Pincock, T., Mingus, B. (2007). "Exploring the Feasibility of Automatically Rating Online Article Quality". http://upload.wikimedia.org/wikipedia/wikimania2007/d/d3/RassbachPincockMing...
Delirium, you do make it sound as if merely having the tagged dataset solves the entire problem. But there are really multiple problems. One is learning to classify what you have been told is in the dataset (e.g., that all instances of this rule in the edit history *really are* vandalism). The other is learning new reasons that an edit is vandalism, based on all the other occurrences of vandalism and non-vandalism and a sophisticated pre-parse of all the content that breaks it down into natural language features. Finally, you then wish to use this system to bootstrap a vandalism detection system that can generalize to entirely new instances of vandalism.
The primary way of doing this is to use positive and *negative* examples of vandalism in conjunction with their features. A good set of example features is an article or an edit's conformance with the Wikipedia Manual of Style. I never implemented the entire MoS, but I did do quite a bit of it and it is quite indicative of quality.
Generally speaking, it is not true that you can only draw conclusions about what is immediately available in your dataset. It is true that, with the exception of people, machine learning systems struggle with generalization.
On Thu, Mar 19, 2009 at 6:03 AM, Delirium delirium@hackish.org wrote:
Brian wrote:
This extension is very important for training machine learning vandalism detection bots. Recently published systems use only hundreds of examples of vandalism in training - not nearly enough to distinguish between the variety found in Wikipedia or generalize to new, unseen forms of vandalism. A large set of human created rules could be run against all previous edits in order to create a massive vandalism dataset.
As a machine-learning person, this seems like a somewhat problematic idea--- generating training examples *from a rule set* and then learning on them is just a very roundabout way of reconstructing that rule set. What you really want is a large dataset of human-labeled examples of vandalism / non-vandalism that *can't* currently be distinguished reliably by rules, so you can throw a machine-learning algorithm at the problem of trying to come up with some.
-Mark
Brian wrote:
I just wanted to be really clear about what I mean as a specific counter-example to this just being an example of "reconstructing that rule set." Suppose you use the AbuseFilter rules on the entire history of the wiki in order to generate a dataset of positive and negative examples of vandalism edits. You should then *throw the rules away* and attempt to discover features that separate the vandalism into classes correctly, more or less in the blind.
That's precisely the case where you're attempting to reconstruct the original rule set (or some work-alike). If you had positive and negative examples that were actually "known good" examples of edits that really are vandalism, and really aren't vandalism, then yes you could turn loose an algorithm to generalize over them to discover a discriminator between the "is vandalism" and "isn't vandalism" classes. But if your labels are from the output of the existing AbuseFilter, then your training classes are really "is flagged by the AbuseFilter" and "is not flagged by the AbuseFilter", and any machine-learning algorithm will try to generalize the examples in a way that discriminates *those* classes. To the extent the AbuseFilter actually does flag vandalism accurately, you'll learn a concept approximating that of vandalism. But to the extent it doesn't (e.g. if it systematically mis-labels certain kinds of edits), you'll learn the same flaws.
That might not be useless--- you might recover a more concise rule set that replicates the original performance. But if your training data is the output of the previous rule set, you aren't going to be able to *improve* on its performance without some additional information (or built-in inductive bias).
-Mark
Ultimately we need a system that integrates information from multiple sources, such as WikiTrust, AbuseFilter and the Wikipedia Editorial Team.
A general point - there is a *lot* of information contained in edits that AbuseFilter cannot practically characterize due to the complexity of language and the subtlety of certain types of abuse. A system with access to natural language features (and wikitext features) could theoretically detect them. My quality research group considered including features relating to the [[Thematic relation]]s found in an article (we have access to a thematic role parser), which could potentially be used to detect bad writing - indicative of the edit containing vandalism.
On Thu, Mar 19, 2009 at 3:17 PM, Delirium delirium@hackish.org wrote:
But if your training data is the output of the previous rule set, you aren't going to be able to *improve* on its performance without some additional information (or built-in inductive bias).
-Mark
On Thu, Mar 19, 2009 at 5:26 PM, Brian Brian.Mingus@colorado.edu wrote:
A general point - there is a *lot* of information contained in edits that AbuseFilter cannot practically characterize due to the complexity of language and the subtlety of certain types of abuse. A system with access to natural language features (and wikitext features) could theoretically detect them.
And how poorly would *that* perform, if the current AbuseFilter already has performance problems? :)
2009/3/19 Aryeh Gregor Simetrical+wikilist@gmail.com:
On Thu, Mar 19, 2009 at 5:26 PM, Brian Brian.Mingus@colorado.edu wrote:
A general point - there is a *lot* of information contained in edits that AbuseFilter cannot practically characterize due to the complexity of language and the subtlety of certain types of abuse. A system with access to natural language features (and wikitext features) could theoretically detect them.
And how poorly would *that* perform, if the current AbuseFilter already has performance problems? :)
Research box, toolserver cluster! :-D
- d.
Brian wrote:
Delirium, you do make it sound as if merely having the tagged dataset solves the entire problem. But there are really multiple problems. One is learning to classify what you have been told is in the dataset (e.g., that all instances of this rule in the edit history *really are* vandalism). The other is learning new reasons that an edit is vandalism, based on all the other occurrences of vandalism and non-vandalism and a sophisticated pre-parse of all the content that breaks it down into natural language features. Finally, you then wish to use this system to bootstrap a vandalism detection system that can generalize to entirely new instances of vandalism.
Generally speaking, it is not true that you can only draw conclusions about what is immediately available in your dataset. It is true that, with the exception of people, machine learning systems struggle with generalization.
My point is mainly that using the *results* of an automated rule system as *input* to a machine-learning algorithm won't constitute training on "vandalism", but on "what the current rule set considers vandalism". I don't see a particularly good reason to find new reasons an edit is vandalism for edits that we already correctly predict. What we want is new discriminators for edits we *don't* correctly predict. And for those, you can't use the labels-given-by-the-current rules as the training data, since if the current rule set produces false positives, those are now positives in your training set; and if the rule set has false negatives, those are now negatives in your training set.
I suppose it could be used for proposing hypotheses to human discriminators. For example, you can propose new feature X, if you find that 95% of the time the existing rule set flags edits with feature X as vandalism, and by human inspection determine that the remaining 5% were false negatives, so actually feature X should be a new "this is vandalism" feature. But you need that human inspection--- you can't automatically discriminate between rules that improve the filter set's performance and rules that decrease it if your labeled data set is the one with the mistakes in it.
-Mark
AG> frown on page-blanking
For now I just stop them on my wikis with $wgSpamRegex=array('/^\B$/'); I haven't tried fancier solutions yet.
However, that simply disallows them all. On enwiki, the blanking filter warns the user, and lets them go through with it after confirmation.
X!
On Mar 18, 2009, at 4:51 PM, jidanni@jidanni.org wrote:
AG> frown on page-blanking
For now I just stop them on my wikis with $wgSpamRegex=array('/^\B$/'); I haven't tried fancier solutions yet.