A good machine learning algorithm should be able to distinguish the
categories we care about (good vs. bad) within a large sub-category
(anonymous edits). That's supposed to be one of the advantages over a
simple scoring formula: the same feature can be weighted differently
(even positively vs. negatively) depending on the exact combination of
other features it appears with.
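For instance, a tree-based learner can pick up exactly that kind of
interaction, where a single linear score cannot. Here's a minimal
sketch with synthetic data (the feature names and label model below
are made up for illustration; they are not ORES's real features):

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 5000
is_anon = rng.integers(0, 2, n)
badwords = rng.random(n)           # hypothetical content feature
size_change = rng.normal(0, 1, n)  # hypothetical edit-size feature

# Synthetic labels: among anon edits, damage depends strongly on
# content; among logged-in edits, it barely does.
p_bad = np.where(is_anon == 1, 0.1 + 0.7 * badwords, 0.05 + 0.1 * badwords)
y = rng.random(n) < p_bad

X = np.column_stack([is_anon, badwords, size_change])
model = GradientBoostingClassifier().fit(X, y)

# The ensemble learns the is_anon x badwords interaction, so two anon
# edits with different content get very different scores.
print(model.predict_proba([[1, 0.05, 0.0], [1, 0.95, 0.0]])[:, 1])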

It may have been difficult for ORES to make that distinction because
the signal within the anonymous sub-group was too noisy. At the end of
the relevant section of the video (around 54m30s), Aaron mentions using
a grammar to parse each edit so that more of its content is taken into
account, and the content is where the real distinction among anonymous
users has to be made.
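As a very rough sketch of that idea, content-based features can be
derived from the inserted text of an edit rather than from user
metadata. The token classes and the word list below are invented for
illustration; ORES's real feature extraction is far richer:

import re

INFORMAL = {"lol", "dumb", "stupid", "haha"}  # invented word list

def content_features(inserted_text):
    tokens = re.findall(r"[A-Za-z']+|\d+|[^\sA-Za-z\d]", inserted_text)
    words = [t.lower() for t in tokens if t[0].isalpha()]
    return {
        "tokens": len(tokens),
        "informal_ratio":
            sum(w in INFORMAL for w in words) / max(len(words), 1),
        "uppercase_ratio":
            sum(t.isupper() for t in tokens) / max(len(tokens), 1),
        "punct_runs": len(re.findall(r"[!?]{2,}", inserted_text)),
    }

print(content_features("This article is dumb LOL!!!"))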

(The other interesting takeaway from his presentation is that if you want to really vandalize Wikipedia successfully, make an account and then wait 8 years—then you'll be free to do anything!)


Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation


On Wed, Jan 31, 2018 at 6:00 PM, Stas Malyshev <smalyshev@wikimedia.org> wrote:
Hi!

> We talked about this briefly in our meeting today; I got the links
> from Aaron for the original slides[1] and talk[2] (starts about
> halfway through). I don't think we are using any features with as
> strong a bias as the logged-in bit for ORES, but it's still
> something to think about.

I've noticed when reviewing edits (mostly on Wikidata) that ORES
alerts a lot on anonymous edits, even benign ones. From one point of
view, that's correct - most vandalism is done by anonymous accounts -
but from the other, it doesn't help when we need to distinguish
specifically between good and bad anonymous edits.

I guess the problem is that there's only one score (AFAIK?), in which
anonymity is indeed a strong marker, but we really need to distinguish
good edits from bad edits *inside* the set of anonymous edits, and an
algorithm that lumps them all together, while formally doing pretty
well on the *whole* data set, is not very useful for this purpose. I
guess we may also be encountering this when optimizing all kinds of
corner cases, where our changes may be hardly visible on the whole
data set even if they make a special case better (or worse :).
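To make that concrete, here's a toy sketch (purely synthetic data and
a made-up second feature) of how a model leaning on the anon bit can
look fine on the whole set while being nearly useless inside the
anonymous subset:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 10000
is_anon = rng.integers(0, 2, n)
noise = rng.normal(0, 1, n)  # a feature carrying no real signal
# Labels correlate mostly with is_anon, only weakly with anything else.
y = rng.random(n) < np.where(is_anon == 1, 0.4, 0.05)

X = np.column_stack([is_anon, noise])
scores = LogisticRegression().fit(X, y).predict_proba(X)[:, 1]

print("AUC, whole set:", roc_auc_score(y, scores))
print("AUC, anon only:", roc_auc_score(y[is_anon == 1],
                                       scores[is_anon == 1]))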

--
Stas Malyshev
smalyshev@wikimedia.org

_______________________________________________
Discovery mailing list
Discovery@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/discovery