Hi!
We talked about this briefly in our meeting today. I got the links from Aaron for the original slides[1] and talk[2] (starts about halfway through). I don't think we are using any features with as strong a bias as the logged-in bit for ORES, but it's still something to think about.
I've noticed when reviewing edits (mostly on Wikidata) that ORES alerts a lot on anonymous edits, even benign ones. From one point of view that's correct, since most vandalism is done by anonymous editors, but from the other, it doesn't help when we need to distinguish specifically between good and bad anon edits.
I guess the problem is that there's only one score (AFAIK?), in which being anonymous is indeed a strong marker, but we really need to distinguish good edits from bad edits *inside* the set of anonymous edits, and an algorithm that lumps them all together, while formally doing pretty well on the *whole* data set, is not very useful for this purpose. I guess we may also be encountering this when optimizing all kinds of corner cases, where our changes may be barely visible on the whole data set, even if they make a specific case better (or worse :).
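To make that concrete, here's a toy simulation (the vandalism rates are made up for illustration, not real ORES numbers). A "model" that leans entirely on the anon bit looks decent on the whole data set but is useless for separating good from bad edits *within* the anonymous subset:

```python
import random

random.seed(0)

# Toy data: each edit is (is_anon, is_vandalism).
# Assume 30% of edits are anonymous; vandalism rate is 40% among
# anon edits and 5% among logged-in edits (made-up numbers).
edits = []
for _ in range(10_000):
    is_anon = random.random() < 0.30
    p_bad = 0.40 if is_anon else 0.05
    edits.append((is_anon, random.random() < p_bad))

# A "model" that scores purely on the anon bit: flag every anon edit as bad.
def flagged_as_bad(is_anon):
    return is_anon

overall_acc = sum(flagged_as_bad(a) == v for a, v in edits) / len(edits)

anon_edits = [(a, v) for a, v in edits if a]
anon_acc = sum(flagged_as_bad(a) == v for a, v in anon_edits) / len(anon_edits)

print(f"accuracy on the whole data set: {overall_acc:.2f}")   # ~0.78
print(f"accuracy within anon edits:    {anon_acc:.2f}")       # ~0.40
```

With these rates the model scores close to 80% overall, yet inside the anon subset it's right only about 40% of the time, worse than just assuming every anon edit is good. That's the lumping-together problem: whole-data-set metrics can hide that a feature carries no signal within the very slice we care about.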