Felipe Ortega wrote:
I also have my doubts about the filtering conditions. For instance, in eswiki, 'BOTpolicia' is not registered as a bot and it's responsible for more than 90,000 edits so far. On the other hand, a famous user in eswiki (retired for the moment, id=13770 to be precise)
He has returned, ~500 edits this week ;)
Filtering by number of edits/hour or similar may require a lot of time/resources, especially in larger Wikipedias (sorry, but for my thesis I'm mainly focused on the top-ten Wikipedias :) ).
The problem is that here you need the edits *per user*, not per page. I understand from the WikiXRay page that you're recreating the MediaWiki tables. It would just be a matter of querying each user's contributions and checking the time difference between edits. With indexes in place, the running time should be good enough.
Where it may get terribly slow is if you apply it to all users, as that would make the algorithm quadratic.
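A rough sketch of the per-user check I mean, in Python against an in-memory SQLite database so it runs standalone. I'm assuming a `revision` table with `rev_user` and `rev_timestamp` columns (14-character `YYYYMMDDHHMMSS` strings), as in the MediaWiki schema; the index name, the function names and the demo rows are made up for illustration, and your recreated tables may well differ.

```python
import sqlite3
from datetime import datetime


def parse_ts(ts):
    """Parse a 14-character YYYYMMDDHHMMSS timestamp string."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S")


def max_edits_in_window(conn, user_id, window_seconds=3600):
    """Largest number of edits one user made within any window of the given length."""
    # With an index on (rev_user, rev_timestamp) the rows come back already ordered.
    rows = conn.execute(
        "SELECT rev_timestamp FROM revision "
        "WHERE rev_user = ? ORDER BY rev_timestamp",
        (user_id,),
    ).fetchall()
    times = [parse_ts(r[0]) for r in rows]

    best = 0
    start = 0
    # Sliding window: each timestamp enters and leaves the window once,
    # so this pass is linear in the number of edits of this user.
    for end, t in enumerate(times):
        while (t - times[start]).total_seconds() >= window_seconds:
            start += 1
        best = max(best, end - start + 1)
    return best


def make_demo_db():
    """Build a tiny hypothetical revision table for trying the function out."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision ("
        "rev_id INTEGER PRIMARY KEY, rev_user INTEGER, rev_timestamp TEXT)"
    )
    # Assumed index; this is what keeps the per-user query cheap.
    conn.execute(
        "CREATE INDEX rev_user_timestamp ON revision (rev_user, rev_timestamp)"
    )
    conn.executemany(
        "INSERT INTO revision (rev_user, rev_timestamp) VALUES (?, ?)",
        [
            (1, "20080101120000"),
            (1, "20080101120010"),
            (1, "20080101120020"),
            (1, "20080101130500"),
            (2, "20080101000000"),
            (2, "20080102000000"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    for user_id in (1, 2):
        print(user_id, max_edits_in_window(conn, user_id))
```

You would then flag a user as a probable bot when the result exceeds some threshold of your choosing. For a single user this is one indexed query plus one pass over that user's edits.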