Felipe Ortega wrote:
I also have my doubts about the filtering conditions.
For instance, in eswiki, 'BOTpolicia' is not registered as such
and it's responsible for more than 90,000 edits so far. On
the other hand, a famous user in eswiki (retired at the
moment, id=13770 to be precise)
He has returned, ~500 edits this week ;)
Filtering by number of edits/hour or similar may require
a lot of time/resources, especially in larger Wikipedias
(sorry, but for my thesis I'm mainly focused on the top-ten
Wikipedias :) ).
The problem is that here you need the edits *per user*, not per page.
I understand from the WikiXRay page that you're recreating the MediaWiki
tables. It would just be a matter of querying each user's contributions and
checking the time difference between edits.
With indexes in place, the running time should be good enough.
Where it may get terribly slow is if you apply it to all users, as that
would make the algorithm quadratic.
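
Something along these lines, as a rough sketch of the per-user check. It assumes a MediaWiki-like revision table with rev_user and rev_timestamp columns, and it uses an in-memory SQLite database only so that it runs on its own. The function names, the index name, the sample rows and the one-hour window are my own picks for illustration, not anything taken from WikiXRay.

```python
import sqlite3
from datetime import datetime

# MediaWiki-style timestamps look like 20080101120000 (assumed format).
TS_FORMAT = "%Y%m%d%H%M%S"


def max_edits_per_hour(conn, user_id):
    """Largest number of edits by one user inside any 60-minute window."""
    rows = conn.execute(
        "SELECT rev_timestamp FROM revision "
        "WHERE rev_user = ? ORDER BY rev_timestamp",
        (user_id,),
    ).fetchall()
    times = [datetime.strptime(r[0], TS_FORMAT) for r in rows]

    best = 0
    start = 0
    # Sliding window over the sorted timestamps: advance 'start' until
    # the window [start, end] spans less than one hour.
    for end, t in enumerate(times):
        while (t - times[start]).total_seconds() >= 3600:
            start += 1
        best = max(best, end - start + 1)
    return best


def demo():
    """Build a tiny made-up revision table and check two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision ("
        "rev_id INTEGER PRIMARY KEY, rev_user INTEGER, rev_timestamp TEXT)"
    )
    # Hypothetical index name; this is what keeps the per-user query cheap.
    conn.execute(
        "CREATE INDEX user_timestamp ON revision (rev_user, rev_timestamp)"
    )
    sample = [
        (1, "20080101120000"), (1, "20080101120010"),
        (1, "20080101120020"), (1, "20080101150000"),
        (2, "20080101120000"), (2, "20080102120000"),
    ]
    conn.executemany(
        "INSERT INTO revision (rev_user, rev_timestamp) VALUES (?, ?)", sample
    )
    return max_edits_per_hour(conn, 1), max_edits_per_hour(conn, 2)
```

A user whose result is above some threshold would be a candidate bot. What threshold to use is a separate question that this sketch does not try to answer.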