On Thu, Nov 28, 2013 at 3:17 AM, Ori Livneh <ori(a)wikimedia.org> wrote:
> It doesn't make sense to do it that way. Instead of inferring that
> something must have happened by cross-referencing conditions across
> datasets, just do the following: in MediaWiki, every time a user makes an
> edit, check their registration date and edit count. If the date is within
> the last thirty days and the edit count is 5, log an event. Doing it this
> way will easily scale to the entire cluster, not just enwiki, and to any
> number of bins, not just 5 edits.
> Patch at <https://gerrit.wikimedia.org/r/#/c/98079/>; you can take it
> from there if you like.
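
For illustration only, the check described above could be sketched roughly as follows. This is a Python sketch and not the Gerrit patch itself; the function name, the event fields, and the assumption that the edit count already includes the edit just saved are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical configuration: edit-count bins to log, and the
# "new user" window measured from registration.
MILESTONES = frozenset({5})
WINDOW = timedelta(days=30)


def milestone_event(registered, edit_count, now, wiki="enwiki", milestones=MILESTONES):
    """Return an event dict if this edit hits a milestone, else None.

    Assumes edit_count already includes the edit that was just saved,
    and that registered may be None for accounts with no recorded date.
    """
    if edit_count not in milestones:
        return None
    if registered is None:
        return None
    age = now - registered
    if age > WINDOW:
        return None
    # The event fields here are illustrative, not a real schema.
    return {"wiki": wiki, "milestone": edit_count, "days_since_registration": age.days}
```

Passing a larger set, such as `milestones={1, 5, 10}`, would cover more bins without any change to the check itself.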
Thanks Ori - this sounds and looks viable to me, and seems like a better
solution. Kenan, Jon, Dario, Dan, et al - can we move forward with this?
--
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687