So, assuming that user wasn’t me <kidding>... how about some kind of throttling for non-WMF users?
The limits sound fair anyway, but I see external researchers (and even community members interested in historical data) using this tool to collect very long data series.
Dario
On Nov 1, 2013, at 9:34 PM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
Good suggestion from Steven:
No hourly reports over a month long, no daily reports over a year long. Does that seem fair?
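A rough sketch of how such a cap could be checked, in Python. Everything here is an assumption for illustration and not existing Wikimetrics code: the function name `report_span_allowed`, the `MAX_SPAN` table, and the reading of "a month" as 31 days and "a year" as 366 days.

```python
from datetime import datetime, timedelta

# Assumed limits: "a month" is read as 31 days, "a year" as 366 days.
MAX_SPAN = {
    "hourly": timedelta(days=31),
    "daily": timedelta(days=366),
}


def report_span_allowed(start, end, granularity):
    """Return True if the requested date range is within the cap
    for the given timeseries granularity."""
    if end < start:
        raise ValueError("end must not be before start")
    limit = MAX_SPAN.get(granularity)
    if limit is None:
        # No cap is assumed for granularities not listed above.
        return True
    return (end - start) <= limit
```

With these assumed limits, an hourly report from 2012-01-01 to 2013-01-01 would be rejected, while a daily report over the same range would still be allowed.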
Dan
On Sat, Nov 2, 2013 at 12:00 AM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
Hi,
I just noticed someone ran a query from 2012 to 2013 as a timeseries by hour. This... creates a *lot* of data. For the cohort they used, it's about 1.8 million pieces of data. Should we cap report sizes somehow? It doesn't pose any immediate danger other than taking up a lot of resources and computation time, as well as I/O time spent logging the results (the log is currently acting as a rudimentary backup; perhaps this is ill-conceived).
In this case it looks like maybe it was a mistake, so one idea is to warn the user that they are about to generate a lot of data, and to ask them to confirm.
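To make the warning idea concrete, here is a minimal sketch in Python that estimates the size of a report before it runs and decides whether to ask for confirmation. The names `estimate_data_points` and `needs_confirmation` and the threshold of 100,000 values are all hypothetical, and the estimate assumes one value per cohort member per time bucket.

```python
from datetime import datetime, timedelta

BUCKET = {"hourly": timedelta(hours=1), "daily": timedelta(days=1)}
WARN_THRESHOLD = 100000  # assumed value, chosen only for illustration


def estimate_data_points(start, end, granularity, cohort_size):
    """Estimate how many values a timeseries report would produce,
    assuming one value per cohort member per time bucket."""
    buckets, remainder = divmod(end - start, BUCKET[granularity])
    if remainder:
        buckets += 1  # count a trailing partial bucket as a full one
    return buckets * cohort_size


def needs_confirmation(start, end, granularity, cohort_size,
                       threshold=WARN_THRESHOLD):
    """Return True if the user should confirm before the report runs."""
    estimate = estimate_data_points(start, end, granularity, cohort_size)
    return estimate >= threshold
```

For example, with a hypothetical cohort of 200 users, an hourly series from 2012-01-01 to 2013-01-01 comes to 8,784 hours times 200, or 1,756,800 values, which is the same order of magnitude as the 1.8 million mentioned above and would trigger the confirmation prompt.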
Thoughts?
Dan
Wikimetrics mailing list
Wikimetrics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimetrics