Good suggestion from Steven:

No hourly reports over a month long, and no daily reports over a year long.  Does that seem fair?
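
For what it's worth, here is a rough sketch of how such a cap might be checked. Everything in it is an assumption for illustration: the names `MAX_SPAN` and `report_allowed` are hypothetical, and "a month" and "a year" are approximated as 31 and 366 days.

```python
from datetime import datetime, timedelta

# Hypothetical limits based on the suggestion above. "Month" and "year" are
# approximated as 31 and 366 days; that rounding is an assumption.
MAX_SPAN = {
    "hourly": timedelta(days=31),
    "daily": timedelta(days=366),
}


def report_allowed(start, end, granularity):
    """Return True if the requested span fits under the cap for this granularity."""
    limit = MAX_SPAN.get(granularity)
    if limit is None:
        # No cap is defined for other granularities in this sketch.
        return True
    return (end - start) <= limit
```

With these limits, an hourly report from 2012-01-01 to 2013-01-01 would be rejected, while the same range as a daily report would still be allowed.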

Dan


On Sat, Nov 2, 2013 at 12:00 AM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
Hi,

I just noticed someone ran a query from 2012 to 2013 as a timeseries by hour.  This... creates a *lot* of data.  For the cohort they used, it's about 1.8 million data points.  Should we cap report sizes somehow?  It poses no immediate danger beyond taking up a lot of resources and computation time, plus I/O time spent logging the results (the log currently acts as a rudimentary backup - perhaps this is ill-conceived).

In this case it looks like it may have been a mistake, so one idea is to warn users that they are about to generate a lot of data, and to ask them to confirm.
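
A minimal sketch of what that check could look like, under stated assumptions: it supposes one data point per cohort member per time bucket, and the names `STEP`, `WARN_THRESHOLD`, `estimated_points` and `needs_confirmation`, as well as the threshold value, are all hypothetical.

```python
from datetime import datetime, timedelta

# Size of one time bucket for each granularity.
STEP = {"hourly": timedelta(hours=1), "daily": timedelta(days=1)}

# Hypothetical threshold above which the user would be asked to confirm.
WARN_THRESHOLD = 100_000


def estimated_points(start, end, granularity, cohort_size):
    """Estimate the report size, assuming one data point per member per bucket."""
    buckets = (end - start) // STEP[granularity]  # whole buckets in the range
    return buckets * cohort_size


def needs_confirmation(start, end, granularity, cohort_size):
    """Return True if the estimate is large enough to warrant a warning."""
    return estimated_points(start, end, granularity, cohort_size) > WARN_THRESHOLD
```

The estimate is cheap to compute before the report runs, so the warning could be shown at submission time rather than after the data has been generated.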

Thoughts?

Dan