Dan,

I think the warning is important and would help prevent this type of query from being run by mistake.  I have seen it almost happen, and given the rate at which Sarah and our interns have been pulling data, I have heard them wince at choosing the wrong command at times. Anyway, I support your idea to institute a warning.

Thanks,

Jaime  

-- 

Jaime Anstee, Ph.D
Program Evaluation Specialist
Wikimedia Foundation
+1.415.839.6885 ext 6869
www.wikimediafoundation.org

Imagine a world in which every single human being can freely share in the sum of all knowledge. Help us make it a reality!
https://donate.wikimedia.org



On Sat, Nov 2, 2013 at 5:38 AM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
Well, Dario, it was actually someone at WMF.  But I don't think that should matter much.  Let's do this as a compromise:

If someone runs an hourly report longer than a month or a daily report longer than a year, we give them a warning telling them what's going to happen.  If they say OK, we have to assume they know what they're doing and really need the data.
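
Roughly, the check could look something like the sketch below. This is only an illustration of the rule, not existing Wikimetrics code: the function name, the granularity labels, and the exact day counts (31 for "a month", 365 for "a year") are all assumptions.

```python
from datetime import date, timedelta

# Hypothetical thresholds for the proposed warning. The day counts are
# assumptions standing in for "a month" and "a year".
MAX_SPAN = {
    "hourly": timedelta(days=31),
    "daily": timedelta(days=365),
}


def needs_warning(granularity, start, end):
    """Return True if the report spans more than the limit for its granularity."""
    limit = MAX_SPAN.get(granularity)
    if limit is None:
        # No limit was proposed for other granularities, so never warn.
        return False
    return (end - start) > limit
```

With these assumed limits, needs_warning("hourly", date(2013, 10, 1), date(2013, 11, 2)) would trigger the warning, and the report would only run once the user confirms.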

I know I accidentally ran a really long query once, so we'd at least guard against that.  Like I said though, even that crazy long query last night didn't cause any huge problems.  It just used up a bit of memory and slowed access to the wikimetrics server for a few hours.  There are a couple of simple monitoring, tracing, and backup improvements I could make in order to alleviate that as well.  So if it keeps happening despite the warning, I'll just do that.

Dan


On Sat, Nov 2, 2013 at 2:18 AM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
and that’s why we need throttling anyway

On Nov 1, 2013, at 11:17 PM, Dario Taraborelli <dario@wikimedia.org> wrote:

that’s correct, the original plan was to build an API.

On Nov 1, 2013, at 11:08 PM, Steven Walling <swalling@wikimedia.org> wrote:


On Fri, Nov 1, 2013 at 11:02 PM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
The limits sound fair anyway, but I see external researchers (and even community members interested in historical data) using this tool to collect very long data series.

I think that use case is out of scope for Wikimetrics. It's getting dangerously close to using Wikimetrics as a general data platform or service, rather than sticking to getting human-readable results for standardized metrics. It's okay to go back months or years in time, but not at a level of detail that can't be interpreted without heavy further processing of the result.

--
Steven Walling,
Product Manager
_______________________________________________
Wikimetrics mailing list
Wikimetrics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimetrics


