Heh, besides the fact that we don't have threshold, this ran on wikimetrics in like 10 seconds:

https://metrics.wmflabs.org/reports/result/c06d3de0-060f-4434-9bd1-ba5606e15dfc.json

That confirms my suspicion that the problem is the cross-db join, not the mediawiki tables.


On Thu, Nov 21, 2013 at 6:10 PM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
OK, so to sum this up:

* wikimetrics won't help us yet because timeseries isn't implemented for threshold
* the query is just friggin slow, and I don't think there's anything we can do about it
* SILVER LINING: with this new metric definition, we don't have to re-compute the entire history every day.  So my proposal is to do this daily:

1. dump that day's worth of data into a temp table in enwiki_p
2. join that to revision_userindex and run the query I wrote
3. concatenate the results manually, maybe with some simple python
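
Step 3 could look something like the sketch below. It assumes each daily run produces a single count and that the running history lives in a two-column CSV; the function name append_daily_result and the date,value layout are made up for illustration, not taken from the actual dashboard.

```python
import csv
import io
from datetime import date


def append_daily_result(history_csv: str, day: date, value: int) -> str:
    """Return the history CSV with one row for `day` added.

    If a row for `day` already exists it is replaced, so re-running a
    day's job does not create duplicates. (Hypothetical helper.)
    """
    rows = {}
    for row in csv.reader(io.StringIO(history_csv)):
        # Skip blank lines and the header row.
        if row and row[0] != "date":
            rows[row[0]] = row[1]

    rows[day.isoformat()] = str(value)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date", "value"])
    # ISO dates sort correctly as plain strings.
    for d in sorted(rows):
        writer.writerow([d, rows[d]])
    return out.getvalue()
```

Keying the rows by date is a deliberate choice here: only the new day has to be computed, and a re-run for the same day overwrites its row instead of appending a second one.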

It's a bitch, but it's all we can do until we tackle this problem the right way (big data, data warehouse, etc.)


On Thu, Nov 21, 2013 at 5:48 PM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
The metric in WikiMetrics could be used for this, but the cohort would have to be re-generated all the time.  I implemented the new logic that Kenan described.  The problem is, this query runs crazy slow on the slaves:


I'm gonna try generating a huge cohort, upload it to wikimetrics and see where I get.


On Thu, Nov 21, 2013 at 5:28 PM, Toby Negrin <tnegrin@wikimedia.org> wrote:
I confirmed with Kenan that the metric is similar to the activation metric used in WikiMetrics.

-Toby


On Thu, Nov 21, 2013 at 2:07 PM, Kenan Wang <kwang@wikimedia.org> wrote:
Dario and I chatted about this. The original purpose of this dashboard was to be more of an acquisition dashboard. To fit that requirement, Dario suggested that we use an activation metric.

i.e. How many users in month X reached 5 edits within 30 days.

Is that possible?
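
To make the definition concrete, here is a rough sketch of that activation metric over in-memory data. It assumes "within 30 days" is measured from each user's registration time, which the thread does not spell out; the function name, argument shapes, and defaults are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def activated_per_month(registrations, edits, min_edits=5, window_days=30):
    """Count activated users per registration month.

    registrations: {user: datetime of registration}
    edits: iterable of (user, datetime of edit)
    Returns {"YYYY-MM": users registered that month who made at least
    `min_edits` edits within `window_days` of registering}.
    """
    window = timedelta(days=window_days)

    # Count only the edits that fall inside each user's own window.
    in_window = Counter()
    for user, ts in edits:
        registered = registrations.get(user)
        if registered is not None and registered <= ts < registered + window:
            in_window[user] += 1

    result = defaultdict(int)
    for user, registered in registrations.items():
        if in_window[user] >= min_edits:
            result[registered.strftime("%Y-%m")] += 1
    return dict(result)
```

Because each user's window is fixed at registration, a month's number stops changing once 30 days have passed since the end of that month.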


On Wed, Nov 20, 2013 at 12:05 PM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
On Wed, Nov 20, 2013 at 11:34 AM, Kenan Wang <kwang@wikimedia.org> wrote:
That would be good. It is supposed to be in line with the active editor metric.

Wait, so it *is* supposed to be like the active editor metric?  To be very clear, we have two possible questions we can be answering here, and only one of them is like the active editor metric.  Both of these questions will show results for X <- {each of the last 12 months}

1. How many people, of those that created their account on mobile, made at least 5 edits in month X?  [THIS IS LIKE ACTIVE EDITOR]

2. How many people created their account on mobile in month X, and made at least 5 edits since then?

I've implemented both, but there is a pretty big difference so we need to figure that out before moving on.
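
A small sketch of how the two questions differ, using in-memory data rather than the real tables. The names are hypothetical: mobile_users is assumed to map each user who registered on mobile to their registration time, and edits is a list of (user, timestamp) pairs.

```python
from collections import Counter
from datetime import datetime


def month_key(ts):
    return ts.strftime("%Y-%m")


def active_in_month(mobile_users, edits, min_edits=5):
    """Question 1: for each month X, how many cohort members made
    at least `min_edits` edits during X."""
    per_user_month = Counter(
        (user, month_key(ts)) for user, ts in edits if user in mobile_users
    )
    result = Counter()
    for (user, month), n in per_user_month.items():
        if n >= min_edits:
            result[month] += 1
    return dict(result)


def registered_in_month_then_active(mobile_users, edits, min_edits=5):
    """Question 2: for each month X, how many users registered in X and
    have made at least `min_edits` edits at any time since."""
    total = Counter(
        user
        for user, ts in edits
        if user in mobile_users and ts >= mobile_users[user]
    )
    result = Counter()
    for user, registered in mobile_users.items():
        if total[user] >= min_edits:
            result[month_key(registered)] += 1
    return dict(result)
```

The first groups by the month the edits happened in, the second by the month the account was created, so the same user can land in different months or be counted by only one of the two.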

Dan





--

Kenan Wang
Product Manager, Mobile
Wikimedia Foundation

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics