Thanks Diederik -- is there a card for this specific request?


On Tue, Sep 10, 2013 at 4:40 PM, Siko Bouterse <> wrote:

Thanks for the info, Diederik. It's helpful to hear where in the pipeline this falls.

We won't hold our breath but look forward to the great epic of the future :)

On Sep 10, 2013 4:22 PM, "Diederik van Liere" <> wrote:
Hey Siko,

My initial thought is that this is part of a bigger request (epic) called 'Cohort Management' (you don't need to read that :)

Our first goal with Wikimetrics is to reach feature parity with UMAPI, so that will take precedence over this request.

The analytics team is going to plan the next epics soon and we will definitely bring Cohort Management to the table to talk about the priority. Obviously, the more stakeholders ask for this the more likely it is that we will commit to developing this.

Does that sound good?


On Tue, Sep 10, 2013 at 9:04 AM, Siko Bouterse <> wrote:
Hey guys,
When do you think we might realistically see a feature like this in Wikimetrics?  As Jonathan mentions, it is going to be something we need for most online projects - Jake's will be a good first test case in the next 2 months, but I don't see myself or most of the IEGrantees being able to get the full benefit of the tool until we can go beyond single-date buckets.  

On Wed, Sep 4, 2013 at 10:31 AM, Dario Taraborelli <> wrote:
Hi Jonathan,

great input, a few thoughts:

• I support Diederik's option 2. I'm worried not only about UX complexity but also about the impact of this change on the data model. What if we want to represent multiple intervals for the same user? Or different intervals for the same user as a function of the project? Passing these intervals as parameters when requesting a metric for individual users sounds like the best solution. Single-user time ranges via the API are an option we have in UserMetrics (I can set you up to use the private instance running on the SQL slaves, if you want to play with it). I agree we should have the same functionality ported to Wikimetrics.

• Another possible solution is to look at time series. Wikimetrics will have timeseries support at cohort level, but in principle the same breakdown by day or week or month could apply to individual responses. We don't have a request for this AFAIK (Diederik, can you confirm?). This would allow you to have single-user series even if the date range is only set at the cohort level.

• More generally, the current approach to cohorts in Wikimetrics makes them – by design – private to the cohort owner: this was a request originating from Grantmaking/Program Evaluation and it makes perfect sense in that context. However, this is different from the original idea of a treatment repository, which would allow us to control for confounds or for the joint effect of multiple treatments when running the analysis. The latter is a very common use case in Product/Editor engagement.
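To make the time-series idea from the second bullet concrete, here is a minimal sketch of a per-user breakdown by day, week, or month. It is not Wikimetrics code: the function names (`bucket_start`, `user_series`) and the input shape (a list of `(username, edit_date)` pairs) are assumptions made up for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def bucket_start(day, period):
    """Return the first day of the bucket (day, week, or month) containing `day`."""
    if period == "day":
        return day
    if period == "week":
        # Weeks are assumed to start on Monday.
        return day - timedelta(days=day.weekday())
    if period == "month":
        return day.replace(day=1)
    raise ValueError("period must be 'day', 'week' or 'month'")


def user_series(edits, period="week"):
    """Build one edit-count series per user.

    `edits` is a hypothetical iterable of (username, edit_date) pairs.
    Returns {username: Counter({bucket_start_date: edit_count})}.
    """
    series = {}
    for user, day in edits:
        series.setdefault(user, Counter())[bucket_start(day, period)] += 1
    return series


edits = [
    ("Alice", date(2013, 9, 3)),
    ("Alice", date(2013, 9, 4)),
    ("Alice", date(2013, 9, 10)),
    ("Bob", date(2013, 9, 4)),
]
weekly = user_series(edits, "week")
```

With a series like this per user, an analyst could keep only the buckets on or after each user's own treatment date, even if the report's date range is set once for the whole cohort.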


On Sep 4, 2013, at 9:38 AM, Diederik van Liere <> wrote:

On Tue, Sep 3, 2013 at 7:16 PM, Jonathan Morgan <> wrote:
Hi there analytics,

Wanted to get your input on something. Our grantee partner Jake Orlowitz (cc'd) is planning out the pilot evaluation for The Wikipedia Adventure.

We haven't hammered out the details yet, but it looks like he will be comparing editing behavior between at least 2 cohorts: editors who were invited to play TWA and who completed at least the first mission, and a control group of editors who met the same basic criteria (joined around the same time, met a minimum edit threshold). The TWA pilot will last at least a month, and new editors will be invited on a rolling basis throughout that month. We'd like to examine the editing behavior of each editor AFTER the date they were invited to TWA (or would have been, in the case of the control group). 

Currently, it looks like Wikimetrics only lets you specify a date range at the level of the cohort; that won't work for this analysis, since we want to exclude edits made before a given date, which will vary user-by-user. Could Wikimetrics be updated to allow researchers to set user-level date ranges? I'm thinking potentially this could be an optional field in the upload CSV.
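A rough sketch of what an optional per-user date field in the upload CSV could mean in practice. Everything here is assumed for illustration: the column layout (`username,project[,start_date]`), the ISO date format, and the function names are hypothetical and do not describe the actual Wikimetrics upload format.

```python
import csv
import io
from datetime import date, datetime


def parse_cohort_csv(text, default_start=None):
    """Parse rows of `username,project[,start_date]` (assumed layout).

    The third column is optional; rows without it fall back to
    `default_start`, i.e. the cohort-level date.
    """
    members = []
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        raw = row[2].strip() if len(row) > 2 else ""
        start = datetime.strptime(raw, "%Y-%m-%d").date() if raw else default_start
        members.append({
            "user": row[0].strip(),
            "project": row[1].strip() if len(row) > 1 else "",
            "start": start,
        })
    return members


def count_edits_from_start(members, edits):
    """Count each member's edits made on or after that member's own start date.

    `edits` is a hypothetical iterable of (username, edit_date) pairs.
    """
    starts = {m["user"]: m["start"] for m in members}
    counts = {user: 0 for user in starts}
    for user, day in edits:
        if user in starts and (starts[user] is None or day >= starts[user]):
            counts[user] += 1
    return counts


sample = "Alice,enwiki,2013-09-10\nBob,enwiki\n"
members = parse_cohort_csv(sample, default_start=date(2013, 9, 1))
edits = [
    ("Alice", date(2013, 9, 9)),   # before Alice's invitation: excluded
    ("Alice", date(2013, 9, 11)),
    ("Bob", date(2013, 9, 2)),
    ("Carol", date(2013, 9, 2)),   # not in the cohort: ignored
]
counts = count_edits_from_start(members, edits)
```

Keeping the date column optional means existing single-date cohorts would keep working unchanged, while rolling-invitation cohorts like TWA could supply one date per editor.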

I think this feature would be useful beyond TWA. The current setup works well for offline events, where everyone in a cohort is receiving the same "treatment" at exactly the same time. But for many online initiatives--such as volunteer-driven email and social media campaigns and editor engagement experiments like TWA, Teahouse, image and translation drives, etc.--the cohorts won't necessarily fit neatly into single-date buckets.

We have a little time to talk this through: the TWA pilot hasn't started yet, and we won't be analyzing data for at least a month and a half, but I wanted to get the conversation started.

Thanks, Jonathan!
Two possible directions come to mind:
1) As Dan suggested, add a 'treatment relative date' box to the UI when creating the report.
2) Interact through the RESTful API by submitting each individual as a separate cohort and merging the results back into a single dataset. The RESTful API does not exist yet, but we started a Mingle card: (very empty right now)
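
A minimal sketch of how option 2 might work from the caller's side. Since the RESTful API does not exist yet, the request itself is represented by a hypothetical `fetch_metric(cohort, start, end)` callable supplied by the caller; its signature, the 30-day default window, and the row layout are all assumptions for illustration.

```python
from datetime import date, timedelta


def run_per_user(treatment_dates, fetch_metric, window_days=30):
    """Request a metric once per user and merge the results into one dataset.

    `treatment_dates` maps username -> that user's treatment date.
    `fetch_metric(cohort, start, end)` is a hypothetical stand-in for the
    future API call; it is assumed to return {username: value}.
    """
    merged = []
    for user, start in sorted(treatment_dates.items()):
        end = start + timedelta(days=window_days)
        # Each individual is submitted as a one-person cohort with its own range.
        result = fetch_metric([user], start, end)
        merged.append({"user": user, "start": start, "end": end,
                       "value": result.get(user)})
    return merged


def fake_fetch(cohort, start, end):
    """Stub used only to exercise the sketch; returns the window length in days."""
    return {cohort[0]: (end - start).days}


rows = run_per_user(
    {"Bob": date(2013, 9, 5), "Alice": date(2013, 9, 1)},
    fake_fetch,
    window_days=7,
)
```

The trade-off is one request per editor instead of one per cohort, but the date logic stays entirely on the caller's side, so no UI changes would be needed.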

My gut feeling is that option 2) is going to be more future-proof than option 1). It will be harder and harder to add UI support for increasingly complex treatment scenarios, and it will inevitably make the UI feel very cramped. Buuut... I am curious to hear other people's opinions as well! So folks, please chime in on this request from Jonathan.


Jonathan T. Morgan
Learning Strategist
Wikimedia Foundation

Analytics mailing list


Siko Bouterse
Wikimedia Foundation, Inc.

Imagine a world in which every single human being can freely share in the sum of all knowledge. 
Donate or click the "edit" button today, and help us make it a reality!
