Hi Jonathan,

Great input. A few thoughts:

I support Diederik's option 2. I'm worried not only about UX complexity but also about the impact of this change on the data model. What if we want to represent multiple intervals for the same user? Or different intervals for the same user as a function of the project? Passing these intervals as parameters when requesting a metric for individual users sounds like the best solution. Single-user time ranges via the API are an option we have in UserMetrics (I can set you up to use the private instance running on the SQL slaves, if you want to play with it), and I agree we should port the same functionality to Wikimetrics.

Another possible solution is to look at time series. Wikimetrics will have time series support at the cohort level, but in principle the same breakdown by day, week, or month could apply to individual responses. We don't have a request for this AFAIK (Diederik, can you confirm?). This would allow you to have single-user series even if the date range is only set at the cohort level.
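To make the time series idea concrete, here is a minimal sketch of how a per-user series could be bucketed and then trimmed to a user-specific start date after the fact. It is not Wikimetrics code: the function names (`bucket_start`, `user_series`, `total_since`) and the input shape (a list of edit dates for one user) are assumptions made for illustration only.

```python
from collections import Counter
from datetime import date, timedelta


def bucket_start(day, granularity):
    """Return the first day of the day/week/month bucket containing `day`."""
    if granularity == "day":
        return day
    if granularity == "week":
        # Weeks start on Monday (weekday() == 0).
        return day - timedelta(days=day.weekday())
    if granularity == "month":
        return day.replace(day=1)
    raise ValueError("unknown granularity: %r" % (granularity,))


def user_series(edit_days, granularity="week"):
    """Map each bucket start date to the number of edits in that bucket."""
    return dict(Counter(bucket_start(d, granularity) for d in edit_days))


def total_since(series, start):
    """Sum the buckets that begin on or after a user-specific start date."""
    return sum(count for bucket, count in series.items() if bucket >= start)
```

Note that a bucket straddling the start date is dropped entirely by `total_since`, so only daily granularity gives an exact per-user cutoff; weekly or monthly buckets give an approximation.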

More generally, the current approach to cohorts in Wikimetrics makes them by design private to the cohort owner: this was a request originating from Grantmaking/Program Evaluation and it makes perfect sense in that context. However, this is different from the original idea of a treatment repository, which would allow us to control for confounds or for the joint effect of multiple treatments when running the analysis. The latter is a very common use case in Product/Editor engagement.


On Sep 4, 2013, at 9:38 AM, Diederik van Liere <dvanliere@wikimedia.org> wrote:

On Tue, Sep 3, 2013 at 7:16 PM, Jonathan Morgan <jmorgan@wikimedia.org> wrote:
Hi there analytics,

Wanted to get your input on something. Our grantee partner Jake Orlowitz (cc'd) is planning out the pilot evaluation for The Wikipedia Adventure.

We haven't hammered out the details yet, but it looks like he will be comparing editing behavior between at least 2 cohorts: editors who were invited to play TWA and who completed at least the first mission, and a control group of editors who met the same basic criteria (joined around the same time, met a minimum edit threshold). The TWA pilot will last at least a month, and new editors will be invited on a rolling basis throughout that month. We'd like to examine the editing behavior of each editor AFTER the date they were invited to TWA (or would have been, in the case of the control group). 

Currently, it looks like Wikimetrics only lets you specify a date range at the level of the cohort; that won't work for this analysis, since we want to exclude edits made before a given date, which will vary user-by-user. Could Wikimetrics be updated to allow researchers to set user-level date ranges? I'm thinking potentially this could be an optional field in the upload CSV.
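As a rough illustration of the optional CSV field, the sketch below reads a cohort file whose third column may hold a per-user start date and then counts only the edits made on or after that date. The column layout (user, project, date), the date format, and the function names are assumptions for the example, not the actual Wikimetrics upload format.

```python
import csv
import io
from datetime import datetime


def parse_cohort_csv(text, default_start):
    """Return {user: start}; the third column is an optional per-user start date."""
    starts = {}
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        user = row[0].strip()
        raw = row[2].strip() if len(row) > 2 else ""
        # Fall back to the cohort-level start when no per-user date is given.
        starts[user] = datetime.strptime(raw, "%Y-%m-%d") if raw else default_start
    return starts


def edits_after_start(edits, starts):
    """Count each cohort member's edits made on or after that user's start date."""
    counts = {user: 0 for user in starts}
    for user, timestamp in edits:
        if user in starts and timestamp >= starts[user]:
            counts[user] += 1
    return counts
```

Rows without a date keep the cohort-level behavior, so existing CSV files would not need to change under this scheme.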

I think this feature would be useful beyond TWA. The current setup works well for offline events, where everyone in a cohort is receiving the same "treatment" at exactly the same time. But for many online initiatives--such as volunteer-driven email and social media campaigns & editor engagement experiments like TWA, Teahouse, image and translation drives, etc.--the cohorts won't necessarily fit neatly into single-date buckets.

We have a little time to talk this through: the TWA pilot hasn't started yet, and we won't be analyzing data for at least a month and a half, but I wanted to get the conversation started.

Thanks Jonathan!
Two possible directions come to mind:
1) As Dan suggested, add a 'treatment relative date' box to the UI when creating the report.
2) Interact through the RESTful API by submitting each individual as a separate cohort and merging the results back into a single dataset. The RESTful API does not exist yet, but we started a Mingle card: https://mingle.corp.wikimedia.org/projects/analytics/cards/1103 (very empty right now)
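
Option 2 could look roughly like the sketch below: one request per user, each with that user's own date window, merged back into a single list of rows. Since the API does not exist yet, `fetch_metric` is a purely hypothetical callable standing in for whatever the API ends up exposing, and its parameters (`users`, `start`, `end`) are assumptions as well.

```python
from datetime import date, timedelta


def metric_per_user(fetch_metric, invite_dates, window_days=30):
    """Request a metric once per user with a user-specific window, then merge.

    `fetch_metric` is a hypothetical stand-in for the future API call;
    `invite_dates` maps each user name to that user's invitation date.
    """
    rows = []
    for user, invited in sorted(invite_dates.items()):
        start = invited
        end = invited + timedelta(days=window_days)
        # Each user is submitted as a single-member cohort.
        value = fetch_metric(users=[user], start=start, end=end)
        rows.append({"user": user, "start": start, "end": end, "value": value})
    return rows
```

The merged rows keep each user's window alongside the value, so the control group can be handled the same way by supplying the date on which each control user would have been invited.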

My gut feeling is that option 2) is going to be more future-proof than option 1). It will be harder and harder to add UI support for increasingly complex treatment scenarios, and it will inevitably make the UI feel very cramped. Buuut... I am curious to hear other people's opinions as well! So folks, please chime in on this request from Jonathan.


Jonathan T. Morgan
Learning Strategist
Wikimedia Foundation

Analytics mailing list
