On Tue, Sep 3, 2013 at 7:16 PM, Jonathan Morgan <jmorgan@wikimedia.org> wrote:
Hi there analytics,

Wanted to get your input on something. Our grantee partner Jake Orlowitz (cc'd) is planning out the pilot evaluation for The Wikipedia Adventure.

We haven't hammered out the details yet, but it looks like he will be comparing editing behavior between at least 2 cohorts: editors who were invited to play TWA and who completed at least the first mission, and a control group of editors who met the same basic criteria (joined around the same time, met a minimum edit threshold). The TWA pilot will last at least a month, and new editors will be invited on a rolling basis throughout that month. We'd like to examine the editing behavior of each editor AFTER the date they were invited to TWA (or would have been, in the case of the control group). 

Currently, it looks like Wikimetrics only lets you specify a date range at the level of the cohort; that won't work for this analysis, since we want to exclude edits made before a given date, which will vary user by user. Could Wikimetrics be updated to allow researchers to set user-level date ranges? I'm thinking this could potentially be an optional field in the upload CSV.
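
To make the request concrete, here is a rough sketch of what an optional per-user date column could look like. It does not reflect how Wikimetrics actually parses uploads: the two-column layout (username, optional ISO start date), the function names, and the sample data are all assumptions for illustration, and "after" is read here as "on or after" the invitation date.

```python
import csv
import io
from datetime import date


def parse_cohort(csv_text, default_start=None):
    """Map each username to a start date; the second CSV column is optional."""
    cohort = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        user = row[0].strip()
        raw = row[1].strip() if len(row) > 1 else ""
        # Fall back to the cohort-level date when no user-level date is given.
        cohort[user] = date.fromisoformat(raw) if raw else default_start
    return cohort


def edits_after_invite(edits, cohort):
    """Count edits made on or after each user's own start date."""
    counts = {user: 0 for user in cohort}
    for user, edit_date in edits:
        if user not in cohort:
            continue
        start = cohort[user]
        if start is None or edit_date >= start:
            counts[user] += 1
    return counts


# Hypothetical upload: Alice and Bob have their own dates, Carol uses the default.
cohort = parse_cohort(
    "Alice,2013-09-05\nBob,2013-09-12\nCarol\n",
    default_start=date(2013, 9, 1),
)
edits = [
    ("Alice", date(2013, 9, 4)),   # before Alice's invite, excluded
    ("Alice", date(2013, 9, 6)),
    ("Bob", date(2013, 9, 12)),
    ("Carol", date(2013, 8, 30)),  # before the cohort default, excluded
]
counts = edits_after_invite(edits, cohort)
```

Leaving the date column optional would keep existing single-date uploads working unchanged.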

I think this feature would be useful beyond TWA. The current setup works well for offline events, where everyone in a cohort is receiving the same "treatment" at exactly the same time. But for many online initiatives--such as volunteer-driven email and social media campaigns and editor engagement experiments like TWA, the Teahouse, image and translation drives, etc.--the cohorts won't necessarily fit neatly into single-date buckets.

We have a little time to talk this through: the TWA pilot hasn't started yet, and we won't be analyzing data for at least a month and a half, but I wanted to get the conversation started.
Thanks Jonathan! 
Two possible directions come to mind:
1) As Dan suggested, add a 'treatment relative date' box to the UI when creating the report.
2) Interact through the RESTful API by submitting each individual as a separate cohort and merging the results back into a single dataset. The RESTful API does not exist yet, but we started a Mingle card: https://mingle.corp.wikimedia.org/projects/analytics/cards/1103 (very empty right now).
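
Option 2) could look roughly like the sketch below. Since the API does not exist yet, nothing here is a real Wikimetrics call: `run_report` is a hypothetical stand-in for whatever the client would expose, and `fake_report` only returns made-up numbers so the example runs on its own.

```python
from datetime import date


def run_per_user(cohort, run_report, end):
    """Run one single-user report per person, then merge into one dataset."""
    rows = []
    for user, start in sorted(cohort.items()):
        # Each individual is submitted as their own one-member cohort,
        # with a date range starting at that user's treatment date.
        result = run_report(users=[user], start=start, end=end)
        rows.append({"user": user, "start": start.isoformat(), **result[user]})
    return rows


def fake_report(users, start, end):
    """Hypothetical stand-in for the future API; returns placeholder metrics."""
    return {user: {"edits": (end - start).days} for user in users}


cohort = {"Alice": date(2013, 9, 5), "Bob": date(2013, 9, 12)}
rows = run_per_user(cohort, fake_report, end=date(2013, 10, 1))
```

The merge step stays on the researcher's side, so the reporting UI would not need to know anything about per-user treatment dates.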

My gut feeling is that option 2) is going to be more future-proof than option 1). It will get harder and harder to add UI support for increasingly complex treatment scenarios, and doing so will inevitably make the UI feel very cramped. Buuut... I am curious to hear other people's opinions as well! So folks, please chime in on this request from Jonathan.


Jonathan T. Morgan
Learning Strategist
Wikimedia Foundation

Analytics mailing list