When do you think we might realistically see a feature like this in
Wikimetrics? As Jonathan mentions, it is going to be something we need for
most online projects - Jake's will be a good first test case in the next 2
months, but I don't see myself or most of the IEGrantees being able to get
the full benefit of the tool until we can go beyond single-date buckets.
On Wed, Sep 4, 2013 at 10:31 AM, Dario Taraborelli <
great input, a few thoughts:
• I support Diederik's option 2. I'm worried not only about UX complexity but
also about the impact of this change on the data model. What if we want to
represent multiple intervals for the same user? Or different intervals for
the same user as a function of the project? Passing these intervals as
parameters when requesting a metric for individual users sounds like the
best solution. Single-user time ranges via the
an option we have in UserMetrics (I can set you up to use the private
instance running on the SQL slaves, if you want to play with it), and I agree
we should port the same functionality to Wikimetrics.
• Another possible solution is to look at time series. Wikimetrics will
have timeseries support at cohort level, but in principle the same
breakdown by day or week or month could apply to individual responses. We
don't have a request for this AFAIK (Diederik, can you confirm?). This
would allow you to have single-user series even if the date range is only
set at the cohort level.
• More generally, the current approach to cohorts in Wikimetrics makes
them – by design – private to the cohort owner: this was a request
originating from Grantmaking/Program Evaluation and it makes perfect sense
in that context. However, this is different from the original idea of a
treatment repository, which would allow us to control for confounds or for
the joint effect of multiple treatments when running the analysis. The
latter is a very common use case in Product/Editor engagement.
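To make the time-series point above concrete, here is a minimal sketch of a per-user weekly breakdown. It is not Wikimetrics code: the `weekly_series` function and the `(username, edit_date)` input format are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_series(edits):
    """Group (username, edit_date) pairs into per-user weekly edit counts.

    Each week is keyed by the date of its Monday. This is a hypothetical
    helper for illustration, not an existing Wikimetrics function.
    """
    series = defaultdict(lambda: defaultdict(int))
    for user, day in edits:
        week_start = day - timedelta(days=day.weekday())  # Monday of that week
        series[user][week_start] += 1
    # Convert to plain dicts with weeks in chronological order.
    return {user: dict(sorted(weeks.items())) for user, weeks in series.items()}


# Made-up sample data.
edits = [
    ("Alice", date(2013, 9, 2)),
    ("Alice", date(2013, 9, 4)),
    ("Alice", date(2013, 9, 10)),
    ("Bob", date(2013, 9, 11)),
]
series = weekly_series(edits)
```

With a series like this, a researcher could discard the weeks before each user's own treatment date after the fact, even though the date range was only set at the cohort level.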
On Sep 4, 2013, at 9:38 AM, Diederik van Liere <dvanliere(a)wikimedia.org>
On Tue, Sep 3, 2013 at 7:16 PM, Jonathan Morgan <jmorgan(a)wikimedia.org> wrote:
Hi there analytics,
Wanted to get your input on something. Our grantee partner Jake Orlowitz
(cc'd) is planning out the pilot evaluation for The Wikipedia Adventure.
We haven't hammered out the details yet, but it looks like he will be
comparing editing behavior between at least 2 cohorts: editors who were
invited to play TWA and who completed at least the first mission, and a
control group of editors who met the same basic criteria (joined around the
same time, met a minimum edit threshold). The TWA pilot will last at least
a month, and new editors will be invited on a rolling basis throughout that
month. We'd like to examine the editing behavior of each editor AFTER the
date they were invited to TWA (or would have been, in the case of the
control group).
Currently, it looks like Wikimetrics only lets you specify a date range
at the level of the cohort; that won't work for this analysis, since we
want to exclude edits made before a given date, which will vary
user-by-user. Could Wikimetrics be updated to allow researchers to set
user-level date ranges? I'm thinking potentially this could be an optional
field in the upload CSV.
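As a rough illustration of the optional CSV field, here is a minimal sketch. The two-column layout (username, optional start date in YYYY-MM-DD form) and both function names are assumptions for the sake of the example, not the actual Wikimetrics upload format or code.

```python
import csv
import io
from datetime import date, datetime


def parse_cohort_csv(text, default_start):
    """Return {username: start_date} from a cohort CSV.

    Assumed layout: username in column 1, optional start date in column 2.
    Rows without a date fall back to the cohort-level default_start.
    """
    starts = {}
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        user = row[0].strip()
        raw = row[1].strip() if len(row) > 1 else ""
        if raw:
            starts[user] = datetime.strptime(raw, "%Y-%m-%d").date()
        else:
            starts[user] = default_start
    return starts


def count_edits_after(edits, starts):
    """Count each user's edits made on or after that user's own start date."""
    counts = {user: 0 for user in starts}
    for user, day in edits:
        if user in starts and day >= starts[user]:
            counts[user] += 1
    return counts


# Made-up sample data: Carol has no date and uses the cohort default.
cohort_csv = "Alice,2013-09-10\nBob,2013-09-20\nCarol\n"
starts = parse_cohort_csv(cohort_csv, default_start=date(2013, 9, 1))
edits = [
    ("Alice", date(2013, 9, 5)),   # before Alice's invitation, excluded
    ("Alice", date(2013, 9, 12)),
    ("Bob", date(2013, 9, 1)),     # before Bob's invitation, excluded
    ("Bob", date(2013, 9, 25)),
    ("Carol", date(2013, 9, 2)),
]
counts = count_edits_after(edits, starts)
```

Keeping the date column optional would mean existing single-date cohorts still upload unchanged.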
I think this feature would be useful beyond TWA. The current setup works
well for offline events, where everyone in a cohort is receiving the same
"treatment" at exactly the same time. But for many online initiatives--such
as volunteer-driven email and social media campaigns & editor engagement
experiments like TWA, Teahouse, image and translation drives, etc.--the
cohorts won't necessarily fit neatly into single-date buckets.
We have a little time to talk this through: the TWA pilot hasn't started
yet, and we won't be analyzing data for *at least* a month and a half,
but I wanted to get the conversation started.
Two possible directions come to mind:
1) As Dan suggested, add a 'treatment relative date' box to the UI when
creating the report.
2) Interact through the REST-ful API by submitting each individual as a
separate cohort and merge the results back into a single dataset. The
REST-ful API does not exist yet but we started a Mingle card:
empty right now)
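Option 2 could look roughly like the sketch below. Since the REST-ful API does not exist yet, `fetch_metric` is a purely hypothetical local stand-in for a cohort-level metric request; its name, parameters, and return shape are assumptions, not a proposed interface.

```python
from datetime import date


def fetch_metric(users, start, end, edit_log):
    """Hypothetical stand-in for one cohort-level request.

    Returns {username: number of edits between start and end, inclusive}.
    A real client would call the (not yet existing) API here instead.
    """
    return {
        user: sum(1 for who, day in edit_log if who == user and start <= day <= end)
        for user in users
    }


def per_user_report(windows, edit_log):
    """Submit each user as a one-person cohort and merge the results."""
    merged = {}
    for user, (start, end) in windows.items():
        merged.update(fetch_metric([user], start, end, edit_log))
    return merged


# Made-up sample data: each user gets their own date window.
edit_log = [
    ("Alice", date(2013, 9, 5)),
    ("Alice", date(2013, 9, 12)),
    ("Bob", date(2013, 9, 25)),
    ("Bob", date(2013, 10, 30)),
]
windows = {
    "Alice": (date(2013, 9, 10), date(2013, 10, 10)),
    "Bob": (date(2013, 9, 20), date(2013, 10, 20)),
}
report = per_user_report(windows, edit_log)
```

The cost of this approach is one request per user, so a large cohort would mean many small requests.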
My gut feeling is that option 2) is going to be more future proof than
option 1). It will be harder and harder to add UI support for increasingly
complex treatment scenarios and it will inevitably make the UI feel very
cramped. Buuut... I am curious to hear other people's opinions as well! So
folks, please chime in on this request from Jonathan.
Jonathan T. Morgan
Analytics mailing list
Wikimedia Foundation, Inc.