I would LOVE it if the output gave user names instead of user IDs. Often
the data makes me want to investigate the individual stories of
contributors who added a lot of content/made a lot of edits/etc., but
there's no way of doing that with user IDs since I can't convert user IDs
to usernames.
On Tue, Nov 26, 2013 at 2:46 PM, Dario Taraborelli <
dtaraborelli(a)wikimedia.org> wrote:
thanks for the clarification Jaimee – it sounds like we should consider
adding user_names to the output if this is the main cause of the problem
instead of building functionality at the input to deal with this. Dan, any
thoughts?
BTW this notion of rerunning cohort analysis for members of a previous
cohort who meet specific criteria is a use case that Product/Editor
Engagement is also interested in. We used to call these “generated cohorts”
in the old design plans for UserMetrics and I’d love it if we revisited this
feature request and its relative priority.
D
On Nov 26, 2013, at 2:35 PM, Jaime Anstee <janstee(a)wikimedia.org> wrote:
Missed the question back to me, sorry. Mixed cohorts might occur because
the output is user IDs while collection is of usernames. Say someone runs
repeating events and has a CSV output of data for the new users who were
retained at a certain activity level from Point A to Point B. New cohort
members then opt in at Point B, and for examination at a later Point C
they want to include only those who survived from Point A plus the members
new at Point B. Without usernames in the output to create the active
Point B cohort separately, the Point C cohort would be a mix of qualified
user IDs and new usernames. There are several ways of dealing with this;
it was just the first scenario I could think of that could cause it. It
seems we still need to revisit the possibility of getting usernames as
output, also for matching to other data points, since most users and
most program leaders do not know user IDs.
Jaime
--
Jaime Anstee, Ph.D
Program Evaluation Specialist
Wikimedia Foundation
+1.415.839.6885 ext 6869
www.wikimediafoundation.org
Imagine a world in which every single human being can freely share in the
sum of all knowledge. Help us make it a reality!
https://donate.wikimedia.org
On Fri, Nov 22, 2013 at 4:04 PM, Dario Taraborelli <
dtaraborelli(a)wikimedia.org> wrote:
that works for me, thanks!
Jaimee – can you give us more details on the use case for mixed cohorts
that you had in mind?
On Nov 22, 2013, at 3:28 PM, Dan Andreescu <dandreescu(a)wikimedia.org>
wrote:
So, for now, until I figure out how to fix this, it will always prefer
user_names before user_ids.
I think this is an argument for making users specify whether it's
names or ids up front, and not allowing mixtures. Assuming it might be a
mixture and looking for names first is almost certain to produce inaccurate
results at some point. We have ids precisely to avoid collisions with
names, allowing for renaming users, and other cases.
Yep, I just learned this the hard way and made a fool of myself in front
of a bunch of people I admire. So, I'd be glad if I'm the only one that
this happens to. If nobody objects, I'm going to allow the user to select
whether their cohort contains user_ids OR user_names, and strictly prohibit
mixtures.
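
For illustration only, here is a minimal sketch of what "declare the kind
up front and prohibit mixtures" could look like. The names CohortKind and
parse_cohort are hypothetical and are not taken from the Wikimetrics
code; the sketch assumes a cohort arrives as a list of raw text lines.

```python
from enum import Enum


class CohortKind(Enum):
    """Hypothetical flag the uploader picks before submitting a cohort."""
    USER_IDS = "user_ids"
    USER_NAMES = "user_names"


def parse_cohort(lines, kind):
    """Parse raw cohort lines under ONE declared kind; never guess per line."""
    # Drop blank lines and surrounding whitespace.
    entries = [line.strip() for line in lines if line.strip()]

    if kind is CohortKind.USER_IDS:
        # Every entry must be a plain ASCII integer, otherwise reject the
        # whole upload instead of silently treating some lines as names.
        bad = [e for e in entries if not (e.isascii() and e.isdigit())]
        if bad:
            raise ValueError(f"not valid user_ids: {bad}")
        return [int(e) for e in entries]

    # Declared as names: keep every entry as a string, even all-digit ones,
    # so a username like "12345" is never confused with user_id 12345.
    return entries
```

With this shape, an all-digit entry is only ever interpreted one way, which
is the collision the names-first lookup cannot rule out.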
_______________________________________________
Wikimetrics mailing list
Wikimetrics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimetrics
--
LiAnna Davis
Wikipedia Education Program Communications Manager
Wikimedia Foundation