Maarten,

the use case of "generated cohorts" you refer to (cohorts created incrementally by running queries against the DB) is a high priority for data analysis at WMF too. We need to have automatically generated/updated cohorts for classes of users like "all mobile registrations", "commons uploaders", "VE first-time editors" etc.  

In UserMetrics we used to call these cohorts "dynamic cohorts" (as opposed to fixed-membership or "static cohorts"). We haven't sorted out yet how these are going to be implemented in Wikimetrics, but this is definitely on our radar. A related idea that we're currently discussing is to expose Wikimetrics functionality via APIs that would allow authenticated clients to programmatically generate reports for arbitrarily defined lists of users.

Dario

On Sep 7, 2013, at 8:51 AM, Maarten Dammers <maarten@mdammers.nl> wrote:

Hi Diederik,

On 7-9-2013 15:14, Diederik van Liere wrote:
Hey Maarten,
thanks for the feedback! replies inside.
You're welcome. I hope this helps to improve the tooling.

On Sat, Sep 7, 2013 at 8:14 AM, Maarten Dammers <maarten@mdammers.nl> wrote:
Hi everyone,

Tried Wikimetrics today and it looks like a good start to me. Some feedback:
* Google/Twitter account, should be something WMF like the labs/Gerrit LDAP
Yes, we will migrate to MediaWiki OAuth once it's stable.
* Should use https by default
Agree 
* O wait, invalid certificate, filed bug at https://bugzilla.wikimedia.org/show_bug.cgi?id=53892
Yes, because WMF has not yet figured out a policy for SSL certificates in Labs.
Where is the ball in this case?

* Only English? It should be multilingual like all our software. The people at translatewiki will be happy to translate for you
Yes, but we would rather wait with that until we have reached a stable version 1.0. We are tracking it at https://mingle.corp.wikimedia.org/projects/analytics/cards/1143
This is the reply I've seen with many many projects before you. Always something like "yes, we're going to add it, but first.....". Just add it and don't worry about maybe people doing one or two extra translations because you changed some part of the code.
* Uploading csv user lists is not very convenient. Are you planning to come up with an easier/better system?
What is not convenient?
I have to compile a file in some ancient, ill-documented format. What I would like to have: automatically generated cohorts. Let's take me on the English Wikipedia (https://en.wikipedia.org/wiki/User:Multichill).
* I want to have a cohort for all users who are in a certain category (for example https://en.wikipedia.org/wiki/Category:Wikimedia_Commons_administrators)
* I want to have a cohort for all users who use a certain template (for example https://en.wikipedia.org/w/index.php?title=Special%3AWhatLinksHere&target=Template%3AUser+SUL&namespace=2)
* I want to have a cohort for all users who are linked from a certain page (for example https://en.wikipedia.org/wiki/Wikipedia:WikiProject_National_Register_of_Historic_Places#Members)
* I want to have a cohort for all users who have edited a certain page (for example https://en.wikipedia.org/w/index.php?title=History_of_Arsenal_F.C._%281886%E2%80%931966%29&action=history)
These are simple queries, you could probably make up some more. You could also go more complex by combining aspects, for example users who edited a page in a certain category. These queries might explode.
From a privacy point of view I might not even see what users are part of the cohort, just the end results.
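
The simple cohort queries listed above could be sketched against the MediaWiki web API roughly as follows. This is only an illustration, not how Wikimetrics works: the function names `cohort_query`, `request_url` and `user_from_title` are hypothetical, and the sketch assumes the standard `categorymembers`, `embeddedin` and `revisions` query modules, restricted to the User namespace (id 2) where that applies. It only builds the requests and does not send them.

```python
from urllib.parse import urlencode

# Assumption: the English Wikipedia API endpoint, as in the examples above.
API = "https://en.wikipedia.org/w/api.php"


def cohort_query(kind, title, limit=500):
    """Return API parameters for one of the simple cohort definitions."""
    params = {"action": "query", "format": "json"}
    if kind == "category":
        # User pages that are members of a category.
        params.update(list="categorymembers", cmtitle=title,
                      cmnamespace=2, cmlimit=limit)
    elif kind == "template":
        # User pages that transclude a template.
        params.update(list="embeddedin", eititle=title,
                      einamespace=2, eilimit=limit)
    elif kind == "editors":
        # Users who made revisions to a page.
        params.update(prop="revisions", titles=title,
                      rvprop="user", rvlimit=limit)
    else:
        raise ValueError("unknown cohort kind: %r" % kind)
    return params


def request_url(kind, title):
    """Full request URL for a cohort definition (not sent anywhere here)."""
    return API + "?" + urlencode(cohort_query(kind, title))


def user_from_title(page_title):
    """Map a user page title (or subpage) to the user name, else None."""
    prefix, sep, rest = page_title.partition(":")
    if prefix != "User" or not sep:
        return None
    return rest.split("/", 1)[0]
```

For example, `request_url("category", "Category:Wikimedia Commons administrators")` gives the request for the category case, and `user_from_title` turns the returned page titles into user names. Results longer than the limit would still need continuation handling, which is left out here.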

* Project "en" is a bit weird. You're probably using <project>wiki_p for the database. Can you add a link to available projects? Or how to construct it? Say for example I want the German Wikivoyage.
Yes, some explanation on how to construct it would be useful.
Please expand at https://www.mediawiki.org/wiki/Wikimetrics/FAQ#What_is_the_project_code.3F
* Description seems to be missing for some fields at http://metrics.wmflabs.org/metrics/
Which fields in particular?
The fields where description is empty? So that's Start Date, End Date, Positive Only Sum, Negative Only Sum, Absolute Sum and Net Sum
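
On the project code question above: a minimal sketch of how a database name might be constructed, assuming the usual Wikimedia naming convention (language code plus a family suffix, with `_p` appended for the public replica databases). The helper names and the family table are hypothetical and incomplete, and whether Wikimetrics accepts these codes in its project field is not confirmed in this thread.

```python
# Assumption: language code + family suffix, e.g. "de" + "wikivoyage".
# Only a few families are listed; special wikis such as Commons are not covered.
FAMILY_SUFFIX = {
    "wikipedia": "wiki",
    "wikivoyage": "wikivoyage",
    "wiktionary": "wiktionary",
    "wikibooks": "wikibooks",
    "wikisource": "wikisource",
}


def project_code(language, family="wikipedia"):
    """Build a database-style project code such as 'dewikivoyage'."""
    # Hyphens in language codes become underscores in database names.
    return language.replace("-", "_") + FAMILY_SUFFIX[family]


def replica_db(language, family="wikipedia"):
    """Name of the public replica database, e.g. 'enwiki_p'."""
    return project_code(language, family) + "_p"
```

With this convention the German Wikivoyage would be `replica_db("de", "wikivoyage")`, i.e. `dewikivoyage_p`.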

* You could probably grab namespaces on the fly from the MediaWiki API
Sure, but why? We do have a validation step that should verify whether the report you want to run is valid.
Because most people wouldn't know the namespace numbers and would have to look them up. Do you know the id of the campaign namespace on Commons?
* Can you add an option to give output per time period (month would be nice)?
We are working on roll-up of results, should be released shortly. 
* Can you add bytes uploaded as a metric?
 Created https://mingle.corp.wikimedia.org/projects/analytics/cards/1141; I can't promise this because there are more urgent metrics that we would like to implement first.
* Can you split out the result per namespace?
* http://metrics.wmflabs.org/support contains a link to the empty page http://www.mediawiki.org/wiki/Wikimetrics/FAQ . Can you make that link https by default?
Sure, and please help us in creating the FAQ :)
Created the page.
* Where is the code? Can we submit new metrics? See for example http://toolserver.org/~reports/?wiki=nl.wikipedia.org for a similar service
* Are you planning to offer some visual output besides csv/json? See for example https://toolserver.org/~emijrp/wlm/stats.php
I would love to see integration with Limn but we have not yet made commitments to do so. The output format should be general enough to make it easy to use for visualizations.
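
On the earlier point about grabbing namespaces from the MediaWiki API: a minimal sketch of turning a `meta=siteinfo` / `siprop=namespaces` response into a name-to-id lookup, so users could pick a namespace by name. The function names are hypothetical, and `SAMPLE` is a hand-written, heavily trimmed payload for illustration, not real output from any wiki.

```python
def siteinfo_params():
    """Parameters for asking a wiki which namespaces it has."""
    return {"action": "query", "meta": "siteinfo",
            "siprop": "namespaces", "format": "json"}


def namespace_ids(siteinfo):
    """Map namespace names to ids from a decoded siteinfo response."""
    result = {}
    for ns in siteinfo["query"]["namespaces"].values():
        # The local name is under "*" in the older JSON format and
        # under "name" in the newer one; the main namespace has no name.
        name = ns.get("name", ns.get("*", ""))
        result[name or "(main)"] = ns["id"]
    return result


# Hand-written sample in the shape of the older JSON format (assumption).
SAMPLE = {
    "query": {
        "namespaces": {
            "0": {"id": 0, "*": ""},
            "2": {"id": 2, "*": "User", "canonical": "User"},
            "6": {"id": 6, "*": "File", "canonical": "File"},
        }
    }
}
```

Here `namespace_ids(SAMPLE)["User"]` is 2. A report form could fill its namespace selector from such a lookup instead of asking for raw numbers.
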

* I see you have sql queries. What tables are available? All (non-private) tables like on the Toolserver and Toollabs?
We are querying the labsdb databases, so all tables in those databases are available. 
* Do you have some metrics on the usage of wikimetrics? :-)
Not yet :(
Ooooh, the shame ;-)

On 7-9-2013 16:57, Dan Andreescu wrote:
For anyone interested in contributing code, chat me up over email or in IRC and we can get started.  I've made a start at writing up a tutorial here: https://github.com/wikimedia/analytics-wikimetrics#development-environment, let me know how I can improve it or just submit a patch :)
+1 on making documentation, -2 for the location. Documentation should *always* be on https://www.mediawiki.org/ .

Maarten

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics