On 08/16/2013 04:12 AM, Alvaro del Castillo wrote:
Dear all,
On Mon, 12-08-2013 at 19:21 +0200, Quim Gil wrote:
# Who is contributing merged code each quarter? More origins == Better.
## WMF / WMDE / other Wikimedia / companies / OSS projects / independents
## Location of contributors (based on provided data)
# Time to resolve Gerrit changes. Shorter == Better.
## Authored by WMF/WMDE employees vs independent. Should be the same.
## Merged vs rejected. Similar progress?
## Best projects vs bottlenecks.
# Queue of open Gerrit change requests in relation to total amount of contributions. Shorter == Better.
## Same points as above.
# New contributors with 1 / 2-5 / 6+ changes submitted in the past 3 months. More == Better.
## % of merged / rejected.
## Who is growing / stable / vanishing?
# Age of contributors since they started in the project. Small and regular decline == Better.
## Number of contributions from each age group in the past quarters.
## WMF / non-WMF ratio for each age group.
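To make the "1 / 2-5 / 6+" grouping concrete, here is a minimal sketch of how it could be computed. It does not reflect how korma or Gerrit actually expose data: the `(author, submitted_on)` record shape, the function name and the definition of "new" (first change ever falls inside the window) are all assumptions for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def bucket_new_contributors(changes, today, window_days=90):
    """Group new contributors by number of changes submitted in the window.

    `changes` is an iterable of (author, submitted_on) tuples; this record
    shape is hypothetical. A contributor counts as "new" here if their
    first change ever falls inside the window (an assumption).
    """
    cutoff = today - timedelta(days=window_days)
    first_seen = {}
    recent_counts = Counter()
    for author, submitted_on in changes:
        # Track the earliest change per author.
        if author not in first_seen or submitted_on < first_seen[author]:
            first_seen[author] = submitted_on
        # Count only changes submitted inside the window.
        if submitted_on >= cutoff:
            recent_counts[author] += 1

    buckets = {"1": 0, "2-5": 0, "6+": 0}
    for author, first in first_seen.items():
        if first < cutoff:
            continue  # already active before the window: not a newcomer
        n = recent_counts[author]
        if n == 1:
            buckets["1"] += 1
        elif n <= 5:
            buckets["2-5"] += 1
        else:
            buckets["6+"] += 1
    return buckets
```

Calling it once per quarter and comparing the three counts over time would show whether each group is growing, stable or vanishing.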
Quim, thinking about those metrics, they are all related to Gerrit and centered around people more than activity.
Yes, after considering different possibilities I decided to focus on code contributions.
And yes, there is a lot of stress put on people. This is a community project and I personally think that the key metrics should focus on the population of this community. Having more & better contributors should eventually lead to having more & better software. The WMF can directly influence the amount of code produced by hiring more developers, but growing the community with volunteers is trickier, and this is why it is worth looking at the related data closely.
Then again, it is also true that it would be useful to know e.g. LOC contributed in the past quarter/year by the WMF, independents and other orgs. But maybe this could be integrated into "Who is contributing merged code each quarter"?
Any thoughts about the other data sources?
Outside of code contributions, the only candidate to become a KPI that came to mind was the effectiveness in responding to Bugzilla reports. Maybe you are right that the priority of this one is higher than the age of code contributors. Thoughts?
From git, the commits metric is a bit rough, but combined with SLOC and files it is a good indicator of development activity. The trends for these metrics (week, month and year) are also a good indicator of how the project is evolving. For example, the indicators for the last 365 days show that, in general, activity is growing in Wikimedia projects:
http://korma.wmflabs.org/browser/scm.html 63% in commits, 159% in authors and 103% in files.
Looking at these metrics, why are commits growing less than authors? Thinking about the metrics, their trends and their relations leads to pretty interesting questions for understanding in depth how Wikimedia development is going.
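One way to put a number on that question is to look at commits per author. This is only a back-of-the-envelope sketch, and it assumes the korma figures are growth rates over the same 365-day period (63% more commits, 159% more authors); the function name is made up for illustration.

```python
def relative_change_of_ratio(growth_numerator, growth_denominator):
    """Relative change of a ratio when both of its terms grow.

    Growth rates are given as fractions, e.g. 0.63 for +63%.
    """
    return (1 + growth_numerator) / (1 + growth_denominator) - 1


commits_growth = 0.63   # +63% commits (figure quoted above)
authors_growth = 1.59   # +159% authors (figure quoted above)

change = relative_change_of_ratio(commits_growth, authors_growth)
print(f"commits per author: {change:.0%}")  # prints "commits per author: -37%"
```

Under that reading, the average author commits roughly a third less than a year before, which would fit a wave of newcomers who each contribute only a few changes.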
Yes, but we have to keep the initial list of KPIs short. Data about activity growth is good but it is slightly visible from the surface already. The KPIs proposed are less evident, more underground. They should report the health of our project by sensing its foundations.
Looking at ITS, one metric that I think we are missing is *pending* issues (issues opened but not yet closed). And this concept could be applied to other data sources: the pending work to be done.
Trends and pending work are two areas in which we can find interesting metrics that could evolve into KPIs.
In ITS we also have the evolution of the time to close for 99%, 95%, 50% and 25% of tickets. "Time to" is always a good SLA (Service Level Agreement) indicator and, in the end, SLA is a strategic position for the community.
Yes, see my comment about effectiveness in responding to Bugzilla feedback. The pending work in Gerrit is already reflected in the KPIs proposed.
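A minimal sketch of how such "time to close" figures could be derived from a list of per-ticket durations. The nearest-rank percentile method and the function names are assumptions for illustration; the thread does not say how korma computes them.

```python
def nearest_rank_percentile(sorted_values, pct):
    """Smallest value such that at least pct% of values are <= it.

    `sorted_values` must be sorted ascending; `pct` is an integer 1..100.
    """
    if not sorted_values:
        raise ValueError("no values to summarise")
    n = len(sorted_values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * n + 99) // 100)
    return sorted_values[rank - 1]


def time_to_close_summary(days_to_close, percentiles=(25, 50, 95, 99)):
    """Map each percentile to the time within which that share of tickets closed."""
    ordered = sorted(days_to_close)
    return {p: nearest_rank_percentile(ordered, p) for p in percentiles}
```

For example, `time_to_close_summary(durations)[95]` gives the number of days within which 95% of tickets were closed; tracking it per quarter shows how the figure evolves.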
In MLS, with the current metrics, we can see that around 50% of messages generate a thread (get some response), and, for example, an interesting metric is that each person sends 20 messages on average (the ratio between total messages and senders). So in general, mailing lists get feedback and people participate with several messages. Here the "Time to" shows that 95% of messages are attended to in less than 20 days and that the time to attention has grown in the last year. Is the care of the mailing lists going down?
In demographics we can see pretty good information about the community of each data source. Mailing list members are also going down (are mailing lists used less? where is the support of the projects moving to?),
Data about mailing lists, IRC and even wikis is good to have but I'm not sure they are key. These channels support the real deal: code repositories (and Bugzilla too). For instance, a % of the discussions that happen now on Gerrit probably used to take place on wikitech-l.
New SCM members are also going down, but ITS members are growing. Are the software products starting to mature, with more work centered around quality?
http://korma.wmflabs.org/browser/demographics.html
That peak in SCM birth 12-15 months ago might be related to the move from SVN to Git + a peak of WMF hiring. The numbers in the past year more than double any number of newcomers before.
The usage of Bugzilla has been clearly growing in the past year, both in terms of new and regular users. Probably one of the causes is that we are releasing more code, sooner and more often, and our dev teams and dedicated bugmaster are succeeding at selling Bugzilla as the main feedback channel.
But we are digressing. :) Let's go back to the 5-6 KPIs that we really want to see. Any new proposal should include the KPI that it would displace.
Please have your say. I will be updating http://www.mediawiki.org/wiki/Community_metrics#Key_metrics following the discussion.