你好, 近排點呀? (Hello, how have you been lately?) [0]
At Wikimania there was an improvised lunch-meeting about community metrics with Jesús González-Barahona (Bitergia), Sumana, RobLa and myself. The main conclusion was that http://korma.wmflabs.org needs to show a very few key metrics that could drive decisions affecting our strategy and resources.
Let's agree first on key factors to watch, in the scope of projects deployed in Wikimedia servers:
* Are the teams more efficient at processing contributions?
* Is the share of non-WMF contributions growing?
* Are WMF and non-WMF contributions treated equally?
* Are the attraction and retention of new contributors improving?
* Are we improving the sustainability of our community?
If those are the key factors, these could be the key metrics:
# Who is contributing merged code each quarter? More origins == Better.
## WMF / WMDE / other Wikimedia / companies / OSS projects / independents
## Location of contributors (based on provided data)
# Time to resolve Gerrit changes. Shorter == Better.
## Authored by WMF/WMDE employees vs independent. Should be the same.
## Merged vs rejected. Similar progress?
## Best projects vs bottlenecks.
# Queue of open Gerrit change requests in relation to total amount of contributions. Shorter == Better.
## Same points as above.
# New contributors with 1 / 2-5 / 6+ changes submitted in the past 3 months. More == Better.
## % of merged / rejected.
## Who is growing / stable / vanishing?
# Age of contributors since they started in the project. Small and regular decline == Better.
## Number of contributions from each age group in the past quarters.
## WMF / non-WMF ratio for each age group.
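As an illustration of how metric 2 (time to resolve Gerrit changes) could be computed, here is a minimal sketch in Python. It assumes Gerrit's REST API as served under gerrit.wikimedia.org/r; the /changes/ endpoint, the query syntax and the ChangeInfo fields follow Gerrit's documentation, but treat the exact details as assumptions rather than a spec:

import json
import urllib.request
from datetime import datetime

GERRIT = 'https://gerrit.wikimedia.org/r'

def fetch_changes(query, limit=500):
    url = '%s/changes/?q=%s&n=%d' % (GERRIT, query, limit)
    raw = urllib.request.urlopen(url).read().decode('utf-8')
    # Gerrit prefixes JSON responses with ")]}'" to defeat XSSI;
    # skip that first line before parsing.
    return json.loads(raw[raw.index('\n'):])

def parse_ts(ts):
    # Gerrit timestamps look like '2013-08-12 19:21:07.000000000'.
    return datetime.strptime(ts[:19], '%Y-%m-%d %H:%M:%S')

def median_days_open(changes):
    # For a merged change the last update is normally the merge itself,
    # so (updated - created) approximates time to merge.
    days = sorted((parse_ts(c['updated']) - parse_ts(c['created'])).days
                  for c in changes)
    return days[len(days) // 2] if days else None

# 'status:merged -age:90d' = merged changes touched in the last 90 days.
merged = fetch_changes('status:merged+-age:90d')
print('Median days to merge:', median_days_open(merged))

The same change records carry author and project fields, so the other breakdowns (merged vs rejected, per project, WMF vs non-WMF) would be variations on the same loop.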
Please have your say. I will be updating http://www.mediawiki.org/wiki/Community_metrics#Key_metrics following the discussion.
[0] http://wikitravel.org/en/Cantonese_phrasebook#Phrase_list ;)
On Tue, Aug 13, 2013 at 1:21 AM, Quim Gil qgil@wikimedia.org wrote:
你好, 近排點呀? [0]
[0] http://wikitravel.org/en/Cantonese_phrasebook#Phrase_list ;)
*glares at OP*
https://en.wikivoyage.org/wiki/Cantonese_phrasebook#Phrase_list
Hi Quim!
Looks reasonable. I would also differentiate Wikimedia chapters and other third parties. ----- Yury Katkov, WikiVote
On Mon, Aug 12, 2013 at 9:21 PM, Quim Gil qgil@wikimedia.org wrote:
你好, 近排點呀? [0]
At Wikimania there was an improvised lunch-meeting about community metrics with Jesús González-Barahona (Bitergia), Sumana, RobLa and myself. The main conclusion was that http://korma.wmflabs.org needs to show a very few key metrics that could drive decisions affecting our strategy and resources.
Let's agree first on key factors to watch, in the scope of projects deployed in Wikimedia servers:
- Are the teams more efficient at processing contributions?
- Is the share of non-WMF contributions growing?
- Are WMF and non-WMF contributions treated equally?
- Are the attraction and retention of new contributors improving?
- Are we improving the sustainability of our community?
If those are the key factors, these could be the key metrics:
# Who is contributing merged code each quarter? More origins == Better.
## WMF / WMDE / other Wikimedia / companies / OSS projects / independents
## Location of contributors (based on provided data)
# Time to resolve Gerrit changes. Shorter == Better.
## Authored by WMF/WMDE employees vs independent. Should be the same.
## Merged vs rejected. Similar progress?
## Best projects vs bottlenecks.
# Queue of open Gerrit change requests in relation to total amount of contributions. Shorter == Better.
## Same points as above.
# New contributors with 1 / 2-5 / 6+ changes submitted in the past 3 months. More == Better.
## % of merged / rejected.
## Who is growing / stable / vanishing?
# Age of contributors since they started in the project. Small and regular decline == Better.
## Number of contributions from each age group in the past quarters.
## WMF / non-WMF ratio for each age group.
Please have your say. I will be updating http://www.mediawiki.org/wiki/Community_metrics#Key_metrics following the discussion.
[0] http://wikitravel.org/en/Cantonese_phrasebook#Phrase_list ;)
-- Quim Gil Technical Contributor Coordinator @ Wikimedia Foundation http://www.mediawiki.org/wiki/User:Qgil
On 08/13/2013 01:36 PM, Yury Katkov wrote:
Hi Quim!
Looks reasonable. I would also differentiated Wikimedia chapters and other 3rd parties.
Yes, agreed. I just wanted to keep the writing simple with "WMF and non-WMF". If we implement the first proposed metric we will have a finer level of detail:
# Who is contributing merged code each quarter? More origins == Better.
## WMF / WMDE / other Wikimedia / companies / OSS projects / independents
Dear all,
On Mon, 12-08-2013 at 19:21 +0200, Quim Gil wrote:
你好, 近排點呀? [0]
At Wikimania there was an improvised lunch-meeting about community metrics with Jesús González-Barahona (Bitergia), Sumana, RobLa and myself. The main conclusion was that http://korma.wmflabs.org needs to show a very few key metrics that could drive decisions affecting our strategy and resources.
Yes, in BI terminology these metrics are KPIs (Key Performance Indicators), so we can reuse that name.
Let's agree first on key factors to watch, in the scope of projects deployed in Wikimedia servers:
- Are the teams more efficient at processing contributions?
- Is the share of non-WMF contributions growing?
- Are WMF and non-WMF contributions treated equally?
- Are the attraction and retention of new contributors improving?
- Are we improving the sustainability of our community?
If those are the key factors, these could be the key metrics:
# Who is contributing merged code each quarter? More origins == Better.
## WMF / WMDE / other Wikimedia / companies / OSS projects / independents
## Location of contributors (based on provided data)
# Time to resolve Gerrit changes. Shorter == Better.
## Authored by WMF/WMDE employees vs independent. Should be the same.
## Merged vs rejected. Similar progress?
## Best projects vs bottlenecks.
# Queue of open Gerrit change requests in relation to total amount of contributions. Shorter == Better.
## Same points as above.
# New contributors with 1 / 2-5 / 6+ changes submitted in the past 3 months. More == Better.
## % of merged / rejected.
## Who is growing / stable / vanishing?
# Age of contributors since they started in the project. Small and regular decline == Better.
## Number of contributions from each age group in the past quarters.
## WMF / non-WMF ratio for each age group.
Quim, thinking about those metrics, they are all related to Gerrit and centered around people more than activity.
Any thoughts about the other data sources? From git, the commit count alone is a bit rough, but combined with SLOC and files it is a good indicator of development activity. The trends for these metrics (week, month and year) are also a good indicator of how the project is evolving. For example, the indicators for the last 365 days show that, in general, activity is growing in Wikimedia projects:
http://korma.wmflabs.org/browser/scm.html shows 63% in commits, 159% in authors and 103% in files.
Looking at these metrics, why are commits growing less than authors? Thinking about the metrics, their trends and their relations, we can reach pretty interesting questions for understanding in depth how Wikimedia development is going.
Looking at the ITS, one metric that I think we are missing is *pending* issues (issues opened but not yet closed). This concept could be applied to other data sources as well: the pending work to be done.
Tendencies and pending are two areas in which we can find interesting metrics that could evolve into KPIs.
In the ITS we also have the evolution of time-to-close for the 99th, 95th, 50th and 25th percentiles of tickets. "Time to" is always a good SLA (Service Level Agreement) indicator and, in the end, the SLA is a strategic position for the community.
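Both ideas are cheap to compute once raw ticket timestamps are available. A minimal sketch, assuming a hypothetical input of one (opened, closed) pair per ticket: the pending backlog on a day is just opens-so-far minus closes-so-far, and the time-to-close percentiles fall out of the sorted deltas:

from datetime import date

# Hypothetical input: one (opened, closed) pair per ticket,
# closed=None while the ticket is still open.
tickets = [
    (date(2013, 1, 5), date(2013, 1, 20)),
    (date(2013, 2, 1), None),
    (date(2013, 2, 10), date(2013, 6, 1)),
]

def pending_on(day, tickets):
    # Pending = opened on or before `day`, not yet closed by `day`.
    return sum(1 for opened, closed in tickets
               if opened <= day and (closed is None or closed > day))

def time_to_close_percentile(tickets, pct):
    # Days from open to close for resolved tickets, at percentile pct.
    deltas = sorted((closed - opened).days
                    for opened, closed in tickets if closed is not None)
    if not deltas:
        return None
    idx = int(round(pct / 100.0 * (len(deltas) - 1)))
    return deltas[idx]

print(pending_on(date(2013, 3, 1), tickets))   # backlog on a given day
for pct in (25, 50, 95, 99):
    print(pct, time_to_close_percentile(tickets, pct))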
In the MLS, with the current metrics, we can see that around 50% of messages generate a thread (get some response). Another interesting metric is that each person sends around 20 messages (the ratio between total messages and senders). So, in general, the mailing lists get feedback and people participate with several messages. Here the "time to" metric shows that 95% of messages are attended to in less than 20 days, and that the time to attention has grown in the last year. Is the care of the mailing lists going down?
In demographics we can see pretty good information about the community across all data sources. Mailing list members are going down as well (are mailing lists used less? where is the support of the projects moving to?), new SCM members are also going down, but ITS members are growing. Are the software products starting to mature, with more work centered around quality?
Quim, I think the starting point you propose is a good one, especially the goal questions. Goal (improve strategy decisions and resource management) / Question / Metric is the right methodology to follow. With this email I just want to broaden the discussion a bit.
Keep in mind that the deep review of all data in the current Community Dashboard is pending, so take all the numbers above with a grain of salt.
Cheers!
Please have your say. I will be updating http://www.mediawiki.org/wiki/Community_metrics#Key_metrics following the discussion.
[0] http://wikitravel.org/en/Cantonese_phrasebook#Phrase_list ;)
On 08/16/2013 04:12 AM, Alvaro del Castillo wrote:
Dear all,
On Mon, 12-08-2013 at 19:21 +0200, Quim Gil wrote:
# Who is contributing merged code each quarter? More origins == Better.
## WMF / WMDE / other Wikimedia / companies / OSS projects / independents
## Location of contributors (based on provided data)
# Time to resolve Gerrit changes. Shorter == Better.
## Authored by WMF/WMDE employees vs independent. Should be the same.
## Merged vs rejected. Similar progress?
## Best projects vs bottlenecks.
# Queue of open Gerrit change requests in relation to total amount of contributions. Shorter == Better.
## Same points as above.
# New contributors with 1 / 2-5 / 6+ changes submitted in the past 3 months. More == Better.
## % of merged / rejected.
## Who is growing / stable / vanishing?
# Age of contributors since they started in the project. Small and regular decline == Better.
## Number of contributions from each age group in the past quarters.
## WMF / non-WMF ratio for each age group.
Quim, thinking about those metrics, they are all related to Gerrit and centered around people more than activity.
Yes, after considering different possibilities I decided to focus on code contributions.
And yes, there is a lot of stress put on people. This is a community project and I personally think that the key metrics should focus on the population of this community. Having more & better contributors should eventually lead to having more & better software. The WMF can directly influence the amount of code produced by hiring more developers, but growing the community with volunteers is trickier, and this is why it is worth looking at the related data closely.
Then again, it is also true that it would be useful to know e.g. LOC contributed in the past quarter/year by the WMF, independents and other orgs. But maybe this could be integrated into "Who is contributing merged code each quarter"?
Any thoughts about the other data sources?
Outside of code contributions, the only candidate to become a KPI that came to mind was our effectiveness in responding to Bugzilla reports. Maybe you are right that its priority is higher than the age of code contributors. Thoughts?
From git, the commit count alone is a bit rough, but combined with SLOC and files it is a good indicator of development activity. The trends for these metrics (week, month and year) are also a good indicator of how the project is evolving. For example, the indicators for the last 365 days show that, in general, activity is growing in Wikimedia projects:
http://korma.wmflabs.org/browser/scm.html shows 63% in commits, 159% in authors and 103% in files.
Looking at these metrics, why are commits growing less than authors? Thinking about the metrics, their trends and their relations, we can reach pretty interesting questions for understanding in depth how Wikimedia development is going.
Yes, but we have to keep the initial list of KPIs short. Data about activity growth is good, but it is already somewhat visible from the surface. The KPIs proposed are less evident, more underground. They should report the health of our project by sensing its foundations.
Looking at the ITS, one metric that I think we are missing is *pending* issues (issues opened but not yet closed). This concept could be applied to other data sources as well: the pending work to be done.
Tendencies and pending are two areas in which we can find interesting metrics that could evolve into KPIs.
In the ITS we also have the evolution of time-to-close for the 99th, 95th, 50th and 25th percentiles of tickets. "Time to" is always a good SLA (Service Level Agreement) indicator and, in the end, the SLA is a strategic position for the community.
Yes, see my comment above about our effectiveness in responding to Bugzilla feedback. The pending work in Gerrit is already reflected in the KPIs proposed.
In the MLS, with the current metrics, we can see that around 50% of messages generate a thread (get some response). Another interesting metric is that each person sends around 20 messages (the ratio between total messages and senders). So, in general, the mailing lists get feedback and people participate with several messages. Here the "time to" metric shows that 95% of messages are attended to in less than 20 days, and that the time to attention has grown in the last year. Is the care of the mailing lists going down?
In demographics we can see pretty good information about the community across all data sources. Mailing list members are going down as well (are mailing lists used less? where is the support of the projects moving to?),
Data about mailing lists, IRC and even wikis is good to have, but I'm not sure they are key. These channels support the real deal: code repositories (and Bugzilla too). For instance, a percentage of the discussions that happen on Gerrit now probably took place on wikitech-l before.
New SCM members are also going down, but ITS members are growing. Are the software products starting to mature, with more work centered around quality?
http://korma.wmflabs.org/browser/demographics.html
That peak in SCM births 12-15 months ago might be related to the move from SVN to Git plus a peak of WMF hiring. The numbers in the past year are more than double any previous number of newcomers.
The usage of Bugzilla has clearly been growing in the past year, both in terms of new and regular users. Probably one of the causes is that we are releasing more code, sooner and more often, and our dev teams and dedicated bugmaster are succeeding in selling Bugzilla as the main feedback channel.
But we are digressing. :) Let's go back to the 5-6 KPIs that we really want to see. Any new proposal should name the KPI it would displace.
Please have your say. I will be updating http://www.mediawiki.org/wiki/Community_metrics#Key_metrics following the discussion.
Hi!
On Mon, 19-08-2013 at 13:05 -0700, Quim Gil wrote:
On 08/16/2013 04:12 AM, Alvaro del Castillo wrote:
Dear all,
On Mon, 12-08-2013 at 19:21 +0200, Quim Gil wrote:
# Who is contributing merged code each quarter? More origins == Better.
## WMF / WMDE / other Wikimedia / companies / OSS projects / independents
## Location of contributors (based on provided data)
# Time to resolve Gerrit changes. Shorter == Better.
## Authored by WMF/WMDE employees vs independent. Should be the same.
## Merged vs rejected. Similar progress?
## Best projects vs bottlenecks.
# Queue of open Gerrit change requests in relation to total amount of contributions. Shorter == Better.
## Same points as above.
# New contributors with 1 / 2-5 / 6+ changes submitted in the past 3 months. More == Better.
## % of merged / rejected.
## Who is growing / stable / vanishing?
# Age of contributors since they started in the project. Small and regular decline == Better.
## Number of contributions from each age group in the past quarters.
## WMF / non-WMF ratio for each age group.
Quim, thinking about those metrics, they are all related to Gerrit and centered around people more than activity.
Yes, after considering different possibilities I decided to focus on code contributions.
And yes, there is a lot of stress put on people. This is a community project and I personally think that the key metrics should focus on the population of this community. Having more & better contributors should eventually lead to having more & better software. The WMF can directly influence the amount of code produced by hiring more developers, but growing the community with volunteers is trickier, and this is why it is worth looking at the related data closely.
Then again, it is also true that it would be useful to know e.g. LOC contributed in the past quarter/year by the WMF, independents and other orgs. But maybe this could be integrated into "Who is contributing merged code each quarter"?
Ok!
Any thoughts about the other data sources?
Outside of code contributions, the only candidate to become a KPI that came to mind was our effectiveness in responding to Bugzilla reports. Maybe you are right that its priority is higher than the age of code contributors. Thoughts?
From git, the commit count alone is a bit rough, but combined with SLOC and files it is a good indicator of development activity. The trends for these metrics (week, month and year) are also a good indicator of how the project is evolving. For example, the indicators for the last 365 days show that, in general, activity is growing in Wikimedia projects:
http://korma.wmflabs.org/browser/scm.html shows 63% in commits, 159% in authors and 103% in files.
Looking at these metrics, why are commits growing less than authors? Thinking about the metrics, their trends and their relations, we can reach pretty interesting questions for understanding in depth how Wikimedia development is going.
Yes, but we have to keep the initial list of KPIs short.
Sure! I just wanted to make sure that the complete scenario was analyzed. It is great to have a short list of KPIs so we can concentrate on them.
Data about activity growth is good, but it is already somewhat visible from the surface. The KPIs proposed are less evident, more underground. They should report the health of our project by sensing its foundations.
Looking at the ITS, one metric that I think we are missing is *pending* issues (issues opened but not yet closed). This concept could be applied to other data sources as well: the pending work to be done.
Tendencies and pending are two areas in which we can find interesting metrics that could evolve into KPIs.
In the ITS we also have the evolution of time-to-close for the 99th, 95th, 50th and 25th percentiles of tickets. "Time to" is always a good SLA (Service Level Agreement) indicator and, in the end, the SLA is a strategic position for the community.
Yes, see my comment above about our effectiveness in responding to Bugzilla feedback. The pending work in Gerrit is already reflected in the KPIs proposed.
In the MLS, with the current metrics, we can see that around 50% of messages generate a thread (get some response). Another interesting metric is that each person sends around 20 messages (the ratio between total messages and senders). So, in general, the mailing lists get feedback and people participate with several messages. Here the "time to" metric shows that 95% of messages are attended to in less than 20 days, and that the time to attention has grown in the last year. Is the care of the mailing lists going down?
In demographics we can see pretty good information about the community across all data sources. Mailing list members are going down as well (are mailing lists used less? where is the support of the projects moving to?),
Data about mailing lists, IRC and even wikis is good to have, but I'm not sure they are key. These channels support the real deal: code repositories (and Bugzilla too). For instance, a percentage of the discussions that happen on Gerrit now probably took place on wikitech-l before.
Ok, so MLS and IRC will be kept in their current state, and SCR will be merged with SCM in the code contributions panel.
New SCM members are also going down, but ITS members are growing. Are the software products starting to mature, with more work centered around quality?
http://korma.wmflabs.org/browser/demographics.html
That peak in SCM births 12-15 months ago might be related to the move from SVN to Git plus a peak of WMF hiring. The numbers in the past year are more than double any previous number of newcomers.
The usage of Bugzilla has clearly been growing in the past year, both in terms of new and regular users. Probably one of the causes is that we are releasing more code, sooner and more often, and our dev teams and dedicated bugmaster are succeeding in selling Bugzilla as the main feedback channel.
But we are digressing. :) Let's go back to the 5-6 KPIs that we really want to see. Any new proposal should name the KPI it would displace.
Great!!!!
Cheers
Please have your say. I will be updating http://www.mediawiki.org/wiki/Community_metrics#Key_metrics following the discussion.
In demographics we can see pretty good information about the community across all data sources. Mailing list members are going down as well (are mailing lists used less? where is the support of the projects moving to?),
Data about mailing lists, IRC and even wikis is good to have, but I'm not sure they are key. These channels support the real deal: code repositories (and Bugzilla too). For instance, a percentage of the discussions that happen on Gerrit now probably took place on wikitech-l before.
I find that hard to believe (unless you are going back to the pre-Special:CodeReview era). Perhaps there is an increase in people talking face to face or in internal WMF meetings instead of using wikitech-l. Perhaps the (possibly imagined) increase in bikeshedding is making people less likely to post to the list.
-bawolff
On Mon, 2013-08-19 at 13:05 -0700, Quim Gil wrote:
Outside of code contributions, the only candidate to become a KPI that came to mind was our effectiveness in responding to Bugzilla reports. Maybe you are right that its priority is higher than the age of code contributors. Thoughts?
Referring to Bugzilla only:
As listed on http://www.mediawiki.org/wiki/Talk:Community_metrics , I'd be interested in two things:
Average time to resolve a ticket: Delta between creation of a report and setting the status RESOLVED. Preferably separately for "real" bugs and severity=enhancement. Might also make sense to have a specific one for setting the status RESOLVED && resolution FIXED, to only cover code fixes.
And average time to respond to new tickets: delta between the creation of a report and the next change (setting priority, adding an initial comment, etc.) which is NOT done by the original reporter. If that's possible.
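To make the two deltas concrete, a hedged sketch; the record layout below is made up for illustration and does not mirror Bugzilla's actual schema, so treat every field name as an assumption:

from datetime import datetime

# Hypothetical record layout; real data would come from Bugzilla's
# database or API, and the field names here are made up.
bugs = [{
    'created': datetime(2013, 8, 1, 9, 0),
    'reporter': 'alice',
    'severity': 'normal',
    'events': [(datetime(2013, 8, 2, 10, 0), 'bob')],  # (when, who)
    'resolved': datetime(2013, 8, 5, 12, 0),  # status set to RESOLVED
}]

def avg(values):
    values = list(values)
    return sum(values) / len(values) if values else None

def avg_days_to_resolve(bugs, enhancement=False):
    # Delta between creating a report and setting status RESOLVED,
    # computed separately for "real" bugs and severity=enhancement.
    return avg((b['resolved'] - b['created']).days
               for b in bugs
               if b['resolved'] and (b['severity'] == 'enhancement') == enhancement)

def avg_days_to_first_response(bugs):
    # Delta to the first change NOT made by the original reporter.
    deltas = []
    for b in bugs:
        responses = [when for when, who in b['events'] if who != b['reporter']]
        if responses:
            deltas.append((min(responses) - b['created']).days)
    return avg(deltas)

print(avg_days_to_resolve(bugs))        # "real" bugs only
print(avg_days_to_first_response(bugs))

Restricting avg_days_to_resolve to resolution FIXED would just be one more condition in the filter.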
All in all, the proposed metrics look really good to me, great work! Looking forward to seeing most of that implemented!
andre
Hi all,
Recapping a conversation that I had in person with Quim:
On Mon, Aug 12, 2013 at 10:21 AM, Quim Gil qgil@wikimedia.org wrote:
# Queue of open Gerrit change requests in relation to total amount of contributions. Shorter == Better.
## Same points as above.
I'm personally interested in the absolute size of the Gerrit change request queue, not in relation to anything else, but just as an absolute number (over time). When this number gets large, we should all probably be interested in the following:
* Breakdown by age of changeset.
* Breakdown by project family (e.g. MediaWiki core, Wikimedia deployed extensions, all MediaWiki extensions hosted in Gerrit).
Rob
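A rough sketch of how that absolute number and both breakdowns could be computed, reusing fetch_changes() from the sketch earlier in the thread; the age buckets and project-family prefixes are illustrative guesses, not Gerrit conventions:

from collections import Counter
from datetime import datetime

AGE_BUCKETS = [(7, '< 1 week'), (30, '< 1 month'), (365, '< 1 year')]

def age_bucket(days):
    for limit, label in AGE_BUCKETS:
        if days < limit:
            return label
    return '1 year +'

def family(project):
    # Illustrative guesses at repository naming; adjust to reality.
    if project == 'mediawiki/core':
        return 'MediaWiki core'
    if project.startswith('mediawiki/extensions/'):
        return 'MediaWiki extensions'
    return 'other'

def queue_report(changes, today=None):
    today = today or datetime.utcnow()
    by_age, by_family = Counter(), Counter()
    for c in changes:
        created = datetime.strptime(c['created'][:19], '%Y-%m-%d %H:%M:%S')
        by_age[age_bucket((today - created).days)] += 1
        by_family[family(c['project'])] += 1
    return len(changes), by_age, by_family

# Using fetch_changes() from the earlier sketch:
# total, by_age, by_family = queue_report(fetch_changes('status:open'))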
On Tue, 20 Aug 2013 00:48:09 +0200, Rob Lanphier robla@wikimedia.org wrote:
I'm personally interested in the absolute size of the Gerrit change request queue, not in relation to anything else, but just as an absolute number (over time). When this number gets large, we should all probably be interested in the following:
- Breakdown by age of changeset.
- Breakdown by project family (e.g. MediaWiki core, Wikimedia deployed extensions, all MediaWiki extensions hosted in Gerrit).
This already exists, by the way. https://www.mediawiki.org/wiki/Gerrit/Navigation#Reports
<quote name="Bartosz Dziewoński" date="2013-08-20" time="12:16:09 +0200">
On Tue, 20 Aug 2013 00:48:09 +0200, Rob Lanphier robla@wikimedia.org wrote:
I'm personally interested in the absolute size of the Gerrit change request queue, not in relation to anything else, but just as an absolute number (over time). When this number gets large, we should all probably be interested in the following:
- Breakdown by age of changeset.
- Breakdown by project family (e.g. MediaWiki core, Wikimedia deployed extensions, all MediaWiki extensions hosted in Gerrit).
This already exists, by the way. https://www.mediawiki.org/wiki/Gerrit/Navigation#Reports
Neat.
Feature request: Make a nice graph of the data that shows those numbers over time (per day I suppose?). I hear we have some analytics software we can use... ;)
Greg
On Tue, 20 Aug 2013 22:13:41 +0200, Greg Grossmeier greg@wikimedia.org wrote:
Feature request: Make a nice graph of the data that shows those numbers over time (per day I suppose?). I hear we have some analytics software we can use...
Last time I graphed the data about changesets waiting to be merged, it showed a nice linear growth since the start of these reports (February, IIRC). I don't think I have it saved anywhere, but it should be easy to rebuild.
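For what it's worth, once a daily series exists the graph itself is only a few lines of matplotlib; the numbers below are placeholders, not real data:

from datetime import date
import matplotlib.pyplot as plt

# Placeholder series: day -> number of open changesets that day.
series = {
    date(2013, 2, 1): 410,
    date(2013, 5, 1): 520,
    date(2013, 8, 1): 640,
}

days = sorted(series)
plt.plot(days, [series[d] for d in days], marker='o')
plt.ylabel('Open Gerrit changesets')
plt.title('Gerrit review queue over time')
plt.gcf().autofmt_xdate()  # tilt the date labels so they stay readable
plt.savefig('gerrit_queue.png')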
Thank you for all the feedback!
http://www.mediawiki.org/wiki/Community_metrics#Key_performance_indicators has been updated, hopefully including all the agreed points.
There are open questions, but they are related to implementation details, e.g. what type of chart should be used in each case. We will get back to the discussion and the work at the wiki page and the Analytics list.