Re: [Wikitech-l] Complete (basic) analysis of MediaWiki

5 Dec 2012

On Mon, 2012-12-03 at 23:46 +0100, Platonides wrote:
...
On 03/12/12 22:45, Jesus M. Gonzalez-Barahona wrote:
...
On Mon, 2012-12-03 at 21:04 +0100, Platonides wrote:
...
On 03/12/12 19:40, Federico Leva (Nemo) wrote:
...
That data is hardly useful; it doesn't explain what it refers to.
I guess I missed your message, Federico.
He forgot to keep you in CC, so it was sent only to the mailing list.
Thanks. I am subscribed to the list, but I archive it automatically,
so I didn't notice the message. I have now seen it.
...
...
...
I agree a glossary of each term would be useful.
It took me a while to realise that committers/closers/senders were the
terms used for users of git/bugzilla/mailing list.
Well, in fact committers are committers to the git repository (you also
have authors, see below), closers are specifically ticket closers (you
also have people opening or changing tickets) and senders are indeed
senders of mailing list messages. We'll work to make this much more
clear. Again, thanks for the suggestion.
...
They should track authors instead of committers, though (preferably
skipping merge commits)
We do both. In the summary (main) page you have authors in the summary
(orange) chart, since authors seemed more meaningful than committers.
Same for the "blue" chart in that page. In the source code page you have
committers (orange chart) and both authors and committers (blue charts),
for a more detailed comparison.
I was specifically thinking of the table of Top committers.
Also, the summary page has an authors graph but
http://bitergia.com/public/previews/2012_11_mediawiki/scm.html has a
committers one.
When the committer is different than the author there are usually two
options:

- It was a merge and the committer is 'gerrit'.
- The patchset was (slightly) changed by the committer from the original
  by the author.
There's also a less common one of committing a patch from a different
source, such as a bugzilla patch.
Thanks for the info.
...
Number of commits by gerrit are meaningless, and committers with little
changes inflate some numbers but are not too useful. Number of comments
/ approvals in gerrit would be more appropriate than that.
Equally, the author field of merges should IMHO be ignored since that's
not a commit which really touches the code (could be measured in a
different statistic), so many commits produce two entries.
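
A minimal sketch of the counting suggested above, assuming each commit has already been reduced to an author, a committer and a parent count (the `Commit` record and the sample names are hypothetical, not part of the analysis tools discussed in this thread). Commits with two or more parents are treated as merges and skipped, so a change merged by 'gerrit' is counted once, for its author:

```python
from collections import Counter
from typing import NamedTuple


class Commit(NamedTuple):
    # Hypothetical record; field names are illustrative only.
    author: str
    committer: str
    parents: int  # number of parent commits; 2 or more means a merge


def count_by_author(commits):
    """Count non-merge commits per author, ignoring the committer field."""
    return Counter(c.author for c in commits if c.parents < 2)


sample = [
    Commit("alice", "alice", 1),
    Commit("bob", "alice", 1),   # patch by bob, committed by alice
    Commit("bob", "gerrit", 2),  # merge recorded by gerrit: skipped
]
authors = count_by_author(sample)
```

With this sample, each of the two authors ends up with one commit and 'gerrit' does not appear at all.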
You're right, thanks. However, we intended to measure "raw commits", as
found in the git repo. One of the filters after this "first pass"
filters out those commits. You're right that in this case at least,
those numbers would be more appropriate.
...
...
...
Seems that Jesús did a fine job.
It could be polished quite a bit more with some local knowledge, merging
users, hiding bots, etc.
Thanks a lot. After this first stage we usually move on to identifying
bots, unifying identities, identifying large commits, classifying
different kinds of tickets, etc. In
this case, we were mainly testing the automated (first) stage: the
second one, as you mention, usually needs some detailed knowledge about
the project, and some manual intervention.
Sure. I wasn't intending to put pressure on you.
None taken ;-)
...
A few quirks I noticed:

- noreply@sourceforge.net is abusing its second place as sender (2525).
- I bet the two brion are the same, with different emails
  (4561+1285=5846, wow!)
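
As a rough illustration of merging the two entries, here is a sketch that folds per-address counts through a hand-written alias map. The two brion addresses are placeholders (the real ones are not given in this thread) and `merge_identities` is a hypothetical helper; only the counts come from the figures quoted above:

```python
from collections import Counter

# Placeholder addresses: the real ones are not listed in this thread.
ALIASES = {
    "brion@example.org": "brion",
    "brion@example.com": "brion",
}


def merge_identities(counts, aliases):
    """Sum counts of addresses that the alias map assigns to one person."""
    merged = Counter()
    for email, n in counts.items():
        merged[aliases.get(email, email)] += n
    return merged


raw = Counter({
    "brion@example.org": 4561,
    "brion@example.com": 1285,
    "noreply@sourceforge.net": 2525,
})
merged = merge_identities(raw, ALIASES)
print(merged["brion"])  # 5846
```

Addresses missing from the map are kept unchanged, so nothing is merged unless someone has confirmed the match by hand.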
We will look into that.
...
l10n-bot is indeed a bot.
On svn, localisation commits weren't done with a specific account, but
you can look for commit messages like «Localisation updates for core and
extension messages from translatewiki.net»
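
One possible way to flag these commits, sketched under the assumption that the bot can be matched by the name "l10n-bot" and that svn-era localisation commits start with "Localisation updates" (the prefix is taken from the message quoted above; real messages may vary, and the function name is made up):

```python
L10N_BOT = "l10n-bot"
# Assumed prefix, based on the example commit message quoted above.
L10N_PREFIX = "Localisation updates"


def is_localisation_commit(author, message):
    """Return True for commits by the bot or with a localisation message."""
    return author == L10N_BOT or message.lstrip().startswith(L10N_PREFIX)
```

For example, `is_localisation_commit("someone", "Localisation updates for core and extension messages from translatewiki.net")` returns `True`, while an ordinary commit message by the same person does not match.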
ok
...
Commits migrated from svn will have emails of
username@users.mediawiki.org. All commits done on git use a different
one. Moreover, some people have used different email addresses (see
other threads on the mailing list about this in ohloh).
We noticed this. But it is a bit risky to assume that
xxx@users.mediawiki.org is the same as xxx@otherdomain: probably in most
cases, the merge is perfect, but in some it could be wrong. We preferred
to provide the raw numbers since we didn't have resources to do the
manual checking needed. But we can try to produce accumulated stats
assuming that match: probably it would be accurate enough.
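
A sketch of what "assuming that match" could look like: any address whose local part equals an svn username seen at users.mediawiki.org is folded into that username. The function name and sample addresses are hypothetical, and the heuristic has exactly the risk described above, since two different people may share a local part:

```python
from collections import Counter

SVN_DOMAIN = "users.mediawiki.org"


def unify_by_username(counts):
    """Fold addresses into the svn username when the local part matches."""
    svn_users = {
        email.split("@", 1)[0]
        for email in counts
        if email.endswith("@" + SVN_DOMAIN)
    }
    merged = Counter()
    for email, n in counts.items():
        local = email.split("@", 1)[0]
        # Only merge when an svn account with the same name exists.
        key = local if local in svn_users else email
        merged[key] += n
    return merged


raw_counts = Counter({
    "xxx@users.mediawiki.org": 10,
    "xxx@otherdomain.example": 5,
    "yyy@otherdomain.example": 3,
})
unified = unify_by_username(raw_counts)
```

Here the two "xxx" addresses are accumulated into one entry, while "yyy" has no svn counterpart and keeps its full address.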
[...]
Regards,
Jesus.
