On Tue, 2013-11-12 at 14:26 -0800, Quim Gil wrote:
On 11/12/2013 11:38 AM, Federico Leva (Nemo) wrote:
Alvaro del Castillo, 12/11/2013 19:13:
Some weirdnesses I noted:
- Total edit counts don't match, compare e.g. Krinkle, Kgh, Jack,
Jeroen: http://stats.wikimedia.org/wikispecial/EN/TablesWikipediaMEDIAWIKI.htm#wikip... . This probably means that you are considering the wrong namespaces; please use content namespaces.
OK! I am taking a look at namespaces. It is easy to filter them using the API. The key is to understand which ones are content namespaces. Any recommendations?
See the definition: https://www.mediawiki.org/wiki/Content_namespace
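For what it's worth, the API can report which namespaces a wiki treats as content: a siteinfo query (action=query&meta=siteinfo&siprop=namespaces) flags them with a "content" key. Below is a minimal sketch of reading that flag. The function name and the sample reply are made up for illustration, and the sample is trimmed to a few namespaces; it is not the code of any existing tool.

```python
def content_namespace_ids(siteinfo):
    """Return the sorted IDs of namespaces flagged as content in a siteinfo reply."""
    namespaces = siteinfo["query"]["namespaces"]
    # The reply is normally an object keyed by namespace ID; accept a list too.
    entries = namespaces.values() if isinstance(namespaces, dict) else namespaces
    ids = []
    for ns in entries:
        # Legacy JSON marks content namespaces with an empty-string "content" key;
        # formatversion=2 uses a boolean instead. A missing key means "not content".
        flag = ns.get("content", False)
        if flag is True or flag == "":
            ids.append(ns["id"])
    return sorted(ids)


# Illustrative, hand-written sample in the legacy JSON shape (not real output).
SAMPLE_REPLY = {
    "query": {
        "namespaces": {
            "0": {"id": 0, "content": "", "*": ""},
            "2": {"id": 2, "*": "User"},
            "3": {"id": 3, "*": "User talk"},
            "100": {"id": 100, "content": "", "*": "Manual"},
            "102": {"id": 102, "content": "", "*": "Extension"},
        }
    }
}

print(content_namespace_ids(SAMPLE_REPLY))
```

On a wiki that only uses the default setup, this would come back with just namespace 0.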
Maybe the only namespaces worth filtering out are User / User_talk. All the rest are equally valuable in wikis of software projects (the natural scope of Metrics Grimoire and the precise case of mediawiki.org).
After looking at other MediaWiki sites, it seems that by default all content goes to namespace 0, so covering this namespace is enough in those cases.
In sites like mediawiki.org there are other content namespaces with lots of wiki pages. On mediawiki.org, namespaces 100+102 have the same number of pages as the default namespace 0 (7K).
Looking at the other namespaces, I am not sure whether they add information about activity or would just add noise to the metrics.
Quim, if you want, I can create a viz with all namespaces, or add an option to the tool to use *all* the namespaces available in the MediaWiki-based site.
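Such an option could be as simple as leaving out the namespace filter. Here is a rough sketch, assuming the data is pulled with list=recentchanges (an assumption on my side, the tool may query something else); the function name and defaults are hypothetical. The rcnamespace parameter takes namespace IDs separated by "|", and omitting it returns changes from every namespace.

```python
from urllib.parse import urlencode


def recentchanges_params(namespaces=None, limit=500):
    """Build query parameters for list=recentchanges.

    namespaces=None means "no filter", i.e. all namespaces of the site.
    """
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "user|timestamp|title",
        "rclimit": str(limit),
        "format": "json",
    }
    if namespaces is not None:
        # Namespace IDs are joined with "|", e.g. "0|100|102".
        params["rcnamespace"] = "|".join(str(n) for n in sorted(namespaces))
    return params


# Content namespaces only (IDs taken from the mediawiki.org case above).
content_only = recentchanges_params([0, 100, 102])
# All namespaces: no rcnamespace parameter at all.
everything = recentchanges_params()

print(urlencode(content_only))
print(urlencode(everything))
```

The default could stay at namespace 0 (or the content namespaces), with the "all namespaces" mode being opt-in so it does not add noise to the metrics by accident.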
Cheers