On October 30, we updated how metrics are calculated for the Vital Signs dashboards [1]. As a result, some metrics show an apparent jump up or down starting October 30, because we did not recalculate the existing historical data. We plan to recalculate all the historical data after more work is done on the backend over the next month.
Here are the changes to the metrics:
- Namespace Edits and Pages Created now include pages in all namespaces and pages that have been deleted. The plots for these metrics generally show a step up. On wikis where most of the activity occurs on pages outside namespace '0' (like Meta and Commons), you can see a dramatic difference in the data [2].
- We exclude bots in Rolling Active Editor, Rolling Surviving New Active Editor and Rolling Recurring Old Active Editor. The plots for these metrics show a small step down. This change is not as conspicuous as the previous one.
If you were relying on this data right now, and need it consistent across time, please speak up.
[1] https://metrics.wmflabs.org/static/public/dash/
[2] https://metrics.wmflabs.org/static/public/dash/#projects=commonswiki,metawik...
I just spoke to Kevin about this and there may be a disconnect around this.
For edits and pages created, it's important for us to be able to separate out main namespace edits and pages created from all other namespaces. Actual content creation happens on the main namespace, so lumping all namespaces together actually reduces visibility into content creation.
Howie
On 4 November 2014 11:11, Howie Fung hfung@wikimedia.org wrote:
For edits and pages created, it's important for us to be able to separate out main namespace edits and pages created from all other namespaces. Actual content creation happens on the main namespace, so lumping all namespaces together actually reduces visibility into content creation.
Yeah; I was expecting this to be using $wgContentNamespaces (which varies per wiki), rather than hard-coding either NS0 (which is wrong e.g. for Commons or Wikidata) or all namespaces (which as Howie says is wrong because it covers non-content work).
J.
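The three namespace policies discussed above (main namespace only, all namespaces, and a per-wiki content list in the spirit of $wgContentNamespaces) can be compared with a minimal sketch. Everything here is hypothetical: the `CONTENT_NAMESPACES` mapping, the `count_edits` helper, and the sample data are illustrative assumptions, not the dashboard's actual code or the real per-wiki configuration.

```python
# Hypothetical stand-in for per-wiki $wgContentNamespaces values.
# The namespace sets below are illustrative assumptions, not real config.
CONTENT_NAMESPACES = {
    "enwiki": {0},
    "commonswiki": {0, 6},  # assumes File (6) counts as content on Commons
}


def count_edits(edits, wiki, mode):
    """Count edits on one wiki under a namespace policy.

    edits: iterable of (wiki, namespace_id) tuples.
    mode:  "ns0" (main namespace only), "all" (every namespace),
           or "content" (per-wiki content namespaces).
    """
    if mode == "ns0":
        allowed = {0}
    elif mode == "content":
        # Fall back to the main namespace when a wiki has no entry.
        allowed = CONTENT_NAMESPACES.get(wiki, {0})
    elif mode == "all":
        allowed = None  # no filtering
    else:
        raise ValueError(f"unknown mode: {mode}")
    return sum(
        1
        for w, ns in edits
        if w == wiki and (allowed is None or ns in allowed)
    )


# Made-up sample edits: (wiki, namespace_id)
sample = [
    ("commonswiki", 6), ("commonswiki", 6), ("commonswiki", 0),
    ("commonswiki", 4), ("enwiki", 0), ("enwiki", 1),
]

for mode in ("ns0", "content", "all"):
    print(mode, count_edits(sample, "commonswiki", mode))
```

On the made-up Commons data, the main-namespace count misses the File edits and the all-namespaces count also picks up the project-namespace edit, which is the gap between the two hard-coded options described above.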
Understood for page creations. The metric is named "Page creations". We ought to have a metric called "Content page creations" or "Unique content page creators".
One bit of complication: How do you feel about the draft namespace for enwiki? Should it be included in content page creations?
As for edits, the correlation is so strong between edits to content and edits to other namespaces that it doesn't matter which we use when looking for trends.[1]
1. http://meta.wikimedia.org/wiki/Research:Refining_the_definition_of_monthly_a...
-Aaron
James D. Forrester Product Manager, Editing Wikimedia Foundation, Inc.
jforrester@wikimedia.org | @jdforrester
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
On 4 November 2014 12:00, Aaron Halfaker ahalfaker@wikimedia.org wrote:
Understood for page creations. The metric is named "Page creations". We ought to have a metric called "Content page creations" or "Unique content page creators".
Yeah, having both would be great but I don't want to demand the world on a stick. ;-)
One bit of complication: How do you feel about the draft namespace for enwiki? Should it be included in content page creations?
That should be in $wgContentNamespaces but unfortunately isn't (see the config file http://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php). I'll get that fixed.
As for edits, the correlation is so strong between edits to content and edits to other namespaces that it doesn't matter which we use when looking for trends.[1]
http://meta.wikimedia.org/wiki/Research:Refining_the_definition_of_monthly_a...
Fair point.
J.
Created tracking bug -- please add yourselves to the cc if desired.
https://bugzilla.wikimedia.org/show_bug.cgi?id=72973
-Toby
Thanks Toby! :-)
To add some context to the present approach: you may remember that when we defined the Editor Model metrics, we started from the highest possible level of aggregation (i.e. all namespaces combined, archive table included). See the rationale below from a previous email exchange:
We tried to stick to two general principles:
1) we want to count users making contributions to a project as a whole. Establishing that only “content activity” should be considered means that someone uploading a picture, editing a template, drafting an article outside of ns0 (we have a new Draft namespace), writing or contributing to a new policy, or helping coordinate a wikiproject, i.e. all fundamental activities that contribute to the growth of the project, would not be counted as an editor. By this token, someone writing an entire article outside of the main namespace would not be included as an editor, while a vandal fighter only reverting edits at the push of a button would be considered a contributor. The point I’m trying to make is that establishing what “content” means is very arbitrary, and we should have a measure of overall participation in a project, followed by more granular metrics by type of contribution (see next point).
2) instead of starting with a list of exclusions (i.e. we will only measure a subset of ns0 edits on articles meeting specific criteria such as countable pages), we will introduce breakdowns that inform us about specific types of activity. “Namespace” is a possible proxy for types of content, but not necessarily the best or the only one. One day, I’d like to be able to monitor active typo-fixers or template-editors, but I believe we should start from the highest possible level and count total activity or total unique editors before breaking them down.
Adding a NS dimension or other criteria to filter top-level metrics sounds like a totally legitimate request as a metric breakdown.
Dario
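The "top-level metric first, namespace as a breakdown" approach described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `editors_by_namespace` helper and the sample edits are hypothetical and are not the Editor Model implementation.

```python
from collections import defaultdict


def editors_by_namespace(edits):
    """Return (total unique editors, {namespace_id: unique editors}).

    edits: iterable of (user, namespace_id) tuples covering all namespaces.
    The top-level total is computed first; the per-namespace figures are
    a breakdown of the same data, not a filter applied up front.
    """
    all_editors = set()
    per_namespace = defaultdict(set)
    for user, ns in edits:
        all_editors.add(user)
        per_namespace[ns].add(user)
    breakdown = {ns: len(users) for ns, users in per_namespace.items()}
    return len(all_editors), breakdown


# Made-up sample: 0 = main, 10 = Template, 118 = Draft (enwiki numbering)
sample_edits = [
    ("alice", 0), ("alice", 10), ("bob", 10),
    ("carol", 118), ("carol", 118),
]

total, breakdown = editors_by_namespace(sample_edits)
print(total, breakdown)
```

Note that the breakdown does not sum to the total: an editor active in two namespaces appears in both rows but is counted once at the top level.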
+1
We should always be careful about baking assumptions of enwiki origin into metrics and tools that are intended for use across our projects.
- J