Re: [Analytics] Stat variances over time

18 Mar 2013


      On 03/17/2013 08:15 PM, Erik Zachte wrote:
...
C1b) Definition of what constitutes an article can even change more profoundly:

Recently the Swedish Wikipedia started to add bot-created
articles on a large scale, as has previously been done
on the Dutch and some other Wikipedias. These articles
are not bad, they cite sources and are accurate, so they
should be counted among the existing articles. But they
are not very popular, since they cover obscure topics.
This leads to the idea that perhaps we should count
articles that are actually read. It's easy to identify those
articles that are very short or don't cite sources, but in
order to count articles that aren't read, we need to be
sure that robots of all kinds are excluded.

In excluding robot accesses from the visitor statistics,
it's also relevant to ask whether accesses from editors
should be counted. If I'm a steam engine enthusiast and
write articles about every engineer and railroad, maybe
I'm the only audience for those articles. When I want to
know if my articles have any readership, I don't want to
include myself in the audience count. If I'm only writing
for my own reading, then I don't really need Wikipedia,
so the usefulness of Wikipedia starts when the second
human reader turns up.
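
To make the "second human reader" idea concrete, here is a minimal
sketch. The record layout (article, visitor, is_bot), the visitor
names and the editors_by_article mapping are all made up for
illustration; real logs would need some way to tell bots and
editors apart, which this sketch simply assumes is available.

```python
def articles_with_outside_readers(views, editors_by_article):
    """Return articles read by at least one human who is not one of
    the article's own editors (hypothetical data model)."""
    read = set()
    for article, visitor, is_bot in views:
        if is_bot:
            continue  # robot accesses never count as readership
        if visitor in editors_by_article.get(article, set()):
            continue  # the author reading their own article does not count
        read.add(article)
    return read


# Made-up example: alice wrote both articles, bob is an ordinary reader.
views = [
    ("Steam engine", "alice", False),
    ("Steam engine", "bob", False),
    ("Obscure railroad", "alice", False),
    ("Obscure railroad", "crawler", True),
]
editors_by_article = {
    "Steam engine": {"alice"},
    "Obscure railroad": {"alice"},
}
read_articles = articles_with_outside_readers(views, editors_by_article)
```

In this example only "Steam engine" counts as read, because the
other article was visited only by its author and a crawler.
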
Are there any ideas or strategies for a good audience
count?

If, instead of page views, we were to count the number
of different IP addresses, then each bot or editor would
just count as one identity, and this would reduce their
impact.
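
As a rough sketch of the difference between the two counts, assume
the access log can be reduced to (article, ip) pairs. That layout
and the addresses below are invented for illustration. Note that an
IP address is only an approximation of an identity, since several
people can share one address and one person can use several.

```python
from collections import defaultdict


def page_views_and_unique_ips(log):
    """Return {article: (page_views, distinct_ip_count)} for a list of
    (article, ip) pairs (hypothetical log layout)."""
    views = defaultdict(int)
    ips = defaultdict(set)
    for article, ip in log:
        views[article] += 1
        ips[article].add(ip)
    return {article: (views[article], len(ips[article])) for article in views}


# Made-up example: one bot or editor hits the page 50 times,
# two other addresses visit once each.
log = [("Steam engine", "192.0.2.1")] * 50 + [
    ("Steam engine", "198.51.100.7"),
    ("Steam engine", "203.0.113.9"),
]
stats = page_views_and_unique_ips(log)
```

Here the page view count is 52 while the distinct IP count is 3, so
the heavy visitor contributes one third of the audience count
instead of almost all of it.
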
If we can define a good measurement for audience, then
it would start a new statistics series, and we would not
have to worry about mismatches with any previous data.
-- 
   Lars Aronsson (lars@aronsson.se)
   Aronsson Datateknik - http://aronsson.se
