Hello,
Several things come to mind:
Top Views provides much of this info, digested in a way that should make it
not too hard to calculate what you want. It gets its data from the Pageview
API and does some useful filtering:
https://tools.wmflabs.org/topviews/?project=de.wikipedia.org&platform=a…
You probably know this, but let me point out that there are pageview dumps
from which you could calculate 1, 2 and 3. We understand that your request
is perhaps precisely about not having to do this calculation yourself for
dewiki, but just in case you did not know, all detailed pageview data for a
project is available:
https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Pageviews#D…
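In case it helps, here is a minimal sketch of how 1, 2 and 3 could be
computed once the dump data has been aggregated. It assumes you have already
built a dict mapping article title to total views over your period (called
`views` here). That name, the three function names and the sample numbers
are made up for illustration, and the sketch does no namespace or bot
filtering.

```python
def articles_for_share(views, share):
    """Request 1: most-viewed titles that together reach `share` of all views."""
    total = sum(views.values())
    ranked = sorted(views.items(), key=lambda kv: kv[1], reverse=True)
    picked, running = [], 0
    for title, count in ranked:
        if running >= share * total:
            break
        picked.append(title)
        running += count
    return picked


def share_of_top(views, top_fraction):
    """Request 2: share of all views that goes to the top `top_fraction` of pages."""
    total = sum(views.values())
    ranked = sorted(views.values(), reverse=True)
    # Always count at least one page, even for very small fractions.
    n = max(1, round(top_fraction * len(ranked)))
    return sum(ranked[:n]) / total


def pages_below(views, threshold):
    """Request 3: titles with fewer than `threshold` views in the period."""
    return sorted(title for title, count in views.items() if count < threshold)


# Hypothetical sample data: title -> views over the chosen period.
views = {"A": 500, "B": 300, "C": 150, "D": 40, "E": 10}

top_90 = articles_for_share(views, 0.9)   # titles covering 90% of views
top_20_share = share_of_top(views, 0.2)   # share going to the top 20% of pages
rarely_viewed = pages_below(views, 100)   # titles with fewer than 100 views
```

Pages that never appear in the dumps at all (zero views) would have to be
added to `views` from a separate list of all articles before request 3 is
complete.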
If you think Top Views is not sufficient for your use case, please do file
a phab ticket with your request:
https://phabricator.wikimedia.org/maniphest/task/edit/?title=Wikistats%20Ne…
Finally, I think any pageview-focused effort needs to take into account the
very significant number of bots that are not marked as such in our system
(the top1000 list is an example of data skewed by bot traffic). Top Views
does some "crowdsourced" filtering and, again, I think it will be very
useful for your use case.
Thanks,
Nuria
On Thu, Feb 7, 2019 at 1:47 AM WikiPeter-HH <WikiPeter-HH(a)online.de> wrote:
Hi,
In light of the current switch from Wikistats 1 to Wikistats 2 I would
like to express a strong desire to get some additional features for the
statistics. The rationale for this request is described below:
1. How many articles account for 90 / 95 / 99 percent of all page views
over a certain period? Which articles are these? (live articles excluding
WP:xx and Spezial:xxx etc.)
2. What share of page views goes to the Top 5% / Top 10% of our pages?
3. List of pages with fewer than x views per year (x = e.g. 12, 25, 100)
*Reasoning / Rationale*:
I am limiting myself to the German Wikipedia, assuming that the situation
is similar in other language WPs.
In the German Wikipedia we now have more than 2.2 million articles.
Keeping up the quality of these articles produces an enormous maintenance
workload. We clearly have too few people to do that maintenance work.
This means that we have to focus our maintenance efforts on those
articles that really matter, and those are many more than the list of 1000
we get from the Top-Views stats. The reports suggested above will give us
exactly the information required to have an informed discussion about the
articles we should focus on. And the report under #3 will give us a means
to identify articles which we could either delete or clearly label as 'out
of maintenance'.
We do know that certain articles are 'en vogue' for a short period, as
they relate to current news topics. Therefore we need to be able to run the
above reports over a longer period (at least a year, maybe two) to identify
those articles which are really the long-term favourites.
Best regards
Peter
my user page: Wikipeter-HH
<https://de.wikipedia.org/wiki/Benutzer:Wikipeter-HH>
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics