Hi,
In light of the current switch from Wikistats 1 to Wikistats 2, I would like to express a strong desire for some additional features in the statistics. The requests and their rationale are described below:
1. How many articles account for 90 / 95 / 99 percent of all page views over a certain period? Which articles are these? (Live articles only, excluding WP:xx, Spezial:xxx, etc.)
2. What share of page views goes to the top 5% / top 10% of our pages?
3. A list of pages with fewer than x views per year (x = e.g. 12, 25, 100).
*Reasoning / Rationale*:
I am limiting myself to the German Wikipedia, assuming that the situation is similar in other language editions.
The German Wikipedia now has more than 2.2 million articles. Keeping up the quality of that many articles creates an enormous maintenance workload, and we clearly have too few people to do that work.
This means that we have to focus our maintenance efforts on the articles that really matter, and there are many more of those than the list of 1000 we get from the Top-Views stats. The reports suggested above would give us exactly the information required to have an informed discussion about the articles we should focus on. The report under #3 would also give us a means to identify articles that we could either delete or clearly label as 'out of maintenance'.
We know that certain articles are 'en vogue' for a short period because they relate to current news topics. The reports must therefore cover a longer period (at least a year, maybe two) to identify the articles that are really the long-term favourites.
Best regards
Peter
my user page: Wikipeter-HH https://de.wikipedia.org/wiki/Benutzer:Wikipeter-HH
Hello,
Several things come to mind:
Top views provides much of this info, digested in a way that should make it easy to calculate what you want. It gets its data from the pageview API and does some useful filtering: https://tools.wmflabs.org/topviews/?project=de.wikipedia.org&platform=al...
You probably know this, but let me point out that there are pageview dumps from which you could calculate 1, 2 and 3. We understand that your request is perhaps precisely about not having to do this calculation yourself for dewiki, but in case you do not know: all detailed pageview data for a project is available: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Pageviews#Di...
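For illustration, here is a rough Python sketch of how reports 1, 2 and 3 could be computed once the dump data is in hand. It assumes the dumps have already been parsed into (title, views) pairs for the period of interest; the parsing itself, the function names and the list of excluded prefixes are all made up for this example and are not part of any Wikimedia tool. Note also that a page with no recorded views would presumably not show up in such data at all, so report 3 can only list pages that were viewed at least once.

```python
from collections import Counter

# Illustrative namespace prefixes only; NOT a complete list for dewiki.
EXCLUDED_PREFIXES = ("Wikipedia:", "WP:", "Spezial:", "Benutzer:", "Diskussion:")


def total_views(rows):
    """Sum (title, views) rows per title, skipping non-article namespaces."""
    totals = Counter()
    for title, views in rows:
        if not title.startswith(EXCLUDED_PREFIXES):
            totals[title] += views
    return totals


def articles_for_share(totals, share):
    """Report 1: most-viewed articles that together reach `share` of all views."""
    target = share * sum(totals.values())
    running, picked = 0, []
    for title, views in totals.most_common():
        if running >= target:
            break
        picked.append(title)
        running += views
    return picked


def share_of_top(totals, fraction):
    """Report 2: share of all views going to the top `fraction` of articles."""
    ranked = [views for _, views in totals.most_common()]
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


def below_threshold(totals, x):
    """Report 3: articles with fewer than `x` views in the period."""
    return sorted(title for title, views in totals.items() if views < x)


# Tiny made-up sample; real input would be a year or more of dump data.
sample = [("A", 50), ("B", 30), ("C", 12), ("D", 5), ("E", 3), ("Spezial:Suche", 500)]
totals = total_views(sample)
print(len(articles_for_share(totals, 0.90)), "articles cover 90% of views")
print(share_of_top(totals, 0.20), "of views go to the top 20% of articles")
print(below_threshold(totals, 12))
```

Summing over a full year (or two) before ranking is what smooths out the short-lived news spikes mentioned in the request. This sketch does nothing about unmarked bot traffic, which would need separate filtering.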
If you think top views is not sufficient for your use case, please do file a phab ticket with your request: https://phabricator.wikimedia.org/maniphest/task/edit/?title=Wikistats%20New...
Finally, I think any pageview-focused effort needs to take into account the very significant number of bots that are not marked as such in our system (the top 1000 list is an example of data skewed by bot traffic). Top views does some "crowdsourced" filtering, and again, I think it will be very useful for your use case.
Thanks,
Nuria
On Thu, Feb 7, 2019 at 1:47 AM WikiPeter-HH WikiPeter-HH@online.de wrote: