Hi Jörg, :]

Do you mean the 250K most viewed articles in de.wikipedia.org?

If so, I think you can indeed get that from the dumps. The 2016 hourly pageview stats, broken down by article for all wikis, are here: https://dumps.wikimedia.org/other/pageviews/2016/

Note that the wiki codes (first column) you're interested in are de, de.m and de.zero (desktop, mobile web and Wikipedia Zero), so you'll want to add all three up to cover all channels.
The third column holds the number of pageviews you're after.
Also, this data set excludes traffic that the pageview definition recognizes as bots.
As the files are hourly and each one contains data for all wikis, you'll need to filter for the German codes and sum the counts per article over the whole year.
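
In case it helps, here is a minimal Python sketch of that filtering and aggregation. It is only an illustration under a few assumptions: that you have already downloaded the hourly files locally, that they are gzip-compressed text with whitespace-separated columns, and that the second column is the article title. The names add_counts and top_articles are made up for this example.

```python
import gzip
from collections import Counter

# Wiki codes for de.wikipedia.org mentioned above.
WIKI_CODES = {"de", "de.m", "de.zero"}


def add_counts(lines, totals, codes=WIKI_CODES):
    """Add the pageviews found in `lines` (one hourly file) to `totals`."""
    for line in lines:
        fields = line.split()
        # Column 1: wiki code. Column 2: article title (assumed).
        # Column 3: pageviews for that hour.
        if len(fields) < 3 or fields[0] not in codes:
            continue
        try:
            views = int(fields[2])
        except ValueError:
            continue  # skip malformed lines instead of aborting the run
        totals[fields[1]] += views
    return totals


def top_articles(paths, n=250_000):
    """Sum pageviews per article over all files in `paths`, return the top n."""
    totals = Counter()
    for path in paths:
        # Assumption: each hourly file is gzip-compressed UTF-8 text.
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            add_counts(fh, totals)
    return totals.most_common(n)
```

You would call it with the list of files you saved, for example top_articles(sorted(glob.glob("downloads/*.gz"))), where "downloads" is just a placeholder for your local directory. A year of hourly files is a lot of data, so expect this to run for a while.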


On Mon, Mar 6, 2017 at 2:59 AM, Jörg Jung <joerg.jung@retevastum.de> wrote:
Ladies, gents,

for a project i plan i'd need the following data:

Top 250K sites for 2016 in project de.wikipedia.org, user-access.

I only need the name of the site and the corresponding number of
user-accesses (all channels) for 2016 (sum over the year).

As far as i can see i can't get that data via REST or by aggregating dumps.

So i'd like to ask here if someone would like to help out.

Thanx, cheers, JJ

Jörg Jung, Dipl. Inf. (FH)
Hasendriesch 2
D-53639 Königswinter
E-Mail:     joerg.jung@retevastum.de
Web:        www.retevastum.de

Analytics mailing list

Marcel Ruiz Forns
Analytics Developer
Wikimedia Foundation