Maurice, the data per-article is available starting in October: http://dumps.wikimedia.org/other/pageviews/2015/2015-10/
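
For anyone who wants counts for specific pages, here is a minimal sketch of filtering one of those hourly files. It assumes each line has four space-separated fields (project code, page title, view count, response size); the function name and the sample lines are made up for illustration and are not real counts.

```python
from typing import Iterable, Iterator


def article_views(lines: Iterable[str], project: str, title: str) -> Iterator[int]:
    """Yield the view count of every line matching the project code and title."""
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) != 4:
            continue  # skip lines that do not fit the assumed four-field layout
        domain, page, count, _size = parts
        if domain == project and page == title:
            yield int(count)


# Made-up sample lines in the assumed format, not real data.
sample = [
    "nl Amsterdam 120 0",
    "nl.m Amsterdam 85 0",
    "en Amsterdam 300 0",
]

total = sum(article_views(sample, "nl", "Amsterdam"))
print(total)  # 120
```

To run this against a downloaded file, the lines could come from gzip.open(path, "rt", encoding="utf-8") instead of the sample list.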

We have the underlying data, so we could backfill to May 2015, but not beyond that.  The backfill would take quite a long time, however, because there's a lot of data to crunch through, so we've decided not to kick it off until people ask us for it.  Do ask if it would be useful to you.

If you need data going further back, we have the pagecounts-all-sites dataset, which at least includes mobile data as well.  This is available per-article starting in late 2014.

As you can see, this is confusing, which is why I just started a thread on this list about simplifying it.  Please chime in there if you have an opinion about the cleanup.

On Thu, Dec 24, 2015 at 8:41 AM, Maurice Vergeer <m.vergeer@maw.ru.nl> wrote:
Dear Dan,

thanks for this information. I looked at the new data. Are these data also available per subject, or only aggregated for the entire site? I am mainly interested in statistics about specific pages in Wikipedia.
For now the traditional measurements are sufficient, i.e. I assume spider traffic is more or less random on a daily basis, and because I will do time-series analysis, this will have little effect on the findings.
On a similar topic: when a user is logged in to edit a page and refreshes the Wikipedia page, does this register as a visit? Or do only visits by users who are not logged in register as visits?

To conclude, I think Wikipedia data are very interesting for scientific study. I've seen some studies in information science, but in my field - Communication science - very little. I hope to change that with my contribution :-)

Again thanks Dan,
Maurice




On Thu, Dec 24, 2015 at 2:07 PM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
Maurice, if you're looking for recent data, we have a better source: dumps.wikimedia.org/other/pageviews/

This is better than pagecounts-raw because it excludes spider (crawler) traffic and some other automated traffic, and it includes mobile traffic.  We have been slow to announce this and change the pages because it's a confusing change.

On Thu, Dec 24, 2015 at 4:44 AM, Maurice Vergeer <m.vergeer@maw.ru.nl> wrote:
Dear Federico Leva,
thank you very much for the clarification and the quick reply.

Because I want to relate the pagecounts to events taking place in the Netherlands (e.g. televised political debates), knowing the timezone is important.

Again thanks and best regards
Maurice Vergeer

On Thu, Dec 24, 2015 at 10:38 AM, Federico Leva (Nemo) <nemowiki@gmail.com> wrote:
Maurice Vergeer, 24/12/2015 10:16:
I am looking at your pagecounts as archived on
https://dumps.wikimedia.org/other/pagecounts-raw/2015/2015-12/
Can you tell me from what timezone the time stamps originate?

Any and all timestamps in dumps.wikimedia.org are in UTC. Apparently this is not as obvious as generally thought, so I've added a note: https://wikitech.wikimedia.org/w/index.php?title=Analytics%2FData%2FPagecounts-raw&type=revision&diff=240313&oldid=184255
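
Since the timestamps are UTC, lining them up with events in the Netherlands means converting to local time. Below is a rough sketch of that conversion, assuming file names of the form pagecounts-YYYYMMDD-HHMMSS.gz and Python 3.9 or later for zoneinfo. The helper name is hypothetical, and whether the stamp marks the start or the end of the counted hour is not stated in this thread, so that should be checked against the dataset documentation.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_time(filename: str, tz: str = "Europe/Amsterdam") -> datetime:
    """Convert the UTC stamp in a dump file name to the given local timezone."""
    stem = filename.split(".")[0]        # e.g. "pagecounts-20151224-200000"
    _prefix, day, clock = stem.split("-")
    utc = datetime.strptime(day + clock, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return utc.astimezone(ZoneInfo(tz))


winter = local_time("pagecounts-20151224-200000.gz")
summer = local_time("pagecounts-20150701-200000.gz")
print(winter.isoformat())  # 2015-12-24T21:00:00+01:00
print(summer.isoformat())  # 2015-07-01T22:00:00+02:00
```

Using a named zone rather than a fixed offset keeps the summer-time shift correct across the year.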

Nemo

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics



--
________________________________________________
Maurice Vergeer
To contact me, see http://mauricevergeer.nl/node/5
To see my publications, see http://mauricevergeer.nl/node/1
________________________________________________
