We have the data to backfill as far back as May 2015, but no further.
The backfill would take quite a long time, however, because there's a lot
of data to crunch through, so we won't kick it off until people ask us for
it. Please do ask if that would be useful.
If you need data going further back, we have the pagecounts-all-sites
dataset, which at least includes mobile data as well. This is available
per-article starting in late 2014.
As you can see, this is confusing, which is why I just started a thread on
this list about simplifying it. Please chime in there if you have an
opinion about the cleanup.
On Thu, Dec 24, 2015 at 8:41 AM, Maurice Vergeer <m.vergeer(a)maw.ru.nl>
wrote:
Dear Dan,
thanks for this information. I looked at the new data. Are these data
also available per subject or only aggregated for the entire site? I am
mainly interested in statistics about specific pages in Wikipedia.
For now the traditional measurements are sufficient. I.e., I assume spider
traffic is more or less random on a daily basis, and because I will do
time-series analysis, this will have little effect on the findings.
On a similar topic, when a user is logged in to edit a page and refreshes
the Wikipedia page, does this register as a visit? Or do only visits by
users who are not logged in register as a visit?
To conclude, I think Wikipedia data are very interesting for scientific
study. I've seen some studies in information science, but in my field -
Communication science - very little. I hope to change that with my
contribution :-)
Again thanks Dan,
Maurice
On Thu, Dec 24, 2015 at 2:07 PM, Dan Andreescu <dandreescu(a)wikimedia.org>
wrote:
Maurice, if you're looking for recent data, we have a better source:
dumps.wikimedia.org/other/pageviews/
This is better than pagecounts-raw because it excludes spider (crawler)
traffic and some other automated traffic, and it includes mobile traffic.
We have been slow to announce this and change the pages because it's a
confusing change.
On Thu, Dec 24, 2015 at 4:44 AM, Maurice Vergeer <m.vergeer(a)maw.ru.nl>
wrote:
Dear Federico Leva,
thank you very much for the clarification and the quick reply.
Because I want to relate the pagecounts to events taking place in the
Netherlands (e.g. televised political debates), knowing the timezone is
important.
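Since the reply quoted below states that all dump timestamps are in UTC,
lining them up with Dutch events means converting to local time. Here is a
minimal sketch of that conversion, not a definitive recipe. It assumes the
timestamp has the form YYYYMMDD-HHMMSS (as it seems to appear in the dump
file names) and that Europe/Amsterdam is the zone of interest; the helper
name utc_stamp_to_local is made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# Assumption: events in the Netherlands follow the Europe/Amsterdam zone.
AMSTERDAM = ZoneInfo("Europe/Amsterdam")


def utc_stamp_to_local(stamp: str, tz: ZoneInfo = AMSTERDAM) -> datetime:
    """Parse a 'YYYYMMDD-HHMMSS' UTC stamp (assumed format) and convert it to tz."""
    naive = datetime.strptime(stamp, "%Y%m%d-%H%M%S")
    # Mark the parsed value as UTC first, then convert; this handles
    # daylight saving time (CET in winter, CEST in summer) automatically.
    return naive.replace(tzinfo=timezone.utc).astimezone(tz)


if __name__ == "__main__":
    print(utc_stamp_to_local("20151224-100000").isoformat())  # 2015-12-24T11:00:00+01:00
    print(utc_stamp_to_local("20150701-100000").isoformat())  # 2015-07-01T12:00:00+02:00
```

For a time-series analysis, the converted timestamps can then be compared
directly with the local broadcast times of the events.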
Again thanks and best regards
Maurice Vergeer
On Thu, Dec 24, 2015 at 10:38 AM, Federico Leva (Nemo) <
nemowiki(a)gmail.com> wrote:
> Maurice Vergeer, 24/12/2015 10:16:
>
>> I am looking at your pagecounts as archived on
>> https://dumps.wikimedia.org/other/pagecounts-raw/2015/2015-12/
>> Can you tell me from what timezone the time stamps originate?
>>
>
> Any and all timestamps in dumps.wikimedia.org are in UTC. Apparently
> this is not as obvious as generally thought, so I've added a note:
> https://wikitech.wikimedia.org/w/index.php?title=Analytics%2FData%2FPagecou…
>
> Nemo
>
> _______________________________________________
> Analytics mailing list
> Analytics(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
--
________________________________________________
Maurice Vergeer
To contact me, see
http://mauricevergeer.nl/node/5
To see my publications, see
http://mauricevergeer.nl/node/1
________________________________________________