Are there issues with using the data from the IA?
Since that data long predates our team's record keeping of data issues, the answer is that we do not know. Maybe someone on this list can chip in, and we will add the answer to our dataset's known issues, which can be found here:
https://wikitech.wikimedia.org/wiki/Analytics/Archive/Data/Pagecounts-raw#Ev...
On Mon, May 1, 2017 at 7:51 AM, Cristian Consonni cristian@balist.es wrote:
Hi,
I am working on the pagecounts data for a project and I have noticed that the data for the period September 21st - September 30th, 2009 are missing [1].
However, I also found that some data for that period are available from The Internet Archive [2].
I also see that some data are somehow available at https://stats.grok.se (see, for example, the page view data for "Influenza" on en.wiki [3]).
So, my question is: are there issues with using the data from the IA? I checked that, apart from the file pagecounts-20090921-160000.gz, which seems to be incomplete on dumps.wikimedia.org, the rest of the files are the same.
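A comparison of this kind could be sketched as follows. This is only a minimal sketch, assuming both sets of files have already been downloaded into two local directories; the helper names sha256_of and compare_dirs, the directory names in the usage note, and the choice of SHA-256 are illustrative assumptions, not something described in this thread.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_dirs(dir_a, dir_b, pattern="pagecounts-*.gz"):
    """Compare same-named files in two directories by checksum.

    Returns the names found only on one side and the names present
    on both sides whose contents differ.
    """
    files_a = {p.name: p for p in Path(dir_a).glob(pattern)}
    files_b = {p.name: p for p in Path(dir_b).glob(pattern)}
    common = sorted(files_a.keys() & files_b.keys())
    differing = [
        name for name in common
        if sha256_of(files_a[name]) != sha256_of(files_b[name])
    ]
    return {
        "only_in_a": sorted(files_a.keys() - files_b.keys()),
        "only_in_b": sorted(files_b.keys() - files_a.keys()),
        "differing": differing,
    }
```

With hypothetical local copies in ./wikimedia and ./internet_archive, calling compare_dirs("wikimedia", "internet_archive") would list any file whose bytes differ between the two sources. Identical checksums only show that the two copies match each other, not that the underlying data are complete.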
Thanks in advance for your help.
Cristian

[1]: https://dumps.wikimedia.org/other/pagecounts-raw/2009/2009-09/
[2]: https://archive.org/details/wikipedia_visitor_stats_200909
[3]: http://stats.grok.se/en/200909/Influenza
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics