Hi Cristian,

I also do not recall what happened, but maybe those files were removed on purpose because the data was corrupt or had other issues.

Cheers!

On Fri, May 5, 2017 at 9:36 AM, Cristian Consonni <cristian@balist.es> wrote:
Hi,

On 01/05/2017 18:18, Nuria Ruiz wrote:
>> are there issues with using the data from the IA?
> Since that much predates our team record keeping of data issues the
> answer is that we do not know. Maybe someone in this list can chip in
> and we will add this answer to our dataset known issues which can be
> found here:
>
> https://wikitech.wikimedia.org/wiki/Analytics/Archive/Data/Pagecounts-raw#Events_and_known_problems_since_2014-03-01

I should add that there are a handful of files in October 2011[1] that
are incorrect: they are not compressed and appear to be HTML pages
(also, they are 92 KB files instead of ~85 MB).

Again, the files from Internet Archive[2] seem to be OK.
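
For anyone who wants to check their own local copies, here is a minimal
sketch of the test described above. It only looks at the first bytes of
each file: real gzip files start with the two magic bytes 0x1f 0x8b, while
the broken ones should start with HTML markup. The function name
classify_dump and the HTML prefixes it looks for are my own assumptions,
not part of any existing tool.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def classify_dump(path):
    """Return 'gzip', 'html', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(512)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    # Assumption: an HTML page served in place of the dump starts with a
    # doctype or an <html> tag, possibly preceded by whitespace.
    stripped = head.lstrip().lower()
    if stripped.startswith(b"<!doctype html") or stripped.startswith(b"<html"):
        return "html"
    return "unknown"
```

Running classify_dump over the files listed in [1] should flag them as
"html" if they match the description above. Comparing os.path.getsize
against the expected ~85 MB is another cheap sanity check.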

Cristian

[1]: https://dumps.wikimedia.org/other/pagecounts-raw/2011/2011-10/
Specifically, the following:
* pagecounts-20111008-180001.gz
* pagecounts-20111008-190000.gz
* pagecounts-20111008-200000.gz
* pagecounts-20111008-210000.gz
* pagecounts-20111008-220000.gz
[2]: https://archive.org/details/wikipedia_visitor_stats_201110

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics



--
Marcel Ruiz Forns
Analytics Developer
Wikimedia Foundation