I've been using the monthly page view summaries from pagecounts-ez. Now on
https://dumps.wikimedia.org/other/pagecounts-ez/ it says:
"NOTE: This dataset has had some problems and we are no longer generating new data,
since September 2020. We are phasing it out in favor of Pageviews Complete... When
it's finished we will announce it widely and explain how to migrate."
Are the announcement and explanation available somewhere? I'm having problems because:
1. The "totals" files, such as
https://dumps.wikimedia.org/other/pagecounts-ez/merged/pagecounts-2020-08-v…,
which are on the order of 500 MB per month, seem to have no equivalents in the new Pageviews
Complete dump archives. The monthly files at
https://dumps.wikimedia.org/other/pageview_complete/monthly/2020/2020-08/ are 10x larger,
and I can't find any description of what the "automated", "user",
and "spider" files represent, although I can guess.
2. If I download (say)
https://dumps.wikimedia.org/other/pageview_complete/monthly/2020/2020-08/pa…,
and peek at the file using bzless, it seems to contain lots of binary characters: it's
not clear to me what the format is, or how to decode it. Is there any information online
to help me?
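To make the second point concrete, here is a minimal Python sketch of one way to look at the first few decompressed lines without a pager. The function name peek_bz2 and the file name in the usage note are hypothetical, and UTF-8 is only an assumption about the encoding; the sketch assumes nothing about what the fields mean.

```python
import bz2
from itertools import islice


def peek_bz2(path, n=5, encoding="utf-8"):
    """Return the first n decompressed lines of a .bz2 file as repr() strings."""
    # Read raw bytes so that undecodable sequences cannot raise mid-read.
    with bz2.open(path, "rb") as fh:
        raw_lines = list(islice(fh, n))
    # errors="replace" substitutes U+FFFD for invalid bytes (the encoding is
    # an assumption); repr() makes control characters visible as escapes.
    return [
        repr(line.decode(encoding, errors="replace").rstrip("\n"))
        for line in raw_lines
    ]
```

Calling peek_bz2("some-monthly-file.bz2") on a downloaded file (placeholder name) and printing each returned string should show whether the "binary" characters are control characters, non-ASCII page titles, or something else.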
Thanks for any pointers that might help.