Hello Noah,
Thank you for reaching out to us :)
We have not backfilled the "top pageview per country" data because, to
protect the privacy of our users, we use a filtering mechanism that
removes pages seen by fewer than 1000 actors a day, and the data that
allows us to do so is kept for only 90 days.
I have just created a task on our Phabricator board for us to investigate
other filtering methods that could allow us to release historical data,
even if less detailed (https://phabricator.wikimedia.org/T299627).
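As a rough illustration of the filtering described above (a minimal
sketch, not the actual production pipeline), the Python below drops any
page seen by fewer than 1000 distinct actors in a day. The function name,
the (country, page, actor_id) input shape, and the choice to count actors
per country and page are all assumptions made for the example. It also
shows why the 90-day retention matters: the filter needs per-actor data,
so it cannot be recomputed once that data is gone.

```python
from collections import defaultdict

# Threshold taken from the message above; everything else is hypothetical.
MIN_ACTORS_PER_DAY = 1000


def filter_top_pages(view_events, min_actors=MIN_ACTORS_PER_DAY):
    """Return {country: [(page, views), ...]} for one day of events.

    view_events is an iterable of (country, page, actor_id) tuples, an
    assumed input shape. Pages with fewer than min_actors distinct
    actors are removed before ranking.
    """
    actors = defaultdict(set)  # (country, page) -> distinct actor ids
    views = defaultdict(int)   # (country, page) -> total views

    for country, page, actor_id in view_events:
        key = (country, page)
        actors[key].add(actor_id)
        views[key] += 1

    result = defaultdict(list)
    for (country, page), seen in actors.items():
        # Privacy filter: keep only pages seen by enough distinct actors.
        if len(seen) >= min_actors:
            result[country].append((page, views[(country, page)]))

    # Rank by views (descending), breaking ties by page title.
    for country in result:
        result[country].sort(key=lambda item: (-item[1], item[0]))
    return dict(result)
```

With a small threshold for demonstration, filter_top_pages(events,
min_actors=2) keeps only the pages viewed by at least two distinct actors.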
Sorry we are not able to help, and best of luck with your studies :)
Joseph for the Data Engineering (ex-Analytics) team
On Tue, Jan 18, 2022 at 1:33 AM Noah Brunken Syrkis <nobr(a)itu.dk> wrote:
Hello,
I noticed that the public API for daily top viewed pages per country[1]
only goes back to Jan 1st, 2021. Could this be backfilled from other
datasets to 2015, without too much effort on your part? The research team
encouraged me to ask here when I spoke with them about my need for the
data. I'm a data science student at the IT University of Copenhagen doing a
thesis on predicting country-level human value survey responses[2] based on
the top read Wikipedia pages in the given country.
Thanks!
Noah
[1] https://wikimedia.org/api/rest_v1/#/Pageviews%20data/get_metrics_pageviews_…
[2] http://www.europeansocialsurvey.org/downloadwizard/
_______________________________________________
Analytics mailing list -- analytics(a)lists.wikimedia.org
To unsubscribe send an email to analytics-leave(a)lists.wikimedia.org
--
Joseph Allemandou (joal) (he / him)
Staff Data Engineer
Wikimedia Foundation