--
James Gaunt (he/him)
Communications and Project Coordinator – Wikimedia Australia
james.gaunt@wikimedia.org.au
+61 412 401 512
www.wikimedia.org.au

Hi all!

Looks like Isaac and I had the same thought here. I also spent ~45 minutes hacking together a script that collects the top (up to) 500 pages for a given country from 1 December 2021 through 30 November 2022 using the WMF Pageviews API. All of the datasets are relatively small and available for download and free use. Code for generating these lists is available on the WMF GitLab instance, and runs in ~3.5 hours on a normal MacBook, if anyone wants to download/fork it and try it on their own.

There are only 135 ISO codes included in this set of files — I removed codes that WMF doesn't release data about, or that have no data reported for the 365-day period in question.

Let me know if you have any questions, and hope this helps!

Hal

_______________________________________________

On Thu, Dec 8, 2022 at 8:18 AM Isaac Johnson <isaac@wikimedia.org> wrote:

Romaine,

Building on Chico's comment, I put together an example notebook showing how to estimate such a list from the public data, in case you're curious (I calculated it for January through November for Nigeria in the example). It's not a perfect approach, in that it makes some assumptions and uses incomplete data, but it is probably close to what the actual list would be (details in the link).
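The collection step both of us describe boils down to querying the Pageviews API's top-per-country endpoint once per day and summing the results. A minimal sketch follows — the endpoint path is taken from the public AQS Pageviews API docs, `views_ceil` is that API's bucketed (rounded-up) view count, and the two-day payload at the bottom is fabricated purely for illustration:

```python
# Sketch: pull daily "top-per-country" pageview lists from the WMF AQS
# REST API and aggregate them over a date range. Field names follow the
# public Pageviews API docs; adjust if the response shape differs.
from collections import Counter
from datetime import date

API = "https://wikimedia.org/api/rest_v1/metrics/pageviews/top-per-country"

def daily_url(country: str, day: date, access: str = "all-access") -> str:
    """URL for one day's top pages in one country (ISO 3166-1 alpha-2 code)."""
    return f"{API}/{country}/{access}/{day:%Y/%m/%d}"

def aggregate(daily_payloads, top_n: int = 500) -> list:
    """Sum each (project, article) pair's views across all daily payloads
    and return the top_n pages overall. The API reports views_ceil (views
    rounded up to a bucket), so the totals are approximate."""
    totals = Counter()
    for payload in daily_payloads:
        for item in payload.get("items", []):
            for art in item.get("articles", []):
                totals[(art["project"], art["article"])] += art["views_ceil"]
    return totals.most_common(top_n)

# Fabricated two-day payload, so the aggregation runs without network access:
fake_days = [
    {"items": [{"articles": [
        {"project": "en.wikipedia", "article": "Main_Page", "views_ceil": 1200},
        {"project": "en.wikipedia", "article": "Nigeria", "views_ceil": 800},
    ]}]},
    {"items": [{"articles": [
        {"project": "en.wikipedia", "article": "Nigeria", "views_ceil": 900},
    ]}]},
]
print(daily_url("AU", date(2022, 11, 30)))
print(aggregate(fake_days, top_n=2))
```

In a real run you would fetch `daily_url(...)` for each of the 365 days (with a descriptive User-Agent header, per WMF API etiquette) and feed the parsed JSON into `aggregate`.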
You'd likely want to use your knowledge of the region/languages to filter out pages like Special:Search and bot-driven views that slipped through into the data (like Cookie and Cleopatra in the example below).

It makes use of these public Wikimedia resources:

* PAWS infrastructure: https://wikitech.wikimedia.org/wiki/PAWS
* Pageviews API: https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews
* Python mwviews library for interacting with the Pageviews API: https://github.com/mediawiki-utilities/python-mwviews

You can read instructions for how to copy this notebook and run it for other countries here: https://wikitech.wikimedia.org/wiki/PAWS/Getting_started_with_PAWS#Fork

Best,
Isaac

Copying the top-100 output for Nigeria below for ease of access:

On Wed, Dec 7, 2022 at 7:41 PM Romaine Wiki <romaine.wiki@gmail.com> wrote:

For some languages it is easy, as the language is mainly spoken in one country. (Still, there might be some local languages/dialects that are then not represented in the data.) For other languages it is not easy to get statistics on the most visited pages of a country, as the language is spoken in multiple countries. If, for example, one country has only 3% of the population of another country with the same language, the language-level statistics are heavily biased: the larger country generates so much traffic that the data of the country with the smaller population is invisible. If we have no data for them, we let those unseen communities down.

Romaine

_______________________________________________

On Wed 7 Dec 2022 at 18:21, Jan Ainali <ainali.jan@gmail.com> wrote:

On Swedish Wikipedia we collect it on one page: https://sv.wikipedia.org/wiki/Wikipedia:Mest_visade_artiklar_2022

Doing it per language is much easier than per country, as the data is publicly available.

Best,
Jan Ainali

_______________________________________________

On Wed 7 Dec 2022 at 16:36, Romaine Wiki <romaine.wiki@gmail.com> wrote:

Every year it reaches the headlines: the top 10 or top 100 most visited Google searches of the past year in my country. I have seen this in some other countries too. People are interested, and by making this data public, something positive is said about Google (besides all the negative news about them during the rest of the year).

This is something simple Wikimedia could do too: sharing this kind of data (*by country*) with the world. It would bring Wikipedia closer to the public and create more positive awareness. Or otherwise, making this data available to the local chapters so they can bring positive news about Wikipedia.

Romaine

_______________________________________________
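Romaine's bias point above can be made concrete with toy numbers (all fabricated): when two countries share a language and one is far larger, a language-level top list is effectively the large country's list.

```python
# Toy illustration (fabricated numbers) of the per-language bias:
# a small country's most-read pages vanish behind a much larger
# country that reads the same language edition.
from collections import Counter

# Daily views of one language edition, split by hypothetical country.
big_country   = Counter({"Pop_star": 100_000, "Election": 90_000, "Football": 80_000})
small_country = Counter({"Local_festival": 3_000, "Regional_history": 2_500, "Football": 2_000})

# A language-level top list only ever sees the sum of both.
combined = big_country + small_country
top3 = [page for page, _ in combined.most_common(3)]
print(top3)  # → ['Pop_star', 'Election', 'Football']
# The small country's distinctive pages (Local_festival, Regional_history)
# never surface, which is why per-country data matters.
```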
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/HXWUCYYRLL44LFIPZ6YXHLLDL7H63ZKD/
To unsubscribe send an email to wikimedia-l-leave@lists.wikimedia.org
--
Isaac Johnson (he/him/his) -- Senior Research Scientist -- Wikimedia Foundation