Thanks for this, Erik. This can be helpful for a variety of projects, including the next steps for this project.


On Wednesday, July 11, 2018, Erik Zachte wrote:
 Today I released two new json files [2][4].
Both complement visualization 'Wikipedia Views Visualized' [1] (aka
WiViVi), but both can be useful in other contexts as well.
1) File 'demographics_from_world_bank_for_wikimedia.json' [2] resulted from
harvesting World Bank API files.
It contains yearly figures for four metrics (more could be added):
- population counts,
- percentage internet users,
- percentage mobile subscriptions,
- GDP per capita.
The following static demographics charts on meta are also based on these
metrics: [3]
2) File 'datamaps-data.json' [4] contains the equivalent of 3 rather
complex (*) csv files which feed WiViVi. This brings together demographics
data and pageviews (by country, by region, and by language), and also adds
additional meta info. This json file is meant for external use, as it's
much easier to parse than the 3 csv files WiViVi uses itself [5].
(*) complex, as the csv files use a hierarchy based on nested delimiters.
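To illustrate why a hierarchy built on nested delimiters is harder to parse than json, here is a minimal Python sketch. The delimiters (",", "|", ":"), the field order, and the sample record are all hypothetical; the actual csv syntax is the one described in [5] and may differ.

```python
def parse_nested(text, delimiters=(",", "|", ":")):
    """Recursively split text, using one delimiter per nesting level.

    The delimiter set is an assumption made for this sketch only.
    """
    if not delimiters:
        return text
    head, rest = delimiters[0], delimiters[1:]
    if head not in text:
        # This level is absent in this field; try the next level down.
        return parse_nested(text, rest)
    return [parse_nested(part, rest) for part in text.split(head)]


# Made-up record: country code, per-language shares, region.
record = parse_nested("NL,en:60|nl:35,Europe")
print(record)
```

With json, the same structure arrives already nested, so a single json.load call is enough and no per-level splitting is needed.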
World Bank files have different formats (some csv, some json) and use a
variety of indexes (some use ISO 3166-1 alpha-2 codes, others ..-alpha-3).
Script 1) first does normalization, then data are aggregated, filtered,
Json file 1) replaces two csv files which up to now were filled from
Wikipedia pages [6][7].
Also, although Wikipedia lists nowadays use World Bank data as well, this is
not done consistently; see [8][9].
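The normalization step mentioned above (World Bank files indexed by either ISO 3166-1 alpha-2 or alpha-3 codes) could look roughly like the Python sketch below. This is not the actual script: the function names, the metric keys, and the sample figures are made up for illustration, and the lookup table is only a tiny subset of the real code list.

```python
# Small illustrative subset of ISO 3166-1 alpha-3 -> alpha-2 mappings.
ALPHA3_TO_ALPHA2 = {"NLD": "NL", "USA": "US", "IND": "IN"}


def normalize_code(code):
    """Return the alpha-2 code, or None if the code is not recognized."""
    code = code.strip().upper()
    if len(code) == 2:
        return code
    if len(code) == 3:
        return ALPHA3_TO_ALPHA2.get(code)
    return None


def merge_by_country(sources):
    """Merge metric rows from several sources, keyed by alpha-2 code."""
    merged = {}
    for rows in sources:
        for code, metrics in rows.items():
            key = normalize_code(code)
            if key is None:
                continue  # skip unknown codes, e.g. regional aggregates
            merged.setdefault(key, {}).update(metrics)
    return merged


# Made-up sample values, one source per index style.
population = {"NLD": {"population": 17_000_000}}
internet = {"nl": {"internet_pct": 90.0}}
merged = merge_by_country([population, internet])
print(merged)
```

Normalizing every source to a single key before merging keeps the later aggregation and filtering independent of the format each file arrived in.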
[1] Viz:
[2] Json:
[3] Charts:
[4] Json:
[5] Syntax:
[6] Article:
[7] Article:
[8] Talk page: section 'Wikipedia vs Worldbank population counts'
[9] Talk page: section 'Wikipedia vs Worldbank internet percentages'


Leila Zia
Senior Research Scientist
Wikimedia Foundation