Hi all,
I am trying to do some analysis of the data from the "Why the world reads Wikipedia" project, downloaded from here: https://figshare.com/articles/Why_the_World_Reads_Wikipedia/7579937/1
Unfortunately, it looks like the page titles were written incorrectly in the 'responses' csv files, with non-ASCII characters written as ?. This makes it impossible to examine anything other than enwiki.
From what I can gather, the data is actually missing: https://pastebin.com/bTh4BUV9
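To illustrate why this kind of damage cannot be undone after the fact, here is a minimal sketch. It assumes the titles were written with an ASCII encoder that substitutes ? for anything it cannot represent, which is only a guess at the cause. The titles and the helper name lossy_ascii are made up for illustration and do not come from the dataset.

```python
def lossy_ascii(title: str) -> str:
    """Encode to ASCII, replacing each unencodable character with '?'.

    Hypothetical helper: it mimics one plausible way the export could
    have gone wrong, not the actual pipeline that produced the files.
    """
    return title.encode("ascii", errors="replace").decode("ascii")


# Illustrative titles only, not rows from the dataset.
examples = ["Paris", "Zürich", "東京", "大阪"]

for title in examples:
    print(repr(title), "->", repr(lossy_ascii(title)))

# Two different titles collapse to the same string, so the original
# cannot be reconstructed from the damaged value alone.
collision = lossy_ascii("東京") == lossy_ascii("大阪")
print("distinct titles collide:", collision)
```

Because distinct titles map to the same run of ? characters, a re-export from the source data is needed; no decoding step applied to the csv files can restore them.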
Is there anywhere where I can get access to a corrected version of this dataset?
Regards, Dakota Killpack
Hi Dakota, Thanks for notifying us of this issue. I've uploaded a fixed dataset (the responses.zip part). The order and all other details should be the same, except that the titles should now be fixed. Let me know if you find any other issues with the data.
https://figshare.com/articles/Why_the_World_Reads_Wikipedia/7579937
Best, Isaac
On Sat, Aug 17, 2019 at 2:45 AM Dakota Killpack dakota@predata.com wrote:
Hi all,
I am trying to do some analysis of the data from the "Why the world reads Wikipedia" project, downloaded from here: https://figshare.com/articles/Why_the_World_Reads_Wikipedia/7579937/1
Unfortunately, it looks like the page titles were written incorrectly in the 'responses' csv files, with non-ASCII characters written as ?. This makes it impossible to examine anything other than enwiki.
From what I can gather, the data is actually missing: https://pastebin.com/bTh4BUV9
Is there anywhere where I can get access to a corrected version of this dataset?
Regards, Dakota Killpack
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l