As I mentioned in an e-mail last week, we are in the final stages of a large literature
review of scholarly research on Wikipedia. We have extracted and organized most of the
data and have published it to a Semantic MediaWiki wiki at http://wikilit.refarata.com
[thanks to emijpr for inspiring the structure through WikiPapers]. As I indicated, this is
sort of an incubator site to finish up our data to prepare for publication, after which we
intend to export the data permanently to other sites like AcaWiki and WikiPapers.
Thanks for your responses to my inquiries; we have included abstracts, and the data is
dual-licensed as CC-BY-SA and ODC-ODbL (http://opendatacommons.org/licenses/odbl/summary/)
[thanks, Dario, for the links!], except for copyrighted abstracts. We have submitted a
related presentation proposal for Wikimania 2012.
We are asking the Wikipedia research community to please help us verify the accuracy of
our data extraction so far. Practically, if you could at least take a look at your own
publications and the publications you know well, that would be great. It's an open
wiki, so please make any corrections directly, even anonymously. (However, if you want us
to acknowledge your contributions, please create a user account and identify yourself on
your user page.) In particular, please help us with the following:
* Please correct any inaccuracies you see, or e-mail us at wikilit(a)okoli.org to notify us.
* Please point out any peer-reviewed journal articles or PhD dissertations we have missed
that were published before July 2011; we will certainly add these. (After that, the
Wikimedia Research Newsletter began.)
* Please point out any other scholarly studies (especially conference articles and
significant non-peer-reviewed work) that you feel should definitely be analyzed in detail.
Although we have listed 1,500 conference papers, our limited time and
resources only permit us to analyze a fraction of them in detail. So, please help us
highlight the most important ones that we have not analyzed in detail, with a brief
explanation of why they are particularly important.
* Please add any published scholarly studies about Wikipedia that we have left out,
regardless of peer review or publication type! Please add your own work! Our restrictions
on what we include are purely pragmatic due to time and resource limitations. However, if
you add a new article, please be sure to *complete as many input fields as possible*,
since we will generally exclude any article with incomplete data in our final analysis.
* Please suggest any data analysis or visualizations you would like to see as we
synthesize the data.
* Please give any other feedback or suggestion that can help us make this dataset more
useful to researchers! Send comments to wikilit(a)okoli.org.
The data is publicly available, but this is a beta release and there are probably a lot of
errors. We hope to have a stable and very clean dataset within a couple of months, both from
community help and from our own internal quality control processes; we'll make another
announcement when we feel the dataset has reached "featured" quality. In
particular, please wait a bit before exporting the data to other research collection
websites and wikis until it is in a cleaner state; by then, we'll help make it
available in as many export formats as practical.
For the WikiLit project team: Arto Lanamäki, Mohamad Mehdi, Mostafa Mesgari, Finn Årup
Nielsen, Chitu Okoli