I’m glad to announce the release of an open-licensed corpus with 1.5M records from the Article Feedback v5 pilot. 

http://dx.doi.org/10.6084/m9.figshare.1277784

Thanks to everyone who helped make this happen, Fabrice in particular for shepherding this through.

Dario

This dataset contains the entire corpus of feedback submitted on the English, French and German Wikipedias during the Article Feedback v5 (AFT) pilot. [1] The Wikimedia Foundation ran the pilot for a year, from March 2013 to March 2014. During the pilot, 1,549,842 feedback messages were collected across the three languages.

All feedback messages and their metadata (as described in this schema [2]) are available in this dataset, with the exception of messages that had been oversighted and/or deleted by the end of the pilot.
The corpus is released [3] under the following licenses:

• CC BY SA 3.0 for feedback messages
• CC0 for the associated metadata

Results from the pilot are discussed in: Halfaker, A., Keyes, O. and Taraborelli, D. (2013). Making peripheral participation legitimate: Reader engagement experiments in Wikipedia. CSCW ’13: Proceedings of the 2013 Conference on Computer Supported Cooperative Work. [4][5]

[1] https://www.mediawiki.org/wiki/Article_feedback/Version_5
[2] https://www.mediawiki.org/wiki/Article_feedback/Version_5/Technical_Design_Schema#aft_feedback
[3] https://wikimediafoundation.org/wiki/Feedback_data#Article_Feedback
[4] http://dx.doi.org/10.1145/2441776.2441872
[5] http://nitens.org/docs/cscw13.pdf