Today we are announcing the first results of the collaboration between Wikimedia Research and Jigsaw on modeling personal attacks and other forms of harassment on English Wikipedia. We have released a corpus of 95M user and article talk page comments, as well as over 1M human labels produced by 4000 crowd-workers for a set of 100k comments. Documentation on our methodology and future work can be found in our paper, Ex Machina: Personal Attacks Seen at Scale (to appear at WWW2017), and on our project page on meta. If you are interested in contributing to the project, please get in touch via the project talk page. Another great way to get involved is to label a set of comments in the Wikilabels discussion quality campaign.