Hello!
I don't think I have the authority to grant this approval as stated. We can't make this data public, but if you find a research sponsor at the Wikimedia Foundation, we can grant you access to private data internally if you sign an NDA.
Please see https://www.mediawiki.org/wiki/Wikimedia_Research/Formal_collaborations and https://meta.wikimedia.org/wiki/Research:FAQ#collaborations for more information on how to contact the WMF Research team and propose a formal collaboration.
I am also CCing the wiki-research-l mailing list.
Good luck! -Andrew Otto
On Mon, Sep 26, 2022 at 3:27 AM Thiago Freitas <thiago@iiia.csic.es> wrote:
Dear Andrew Otto, How are you?
I'm Thiago Freitas, a PhD student from the Artificial Intelligence Research Institute (IIIA-CSIC) in Spain. We are working on detecting hate speech in online communities and we are interested in the Wikipedia use case, specifically the Revision Deletion data, which is not publicly available.
I am writing this email to "coordinate obtaining a comment of approval on this task from the approving party" as described in the required task in Phabricator. We have already published work on detecting norm violations on Wikipedia (https://www.ifaamas.org/Proceedings/aamas2022/pdfs/p427.pdf and bit.ly/3t08QCg) using data available online. We are now looking to further improve the quality of our research on hate speech detection with this additional dataset. We will use this data to build different machine learning models and run experiments with several approaches. I would be glad to provide further information about our work. Thank you so much for your attention.
Best regards, Thiago Freitas