Hello!
I don't think I have the authority to grant this approval as stated. We
can't make this data public, but if you find a research sponsor at the
Wikimedia Foundation, we can grant you access to private data internally if
you sign an NDA.
Please see
https://www.mediawiki.org/wiki/Wikimedia_Research/Formal_collaborations and
https://meta.wikimedia.org/wiki/Research:FAQ#collaborations for more
information on how to contact the WMF Research team and propose a formal
collaboration.
I am also CCing the wiki-research-l mailing list.
Good luck!
-Andrew Otto
On Mon, Sep 26, 2022 at 3:27 AM Thiago Freitas <thiago(a)iiia.csic.es> wrote:
Dear Andrew Otto,
How are you?
I'm Thiago Freitas, a PhD student from the Artificial Intelligence
Research Institute (IIIA-CSIC) in Spain. We are working on detecting
hate speech in online communities and we are interested in the Wikipedia
use case, specifically the Revision Deletion data, which is not publicly
available.
I am writing this email to "coordinate obtaining a comment of approval
on this task from the approving party" as described in the required task
in Phabricator. We have already published work on detecting norm
violations on Wikipedia
(https://www.ifaamas.org/Proceedings/aamas2022/pdfs/p427.pdf and
bit.ly/3t08QCg) using data available online. We are now looking to
further improve the quality of our research on hate speech detection
with this additional dataset. We will use this data to build several
machine learning models and experiment with different approaches. I would
be glad to provide further information about our work. Thank you so much
for your attention.
Best regards,
Thiago Freitas