Hi Andrew, Thanks for adding wiki-research-l.
Sorry for the delay in getting back to you. I have to admit that there
may be ways to access this data that I'm not aware of. The one way that I
know of is for WMF to give you access to
. For that, we need a Formal Collaboration to be set up, as Andrew shared
earlier. The Research team's capacity is unfortunately limited at the
moment by multiple existing priorities that we need to remain focused
on. As a result, we cannot add a Formal Collaboration to support you.
Although we unfortunately won't be able to support you now, I hope that
through efforts such as investments in differential privacy we can
support you and other researchers with access to some of the data that is
I'm sorry that I can't support you at this time and I hope you understand.
Head of Research
On Mon, Oct 3, 2022 at 8:48 AM Andrew Otto <otto(a)wikimedia.org> wrote:
I don't think I have the authority to grant this approval as stated. We
can't make this data public, but if you find a research sponsor at the
Wikimedia Foundation, we can grant you access to private data internally if
you sign an NDA.
information on how to contact the WMF Research team and propose a formal
I am also CCing the wiki-research-l mailing list.
On Mon, Sep 26, 2022 at 3:27 AM Thiago Freitas <thiago(a)iiia.csic.es> wrote:
Dear Andrew Otto,
How are you?
I'm Thiago Freitas, a PhD student from the Artificial Intelligence
Research Institute (IIIA-CSIC) in Spain. We are working on detecting
hate speech in online communities and we are interested in the Wikipedia
use case, specifically the Revision Deletion data, which is not publicly
available.
I am writing this email to "coordinate obtaining a comment of approval
on this task from the approving party" as described in the required task
in Phabricator. We have already published work on detecting norm
violations on Wikipedia
(bit.ly/3t08QCg) with data available online. Now we are looking to
further improve the quality of our research on hate speech detection
with this additional dataset. We will use this data to build different
machine learning models and to experiment with several approaches. I would
be glad to provide further information about our work. Thank you so much
for your attention.