Hi all,
As part of our efforts to better serve the Wikimedia research community, we
are happy to share that we are collaborating with the Security team at WMF
to help prioritize the release of data that can be useful for your
research. The Security team is working to privatize more datasets and
release them publicly, which avoids the need for non-disclosure agreements.
You can learn more
here: https://meta.wikimedia.org/wiki/Differential_privacy.
Over the next 12 months, the Security team plans to release 5 datasets:
- country-language-pageview ongoing (end of 2022)
- country-language-pageview historical (March 2023)
- geo-aggregated grants data back to 2009 (Feb 2023)
- geoeditors monthly (June 2023)
- dataset informed by research community priorities identified in this
  survey (second half of 2023)
The released datasets need to meet certain privacy requirements:
- They cannot include any natural language (e.g. specific search queries
  or deletion logs) so as to avoid the release of personally identifiable
  information;
- They need to be sufficiently large (at least thousands of entries,
  preferably more) so as to reduce noise;
- The data cannot be so sensitive that an individual user will be harmed
  by disclosure of the data (e.g. IP addresses, content containing personally
  identifying information).
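One way to read the size requirement above: differential privacy adds random noise to released counts, and that noise matters less as the true count grows. The sketch below is illustrative only. It assumes the Laplace mechanism, a sensitivity of 1, and an epsilon of 1.0. None of these are stated in this announcement, and the function names are hypothetical, not the Security team's actual pipeline.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random,
                sensitivity: float = 1.0) -> float:
    """Return a count with Laplace noise of scale sensitivity / epsilon (assumed mechanism)."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def expected_relative_error(true_count: int, epsilon: float,
                            sensitivity: float = 1.0) -> float:
    """Expected absolute Laplace noise (equal to its scale) divided by the true count."""
    return (sensitivity / epsilon) / true_count


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    epsilon = 1.0           # hypothetical privacy budget, chosen for illustration
    for count in (10, 1_000, 100_000):
        released = noisy_count(count, epsilon, rng)
        rel = expected_relative_error(count, epsilon)
        print(f"true={count:>7} released={released:>12.2f} expected relative error={rel:.5f}")
```

Under these assumptions the absolute noise stays roughly constant while the relative error shrinks in proportion to the count, which is why small datasets are harder to release usefully.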
We invite you to complete a brief survey
<https://docs.google.com/forms/d/e/1FAIpQLSe_LAt6V2Q1GUf3Z8lnt8uAOZnHTO5rNgF…>
to help us identify and prioritize the types of datasets that you would
find useful for your work. Results of this survey will inform the fifth
dataset, scheduled to be released in late 2023. This survey is conducted
via a third-party service, which may subject it to additional terms. For
more information on privacy and data-handling, see the survey privacy
statement:
https://foundation.wikimedia.org/wiki/Legal:Data_Release_Priorities_Survey_…
The survey will remain open until November 3, 2022. After that time,
members of the Research and Security teams will review the data and report
on the suggestions that were received and how the work will proceed.
If you prefer not to respond via the Google form, you can email your
feedback to us or set up a time to discuss. You can also leave questions
and comments on the Talk page:
https://meta.wikimedia.org/wiki/Differential_privacy
Thanks for your help!
Emily Lescak, WMF Research team
Hal Triedman, WMF Security team
--
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Hi all,
We are excited to see increased awareness about the Wikimedia Research Fund
this year as compared to last year. We are receiving more inquiries from
prospective applicants, the relevant pages are receiving significantly more
views than last year [1], and we are seeing good engagement in
office hours.
We are learning that there are quite a few researchers who have initial
ideas about projects that can benefit from input from existing Wikimedia
research, editor, organizer, or developer communities to strengthen their
proposals. Because we don't have one shared way to match or introduce
people to one another, we are starting to encourage people to start a
project on MetaWiki [2] and consider reaching out to wiki-research-l or
their relevant chapter or user group to seek input or look for
collaborators.
This means that the traffic for emails that come to this list between now
and December 16, the Research Fund's Stage I deadline [3], may increase.
The content of the emails may also ask for your input. Please offer input
to the extent that you have time and interest to help those who wish to
join our research community have a higher chance of success and a welcoming
experience.
The questions of how to welcome newcomers to our community and how to help
them be productive require dedicated attention. We hope that we can pick up
this topic with at least some of you in the coming year. :)
If you have questions or comments, please don't hesitate to reach out.
Best,
Emily
[1]
https://pageviews.wmcloud.org/?project=meta.wikimedia.org&platform=all-acce…
[2] https://meta.wikimedia.org/wiki/Research:New_project
[3]
https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Tech…
--
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Hi everyone,
Wiki Workshop [1] 2023 will be the 10th edition
of Wiki Workshop! \o/ In the spirit of research and experimentation, we
have decided to make some changes for this tenth-edition event. There are
some changes that we know about now, and some that are work in progress.
Below you can learn more about the high-level changes we expect to
implement.
*Online or in-person?*
Based on the feedback that we have gathered from Wiki Workshop attendees
over the past few years, a survey of authors of the recent Wiki Workshops,
as well as data about the geographical and gender diversity of Wiki
Workshop attendees (disclosed optionally as part of the registration form,
and aggregated), *we have decided to offer Wiki Workshop 2023 as a fully
online event*.
Through the authors' survey we also learned that some authors who publish
in Wiki Workshop appreciated the in-person presence of the Wiki Workshop
community as part of the Web Conference [2]
(formerly WWW). *We are exploring options to bring the Wikimedia
researchers who will attend TheWebConf 2023 in-person together while some
of us will be in Austin*. More details on this in early 2023.
*When*
We expect the workshop to take place some time in April-June 2023. We will
announce the exact date no later than the end of February 2023.
*Publishing and proceedings*
When we surveyed Wiki Workshop authors, those who responded were split
50-50 on whether it is important for their workshop submission to be part
of a proceedings. This allowed us to start considering options other than
the Companion Proceedings of WWW (the traditional venue where a subset of
Wiki Workshop papers was published every year).
I'm very excited to share that we have found a new approach for publishing
Wiki Workshop papers that can allow us to experiment with new models and
keep the two groups of authors happy.
For the 2023 edition, *we will continue with the tradition of receiving
paper submissions for Wiki Workshop* (though we may change the submission
format/length for 2023) *and all accepted papers will appear on the
corresponding Wiki Workshop website*, similar to last year's [3]. However,
instead of accepting a subset of the papers to appear in the Proceedings of
WWW, we are working with the Editor-in-Chief of ACM Transactions on the Web
(TWEB) [4] to create a pathway for a subset of the Wiki Workshop papers
(likely after being extended) to be submitted for review to *a special
edition of ACM TWEB*.
There are a lot of details for us to work out to make the TWEB special
edition happen, which means that this year you should expect to receive the
Call for Papers for Wiki Workshop some time in late January to mid-February
2023 (instead of the usual December time frame).
I am very excited about the opportunity for the work of the Wikimedia
research and Wiki Workshop community to be published as part of ACM
Transactions on the Web and I'm very grateful to Ryen White,
Editor-in-Chief of ACM TWEB, for being welcoming in exploring this idea and
offering a special edition space (details tbd).
*Other changes*
There are some other high-level schedule changes that we may make for the
2023 edition. If you would like to stay informed about these changes at a
granular level and over time, you're welcome to subscribe to the
Phabricator task where these changes will be tracked:
https://phabricator.wikimedia.org/T313530 .
We hope to be back with more updates for you in early 2023.
Best,
Leila, Bob and Emily
p.s. Please note that I didn't run the text of this email by Bob and
Emily (cc-ed). We have coordinated and discussed these changes among
ourselves, and they're welcome to add/update as they see fit.
[1] https://wikiworkshop.org/
[2] https://www2023.thewebconf.org/
[3] https://wikiworkshop.org/2022/#papers
[4] https://dl.acm.org/journal/tweb
--
Leila Zia
Head of Research
Wikimedia Foundation
Hello everyone,
The next Research Showcase will be live-streamed Wednesday, October 19, at
9:30 AM PDT/16:30 UTC. Find your local time here
<https://zonestamp.toolforge.org/1666197004>.
YouTube stream: https://www.youtube.com/watch?v=ML-ULyARpU4
Members of the Research team will collect questions on IRC at
#wikimedia-research and YouTube.
This month's presentation is a panel discussion celebrating Wikidata's 10th
birthday!
October 2022 marks the tenth anniversary of the launch of Wikidata (
www.wikidata.org). In ten years, this project has become the largest
community-driven free knowledge graph in the world, enabling a common
knowledge base for Wikimedia projects. The language-independent nature of
Wikidata has greatly improved the maintenance and consistency of knowledge
across Wikipedia language editions, fostering knowledge equity in
Wikimedia. In addition, since Wikidata is a collaborative project that can
be read and edited by humans and machines alike, it is also widely used in
third-party applications delivering knowledge as a service for all. The
Wikimedia Research community has devoted significant effort and resources
to studying the foundations, capabilities, and applications of Wikidata,
from the complex requirements of representing real-world knowledge in a
multilingual environment to the need to assess the quality of data and
sources in Wikidata. To learn more about the state of the art of Wikidata
and research challenges in the era of AI/ML, we will celebrate this tenth
anniversary with a panel that will bring together established
researchers/practitioners in this field.
The panel will be moderated by Denny Vrandečić (WMF) with panelists Lydia
Pintscher (WMDE), Elena Simperl (King's College London), Katherine Thornton
(Yale), and Markus Krötzsch (Technical University of Dresden).
You can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
We hope you can join us!
Warm regards,
Emily, on behalf of the WMF Research team
--
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Hello!
I don't think I have the authority to grant this approval as stated. We
can't make this data public, but if you find a research sponsor at the
Wikimedia Foundation, we can grant you access to private data internally if
you sign an NDA.
Please see
https://www.mediawiki.org/wiki/Wikimedia_Research/Formal_collaborations and
https://meta.wikimedia.org/wiki/Research:FAQ#collaborations for more
information on how to contact the WMF Research team and propose a formal
collaboration.
I am also CCing the wiki-research-l mailing list.
Good luck!
-Andrew Otto
On Mon, Sep 26, 2022 at 3:27 AM Thiago Freitas <thiago(a)iiia.csic.es> wrote:
> Dear Andrew Otto,
> How are you?
>
> I'm Thiago Freitas, a PhD student from the Artificial Intelligence
> Research Institute (IIIA-CSIC) in Spain. We are working on detecting
> hate speech in online communities and we are interested in the Wikipedia
> use case, specifically the Revision Deletion data, which is not publicly
> available.
>
> I am writing this email to "coordinate obtaining a comment of approval
> on this task from the approving party" as described in the required task
> in Phabricator. We have already published work on detecting norm
> violations on Wikipedia
> (https://www.ifaamas.org/Proceedings/aamas2022/pdfs/p427.pdf and
> bit.ly/3t08QCg) with data available online, now we are looking to
> further improve the quality of our research on hate speech detection
> with this additional dataset. We will use this data to build different
> machine learning models with experiments in several approaches. I would
> be glad to provide further information about our work. Thank you so much
> for your attention.
>
> Best regards,
> Thiago Freitas
>
Dear Coordinator,
Thank you for adding me to the Wikimedia research mailing list. I truly appreciate this. I would love to know more about how the platform works and what is expected of its members. If there is a link, could you please refer me to it? Thank you.
Kind regards,
Ngozi Perpetua Osuchukwu