Hey everybody,

If you are an editor of the French, Italian, or English Wikipedia
interested in contributing to building technologies for detecting missing
citations in Wikipedia articles, please read on.

As part of our current work on verifiability
<https://meta.wikimedia.org/wiki/Knowledge_Integrity>, the Wikimedia
Foundation’s Research team <http://research.wikimedia.org> is studying ways
to use machine learning to flag unsourced statements needing a citation.
If successful, this project will allow us to identify areas where finding
high-quality citations is particularly urgent or important.

To help with this project, we need to collect high-quality labeled data
about individual sentences: whether they need citations, and why. We
created a tool for this purpose, and we would like to invite you to
participate in a pilot. The annotation task should be fun, short, and
straightforward for experienced Wikipedia editors.

If you are interested in participating, please proceed as follows:

- Sign up by (optionally) adding your name on the sign-up page.
- Go to http://labels.wmflabs.org/ui/enwiki/, log in, and from 'Labeling
Unsourced Statements', request one (or more) worksets. Each workset takes
at most 5 minutes to complete and contains 5 tasks. There is no minimum
number of worksets, but of course the more labels you provide, the better.
- For each task in a workset, the tool will show you an unsourced sentence
in an article and ask you to annotate it. You can then label the sentence
as needing an inline citation or not, and specify a reason for your choice.
- If you can't respond, please select 'Skip'. If you can respond but are
not 100% sure about your choice, please select 'Unsure'.

If you have any questions or comments, please let us know by sending an
email to miriam(a)wikimedia.org or leaving a message on the talk page of
the project. We can relatively easily adapt the tool if something needs to
be changed.

Thank you for your time!

Miriam and Dario
We’re sharing a proposed program for the Wikimedia Foundation’s upcoming
fiscal year and would love to hear from you. This plan builds extensively
on projects and initiatives driven by volunteer contributors and
organizations in the Wikimedia movement, so your input is critical.
Why a “knowledge integrity” program?
Increasing global attention is being directed at the problem of
misinformation, as media consumers struggle to distinguish fact from
fiction.
Meanwhile, thanks to the sources they cite, Wikimedia projects are uniquely
positioned as a reliable gateway to accessing quality information in the
broader knowledge ecosystem. How can we mobilize these citations as a
resource and turn them into a broader, linked infrastructure of trust to
serve the entire internet? Free knowledge grounds itself in verifiability
and transparent attribution policies. Let’s look at 4 data points:

- Wikipedia sends tens of millions of people to external sources each
year. We want to conduct research to understand why and how readers leave.
- The Internet Archive has fixed over 4 million dead links on Wikipedia.
We want to enable instantaneous archiving of every link on all Wikipedias
to ensure the long-term preservation of the sources Wikipedians cite.
- #1Lib1Ref reaches 6 million people on social media. We want to bring
#1Lib1Ref to Wikidata and more languages, spreading the message that
references improve quality.
- 33% of Wikidata items represent sources (journals, books, works). We
want to strengthen community efforts to build a high-quality,
collaborative database of all cited and citable sources.
A 5-year vision
Our 5-year vision for the Knowledge Integrity program is to establish Wikimedia
as the hub of a federated, trusted knowledge ecosystem. We plan to get
there by creating:
- A roadmap to a mature, technically and socially scalable, central
repository of sources.
- A developed network of partners and technical collaborators to
contribute to and reuse data about citations.
- Increased public awareness of Wikimedia’s vital role in information
literacy and fact-checking.
5 directions for 2018-2019
We have identified 5 levers of Knowledge Integrity: research,
infrastructure and tooling, access and preservation, outreach, and
awareness. Here’s what we want to do with each:
- Continue to conduct research to understand how readers access sources
and how to help contributors improve citation quality.
- Improve tools for linking information to external sources, catalogs, and
databases.
- Ensure resources cited across Wikimedia projects remain accessible in
the long term.
- Grow outreach and partnerships to scale community and technical efforts
to improve the structure and quality of citations.
- Increase public awareness of the processes Wikimedians follow to verify
information, and articulate a collective vision for a trustable web.
Who is involved?
The core teams involved in this proposal are:
- Wikimedia Foundation Technology’s Research Team
- Wikimedia Foundation Community Engagement’s Programs team (Wikipedia
Library)
- Wikimedia Deutschland Engineering’s Wikidata team
The initiative also spans an ecosystem of possible partners, including
the Internet Archive, ContentMine, Crossref, OCLC, OpenCitations, and
Zotero. It is further made possible by funders, including the Sloan,
Gordon and Betty Moore, and Simons Foundations, which have been supporting
the WikiCite initiative to date.
How you can participate
- You can read the fine details of our proposed year-1 plan on Meta.
- We’ve created a brief introductory slide deck about our motivation and
goals.
- WikiCite has laid the groundwork for many of these efforts; read last
year’s report.
- Recent initiatives, like the just-released citation dataset, foreshadow
the work we want to do.
- This April we’re celebrating Open Citations Month; it’s right in the
spirit of Knowledge Integrity.
Cheers!

Jake Orlowitz