Hey everybody,

If you are an editor of the French, Italian, or English Wikipedia and are interested in helping build technologies that improve the detection of missing citations in Wikipedia articles, please read on.

As part of our current work on verifiability
<https://meta.wikimedia.org/wiki/Knowledge_Integrity>, the Wikimedia
Foundation’s Research team <http://research.wikimedia.org> is studying ways
to use machine learning to flag unsourced statements needing a citation
<https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statem…>.
If successful, this project will allow us to identify areas where adding
high-quality citations is particularly urgent or important.
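As a purely illustrative aside for the technically curious: below is a minimal sketch of how labeled sentences of the kind this pilot will produce could feed a simple "citation needed" classifier. This is not the project's actual model or pipeline; the scikit-learn setup, the toy sentences and labels, and the 0.5 decision threshold are all assumptions made up for this example.

```python
# Minimal sketch (not the project's actual model): train a toy classifier
# that flags sentences likely to need an inline citation.
# Assumes scikit-learn is installed; the sentences and labels below are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled data: 1 = needs an inline citation, 0 = does not.
sentences = [
    "The 2011 census recorded a population of 48,212.",
    "Critics have described the policy as highly controversial.",
    "A week consists of seven days.",
    "The bridge crosses the river near the town centre.",
]
labels = [1, 1, 0, 0]

# TF-IDF features plus logistic regression: a deliberately simple baseline.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(sentences, labels)

# Flag new unsourced sentences whose predicted probability exceeds a threshold.
new_sentences = ["The company reported record profits in 2017."]
probabilities = model.predict_proba(new_sentences)[:, 1]
flagged = [s for s, p in zip(new_sentences, probabilities) if p > 0.5]
print(flagged)
```

A real system would use richer features and far more data; the sketch only shows the shape of the task: each labeled sentence pairs text with a judgment about whether it needs an inline citation, and why.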
To help with this project, we need to collect high-quality labeled data about
individual sentences: whether they need citations, and why. We created a tool
for this purpose and would like to invite you to participate in a pilot
<https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statem…>.
The annotation task should be fun, short, and straightforward for experienced
Wikipedia editors.

If you are interested in participating, please proceed as follows:

- Sign up by (optionally) adding your name to the sign-up page
<https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statem…>.
- Go to http://labels.wmflabs.org/ui/enwiki/, log in, and from 'Labeling
Unsourced Statements' request one (or more) worksets. Each workset contains 5
tasks and takes at most 5 minutes to complete. There is no minimum number of
worksets, but of course the more labels you provide, the better.
- For each task in a workset, the tool will show you an unsourced sentence in
an article and ask you to annotate it. You can label the sentence as needing
an inline citation or not, and specify a reason for your choice.
- If you can't respond, please select 'skip'. If you can respond but are not
100% sure about your choice, please select 'Unsure'.

If you have any questions or comments, please let us know by sending an email
to miriam(a)wikimedia.org or leaving a message on the talk page of the project
<https://meta.wikimedia.org/wiki/Research_talk:Identification_of_Unsourced_S…>.
We can adapt the tool relatively easily if something needs to be changed.

Thank you for your time!

Miriam and Dario
Hey all,
(apologies for cross-posting)
We’re sharing a proposed program
<https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/CDP…>
for the Wikimedia Foundation’s upcoming fiscal year
<https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2018-2019/…>
(2018-19) and *would love to hear from you*. This plan builds extensively
on projects and initiatives driven by volunteer contributors and
organizations in the Wikimedia movement, so your input is critical.
Why a “knowledge integrity” program?
Global attention is increasingly focused on the problem of misinformation and
on how media consumers struggle to distinguish fact from fiction. Meanwhile,
thanks to the sources they cite, Wikimedia projects are uniquely positioned as
a reliable gateway to quality information in the broader knowledge ecosystem.
How can we mobilize these citations as a resource and turn them into a
broader, linked infrastructure of trust to serve the entire internet? Free
knowledge grounds itself in verifiability and transparent attribution
policies. Let’s look at 4 data points as motivating stories:
- Wikipedia sends tens of millions of people to external sources each
year. We want to conduct research to understand why and how readers leave
our site.
   - The Internet Archive has fixed over 4 million dead links on Wikipedia.
   We want to enable instantaneous archiving of every link on all Wikipedias
   to ensure the long-term preservation of the sources Wikipedians cite (see
   the sketch after this list).
- #1Lib1Ref reaches 6 million people on social media. We want to bring
#1Lib1Ref to Wikidata and more languages, spreading the message that
references improve quality.
- 33% of Wikidata items represent sources (journals, books, works). We
want to strengthen community efforts to build a high-quality, collaborative
database of all cited and citable sources.
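To make the archiving point above a bit more tangible, here is a minimal sketch of how a tool could check a cited URL against the Internet Archive and request a capture when none exists. This is an assumption-laden illustration, not the program's planned implementation: it uses the Archive's public availability API and the anonymous "Save Page Now" endpoint, with no rate limiting or retry logic.

```python
# Minimal sketch (an assumption, not this program's actual integration):
# check whether a cited URL has a Wayback Machine snapshot, and request one if not.
import requests

AVAILABILITY_API = "https://archive.org/wayback/available"
SAVE_PAGE_NOW = "https://web.archive.org/save/"

def ensure_archived(url: str) -> str:
    """Return an archived snapshot URL for `url`, requesting a new capture if needed."""
    resp = requests.get(AVAILABILITY_API, params={"url": url}, timeout=30)
    resp.raise_for_status()
    closest = resp.json().get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]  # An existing snapshot is already available.
    # No snapshot yet: ask the Wayback Machine to capture the page now.
    save = requests.get(SAVE_PAGE_NOW + url, timeout=60)
    save.raise_for_status()
    return save.url  # Final URL after redirects points at the capture.

if __name__ == "__main__":
    print(ensure_archived("https://example.org/"))
```

Anything run at the scale of every link on all Wikipedias would of course need batching, rate limits, and coordination with the Internet Archive, which is part of why the partnership described later in this plan matters.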
A 5-year vision
Our 5-year vision for the Knowledge Integrity program is to establish
Wikimedia as the hub of a federated, trusted knowledge ecosystem. We plan
to get there by creating:
- A roadmap to a mature, technically and socially scalable, central
repository of sources.
   - A developed network of partners and technical collaborators who
   contribute to and reuse data about citations.
- Increased public awareness of Wikimedia’s vital role in information
literacy and fact-checking.
5 directions for 2018-2019
We have identified 5 levers of Knowledge Integrity: research,
infrastructure and tooling, access and preservation, outreach, and
awareness. Here’s what we want to do with each:
1. Continue to conduct research to understand how readers access sources
and how to help contributors improve citation quality.
2. Improve tools for linking information to external sources, catalogs,
and repositories.
3. Ensure resources cited across Wikimedia projects are accessible in
perpetuity.
4. Grow outreach and partnerships to scale community and technical
efforts to improve the structure and quality of citations.
5. Increase public awareness of the processes Wikimedians follow to
verify information and articulate a collective vision for a trustable web.
Who is involved?
The core teams involved in this proposal are:
- Wikimedia Foundation Technology’s Research Team
- Wikimedia Foundation Community Engagement’s Programs team (Wikipedia
Library)
- Wikimedia Deutschland Engineering’s Wikidata team
The initiative also spans an ecosystem of possible partners, including the
Internet Archive, ContentMine, Crossref, OCLC, OpenCitations, and Zotero. It
is further made possible by funders, including the Sloan, Gordon and Betty
Moore, and Simons Foundations, which have been supporting the WikiCite
initiative to date.
How you can participate
You can read the fine details of our proposed year-1 plan, and provide your
feedback, on mediawiki.org:
https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/CDP3:_Knowledge_Integrity
We’ve also created a brief introductory slidedeck about our motivation and
goals:
https://commons.wikimedia.org/wiki/File:Knowledge_Integrity_CDP_proposal_%E2%80%93_FY2018-19.pdf
WikiCite has laid the groundwork for many of these efforts. Read last
year’s report:
https://commons.wikimedia.org/wiki/File:WikiCite_2017_report.pdf
Recent initiatives like the just released citation dataset foreshadow the
work we want to do:
https://medium.com/freely-sharing-the-sum-of-all-knowledge/what-are-the-ten-most-cited-sources-on-wikipedia-lets-ask-the-data-34071478785a
Lastly, this April we’re celebrating Open Citations Month; it’s right in
the spirit of Knowledge Integrity:
https://blog.wikimedia.org/2018/04/02/initiative-for-open-citations-birthday/
--
*Dario Taraborelli*, Director, Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>