Hi Dario & Jake,

Thanks for sharing the plan. Would it be possible to include in it a system to archive all reference URLs and external identifiers linked from Wikidata?
https://phabricator.wikimedia.org/T143488

Additionally, I think it would be interesting to have some research done on which references are DISPLAYED or CLICKED the most on several Wikipedias. We already know which sources are cited the most, but which sources do users hover over the most? Can we also identify which statements are involved? The results could be used to expand those statements, improve them, or add more context.

Finally, I believe a tool to assess the openness/accessibility of the sources of any given article could be really interesting.

Regards,
Micru


On Tue, Apr 17, 2018 at 2:32 AM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
Hey all,

(apologies for cross-posting)

We’re sharing a proposed program for the Wikimedia Foundation’s upcoming fiscal year (2018-19) and would love to hear from you. This plan builds extensively on projects and initiatives driven by volunteer contributors and organizations in the Wikimedia movement, so your input is critical.

Why a “knowledge integrity” program?

Increased global attention is directed at the problem of misinformation and how media consumers are struggling to distinguish fact from fiction. Meanwhile, thanks to the sources they cite, Wikimedia projects are uniquely positioned as a reliable gateway to accessing quality information in the broader knowledge ecosystem. How can we mobilize these citations as a resource and turn them into a broader, linked infrastructure of trust to serve the entire internet?  Free knowledge grounds itself in verifiability and transparent attribution policies. Let’s look at 4 data points as motivating stories:
  • Wikipedia sends tens of millions of people to external sources each year. We want to conduct research to understand why and how readers leave our site.
  • The Internet Archive has fixed over 4 million dead links on Wikipedia. We want to enable instantaneous archiving of every link on all Wikipedias to ensure the long-term preservation of the sources Wikipedians cite.
  • #1Lib1Ref reaches 6 million people on social media. We want to bring #1Lib1Ref to Wikidata and more languages, spreading the message that references improve quality.
  • 33% of Wikidata items represent sources (journals, books, works). We want to strengthen community efforts to build a high-quality, collaborative database of all cited and citable sources.

A 5-year vision

Our 5-year vision for the Knowledge Integrity program is to establish Wikimedia as the hub of a federated, trusted knowledge ecosystem. We plan to get there by creating:
  • A roadmap to a mature, technically and socially scalable, central repository of sources.
  • A developed network of partners and technical collaborators to contribute to and reuse data about citations.
  • Increased public awareness of Wikimedia’s vital role in information literacy and fact-checking.

5 directions for 2018-2019

We have identified 5 levers of Knowledge Integrity: research, infrastructure and tooling, access and preservation, outreach, and awareness. Here’s what we want to do with each:

  1. Continue to conduct research to understand how readers access sources and how to help contributors improve citation quality.
  2. Improve tools for linking information to external sources, catalogs, and repositories.
  3. Ensure resources cited across Wikimedia projects are accessible in perpetuity.
  4. Grow outreach and partnerships to scale community and technical efforts to improve the structure and quality of citations.
  5. Increase public awareness of the processes Wikimedians follow to verify information and articulate a collective vision for a trustable web.

Who is involved?

The core teams involved in this proposal are:
  • Wikimedia Foundation Technology’s Research Team
  • Wikimedia Foundation Community Engagement’s Programs team (Wikipedia Library)
  • Wikimedia Deutschland Engineering’s Wikidata team

The initiative also spans across an ecosystem of possible partners including the Internet Archive, ContentMine, Crossref, OCLC, OpenCitations, and Zotero. It is further made possible by funders including the Sloan, Gordon and Betty Moore, and Simons Foundations who have been supporting the WikiCite initiative to date.

How you can participate

You can read the fine details of our proposed year-1 plan, and provide your feedback, on mediawiki.org: https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/CDP3:_Knowledge_Integrity

We’ve also created a brief introductory slidedeck about our motivation and goals: https://commons.wikimedia.org/wiki/File:Knowledge_Integrity_CDP_proposal_%E2%80%93_FY2018-19.pdf

WikiCite has laid the groundwork for many of these efforts. Read last year’s report: https://commons.wikimedia.org/wiki/File:WikiCite_2017_report.pdf

Recent initiatives like the just released citation dataset foreshadow the work we want to do: https://medium.com/freely-sharing-the-sum-of-all-knowledge/what-are-the-ten-most-cited-sources-on-wikipedia-lets-ask-the-data-34071478785a

Lastly, this April we’re celebrating Open Citations Month; it’s right in the spirit of Knowledge Integrity: https://blog.wikimedia.org/2018/04/02/initiative-for-open-citations-birthday/


--

Dario Taraborelli  Director, Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter 


--
Etiamsi omnes, ego non