Hey all,
it's been a while since we posted a big update on everything that's been
happening around WikiCite <https://meta.wikimedia.org/wiki/WikiCite> on
this list. A few people suggested a regular (monthly or quarterly)
newsletter would be a good way to keep everyone posted on the latest
developments, so I am starting one.
Below are the main highlights for September. I'm cross-posting this to the
Wikidata mailing list and hosting the full-text of the WikiCite newsletter
<https://meta.wikimedia.org/wiki/WikiCite/Newsletter> on Meta.
Please add anything I may have missed on the wiki. I'm copying the full
content below; future updates will only include a link, to save bandwidth
and avoid ugly HTML email.
Best,
Dario
September 2016

Milestones

All English Wikipedia references citing PMCIDs
Thanks to Daniel Mietchen
<https://meta.wikimedia.org/wiki/User:Daniel_Mietchen>, all references used
in English Wikipedia with a *PubMed Central identifier* (P932
<https://www.wikidata.org/wiki/Property:P932>) have been created in
Wikidata, based on a dataset
<https://meta.wikimedia.org/wiki/Research:Scholarly_article_citations_in_Wikipedia>
produced by Aaron Halfaker <https://meta.wikimedia.org/wiki/User:EpochFail>
using the mwcites <https://github.com/mediawiki-utilities/python-mwcites>
library. As of today, over 110,000 items
<https://tools.wmflabs.org/sqid/#/view?id=P932&lang=en> use this property.
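Counts like this can be reproduced with a SPARQL query against the Wikidata Query Service. A minimal sketch (the helper name is illustrative; actually running the query requires network access, so this only builds the query string):

```python
# Sketch: build a SPARQL query counting Wikidata items that have a
# statement for a given property (e.g. P932, the PubMed Central ID).
# Submit the result to https://query.wikidata.org/sparql to run it.

def count_items_query(prop: str) -> str:
    """Return a SPARQL query counting items with a statement for `prop`."""
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?n) WHERE { "
        f"?item wdt:{prop} ?value . }}"
    )

print(count_items_query("P932"))
```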
Half a million citations using P2860
James Hare <https://meta.wikimedia.org/wiki/User:Harej> has been working on
importing open access review papers published in the last 5 years as well
as their citation graph. These review papers are critical to Wikimedia
projects as sources of citations for Wikipedia articles and statements; in
addition, their openly licensed contents will allow semi-automated
statement extraction via text and data mining. As
part of this project, the property *cites* (P2860
<https://www.wikidata.org/wiki/Property:P2860>) created during WikiCite
2016 has been used in over half a million statements representing citations
from one paper to another. While this is a tiny fraction of the entire
citation graph, it's a great way of making data available to Wikimedia
volunteers for crosslinking statements, sources and the works they cite.
New properties
The *Crossref funder ID* property (P3153
<https://www.wikidata.org/wiki/Property:P3153>) can now be used to
represent data on the funders of particular works (when available), making
it possible to perform novel analyses of the sources used for Wikidata
statements as a function of their funders.
The *uses property* property (P3176
<https://www.wikidata.org/wiki/Property:P3176>), which Finn Årup Nielsen
conveniently dubbed
<https://twitter.com/fnielsen/status/780704357335625728> the
"selfie property", can now be used to identify works that mention
specific Wikidata properties.
Events

WikiCite 2017
Our original plans to host the 2017 edition
<https://meta.wikimedia.org/wiki/WikiCite_2017> of WikiCite in San
Francisco in the week of January 16, 2017 (after the Wikimedia Developer
Summit) fell through due to a major, Salesforce-style conference happening
that week, which will bring tens of thousands of delegates to the city. The
WMF Travel team blacklisted the week for hosting events or meetings in SF,
since hotel rates will go through the roof. We're now looking at
alternative locations and dates in FY-Q4 (April-June 2017) in Europe, most
likely Berlin (like this year
<https://meta.wikimedia.org/wiki/WikiCite_2016>), or piggybacking on the 2017
Wikimedia Hackathon
<https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2017> in Vienna (May
19-21, 2017), which will give us access to a large number of volunteers as
well as WMF and WMDE developers.
Documentation

WikiCite 2016 Report
A draft report <https://meta.wikimedia.org/wiki/WikiCite_2016/Report>
from WikiCite
2016 <https://meta.wikimedia.org/wiki/WikiCite_2016> is available on Meta.
It will be finalized in the coming days with the additional information
required by the funders of the event.
Book metadata modeling proposal
Chiara Storti <https://meta.wikimedia.org/wiki/User:Nonoranonqui> and Andrea
Zanni <https://meta.wikimedia.org/wiki/User:Aubrey> posted a proposal
<https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Books#Wikiproject_Books_2.0>
with examples that pragmatically addresses the complex issues surrounding
metadata modeling for books. If you're interested in the topic, please
chime in.
Outreach
*Verifiable, linked open knowledge that anyone can edit*, VIVO '16 closing
keynote (video
<https://meta.wikimedia.org/wiki/File:2016_VIVO_Keynote_-_Dario_Taraborelli.webm>)
*Wikidata: Verifiable, Linked Open Knowledge that Anyone Can Edit*,
September 23, 2016 presentation at the *NIH Frontiers in Data Science*
lecture series (video
<https://meta.wikimedia.org/wiki/File:Wikidata_-_Verifiable,_Linked_Open_Knowledge_that_Anyone_Can_Edit_-_frontiers092316_1840.webm>)
WikiCite, Wikidata and Open Access publishing
On September 21, Dario Taraborelli
<https://meta.wikimedia.org/wiki/User:DarTar> gave an invited presentation
(slides <https://doi.org/10.6084/m9.figshare.3956238.v1>) on WikiCite in
the *Technology and Innovation* panel at the *8th Annual Conference of the
Open Access Scholarly Publishers Association* (COASP 2016
<http://oaspa.org/conference/coasp-2016-program/>) in Arlington, VA. The
presentation triggered a discussion on the availability of open citation
data. In collaboration with Jennifer Lin (*Crossref*), we discovered that
out of 999 publishers already depositing citation data to Crossref, only 28
(3%) make this data open
<https://twitter.com/Wikicite/status/778981659836284929>. We urged
publishers <https://twitter.com/ReaderMeter/status/778982308237938688>,
particularly Open Access publishers and OASPA members, to release this
data, which is critical to initiatives such as WikiCite.
Linking sources and expert curation in Wikidata: NIH lecture
On September 23, Dario Taraborelli
<https://meta.wikimedia.org/wiki/User:DarTar> also gave a longer
presentation at the National Institutes of Health
<https://en.wikipedia.org/wiki/NIH> (NIH) in Bethesda, MD, mostly focused
on the integration of expert-curated statements (such as those created by
members of the Gene Wiki project
<https://en.wikipedia.org/wiki/Gene_Wiki>) and source metadata in Wikidata,
as part of the *NIH Frontiers in Data Science* lecture series
<https://datascience.nih.gov/community/datascience-at-nih/frontiers>. (
video
<https://commons.wikimedia.org/wiki/File:Wikidata_-_Verifiable,_Linked_Open_Knowledge_that_Anyone_Can_Edit_-_frontiers092316_1840.webm>
, slides <https://dx.doi.org/10.6084/m9.figshare.3850821.v1>). This is a
slightly modified version of the VIVO '16 closing keynote
<https://meta.wikimedia.org/wiki/File:2016_VIVO_Keynote_-_Dario_Taraborelli.webm>,
targeted at the biomedical science community.
So what can we use WikiCite for?
Finn Årup Nielsen <https://meta.wikimedia.org/wiki/User:Fnielsen> wrote a blog
post
<https://finnaarupnielsen.wordpress.com/2016/09/21/so-what-can-we-use-wikicite-for/>
showcasing
different ways in which a repository of source metadata could be used.
WikiCite at WMF Monthly Metrics
On September 29, a short retrospective on WikiCite was presented during
the September 2016 Wikimedia Monthly Activity and Metrics Meeting
<https://meta.wikimedia.org/wiki/Wikimedia_Foundation_metrics_and_activities_meetings/2016-09>
(livestream <https://www.youtube.com/watch?v=_grZDl3TJFc&t=1870>).
Grant proposals
Three proposals closely related to WikiCite are applying for funding
through Wikimedia Grants
<https://meta.wikimedia.org/wiki/Grants:Project/Browse_applications>:
WikiFactMine
WikiFactMine <https://meta.wikimedia.org/wiki/Grants:Project/WikiFactMine> is
a proposal by the ContentMine team to harvest the scientific literature for
facts and recommend them for inclusion in Wikidata.
Librarybase
Librarybase
<https://meta.wikimedia.org/wiki/Grants:Project/Harej/Librarybase:_an_online_reference_library>
is a proposal to build an "online reference library" for Wikimedia
contributors, leveraging Wikidata.
StrepHit
The StrepHit
<https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Renewal>
team
submitted a grant renewal application to support semi-automated reference
recommendation for Wikidata statements.
Data releases

First release of the Open Citation Corpus
The OpenCitations project <http://opencitations.net/> announced the first
release <https://twitter.com/opencitations/status/780398561607442434> of
the Open Citation Corpus <http://opencitations.net/corpus>, an "open
repository of scholarly citation data made available under a Creative
Commons public domain dedication (CC0), which provides accurate
bibliographic references harvested from the scholarly literature that
others may freely build upon, enhance and reuse for any purpose, without
restriction under copyright or database law." The OpenCitations project
uses provenance and SPARQL for tracking changes in the data
<http://rawgit.com/essepuntato/opencitations/master/paper/occ-driftalod2016.html>.
Data on DOI citations in Wikipedia from Crossref
Crossref <https://en.wikipedia.org/wiki/Crossref> recently announced
<https://twitter.com/Wikicite/status/780763142737453056> the beta version
of the Crossref Event Data user guide <http://eventdata.crossref.org/guide/>,
which provides information on mentions of Digital Object Identifiers
<https://en.wikipedia.org/wiki/Digital_Object_Identifiers> (DOIs) across
non-scholarly sources. The guide includes a detailed overview
<http://eventdata.crossref.org/guide/#wikipedia> of how the system collects
and stores DOI citations from Wikimedia projects, and how this data can be
programmatically retrieved via the Crossref APIs.
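As a rough illustration of such a retrieval, the request URL for the Event Data API can be built as below. The endpoint and parameter names (`obj-id`, `source`) should be checked against the current Event Data user guide, and the DOI is a placeholder:

```python
from urllib.parse import urlencode

# Sketch: construct a request URL for the Crossref Event Data API (in beta
# at the time of writing) listing events, e.g. Wikipedia mentions, for a
# given DOI. Parameter names follow the user guide; verify before use.
BASE = "https://api.eventdata.crossref.org/v1/events"

def events_url(doi: str, source: str = "wikipedia") -> str:
    """URL listing Event Data events in which `doi` is the cited object."""
    params = {"obj-id": f"https://doi.org/{doi}", "source": source}
    return f"{BASE}?{urlencode(params)}"

print(events_url("10.5555/12345678"))  # placeholder DOI
```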
Code releases

Converting Wikidata entries to BibTeX
ContentMine fellow Lars Willighagen announced a tool
<https://larsgw.blogspot.nl/2016/09/citationjs-on-command-line.html>
combining Citation.js with Node.js, which makes it possible, among other
things, to convert a list of bibliographic entries stored as Wikidata items
into a BibTeX file.
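The idea behind such a conversion can be sketched as follows (a plain dict stands in for a Wikidata item's statements, and the field values are made up; the actual tool uses Citation.js on Node.js, not Python):

```python
# Sketch: serialize bibliographic metadata as a BibTeX entry. The entry
# key and field values below are illustrative, not real Wikidata data.

def to_bibtex(key: str, fields: dict) -> str:
    """Render a dict of BibTeX fields as an @article entry."""
    body = "\n".join(f"  {k} = {{{v}}}," for k, v in fields.items())
    return f"@article{{{key},\n{body}\n}}"

entry = to_bibtex("example2016", {
    "title": "An example article",
    "journal": "Example Journal",
    "year": "2016",
})
print(entry)
```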
Research

Finding news citations for Wikipedia
Besnik Fetahu <http://www.l3s.de/~fetahu/> (Leibniz University of Hannover)
presented his research on news citation recommendations for Wikipedia at
the Wikimedia Research showcase
<https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#September_2016>
(slides
<http://www.slideshare.net/BesnikFetahu/finding-news-citations-for-wikipedia>
, video <https://www.youtube.com/watch?v=fTDkVeqjw80>). In his own words,
"in this work we address the problem of finding and updating news citations
for statements in entity pages. We propose a two-stage supervised approach
for this problem. In the first step, we construct a classifier to find out
whether statements need a news citation or other kinds of citations (web,
book, journal, etc.). In the second step, we develop a news citation
algorithm for Wikipedia statements, which recommends appropriate citations
from a given news collection."
DBpedia Citation Challenge
Krzysztof Węcel <http://kie.ue.poznan.pl/en/member/krzysztof-wecel> (Poznań
University of Economics and Business
<https://en.wikipedia.org/wiki/Pozna%C5%84_University_of_Economics_and_Business>)
presented his research (slides
<http://www.slideshare.net/KrzysztofWecel/dbpedia-citation-challenge-not-only-polish-citations-in-wikipedia-analysis-comparison-directions>)
in response to the DBpedia Citations and References Challenge
<http://wiki.dbpedia.org/blog/dbpedia-citations-references-challenge>,
analyzing content in Belarusian, English, French, German, Polish, Russian,
and Ukrainian, and showing how citation analysis can improve the modeling
of Wikipedia article quality.