Hey all,
it's been a while since we posted a big update on everything that's been happening around WikiCite https://meta.wikimedia.org/wiki/WikiCite on this list. A few people suggested a regular (monthly or quarterly) newsletter would be a good way to keep everyone posted on the latest developments, so I am starting one.
Below are the main highlights for September. I'm cross-posting this to the Wikidata mailing list and hosting the full text of the WikiCite newsletter https://meta.wikimedia.org/wiki/WikiCite/Newsletter on Meta.
Please add anything I may have missed on the wiki. I'm copying the full content below; future updates will only include a link, to save bandwidth and spare you ugly HTML email.
Best, Dario
September 2016
Milestones
All English Wikipedia references citing PMCIDs
Thanks to Daniel Mietchen https://meta.wikimedia.org/wiki/User:Daniel_Mietchen, Wikidata items have been created for all references used in English Wikipedia that carry a *PubMed Central identifier* (P932 https://www.wikidata.org/wiki/Property:P932). The import is based on a dataset https://meta.wikimedia.org/wiki/Research:Scholarly_article_citations_in_Wikipedia produced by Aaron Halfaker https://meta.wikimedia.org/wiki/User:EpochFail using the mwcites https://github.com/mediawiki-utilities/python-mwcites library. As of today, there are over 110,000 items https://tools.wmflabs.org/sqid/#/view?id=P932&lang=en using this property.
Half a million citations using P2860
James Hare https://meta.wikimedia.org/wiki/User:Harej has been working on importing open access review papers published in the last five years, as well as their citation graph. These review papers are not only critical to Wikimedia projects as sources of citations for Wikipedia articles and statements; their contents are also openly licensed, which will allow semi-automated statement extraction via text and data mining strategies. As part of this project, the property *cites* (P2860 https://www.wikidata.org/wiki/Property:P2860), created during WikiCite 2016, has been used in over half a million statements representing citations from one paper to another. While this is a tiny fraction of the entire citation graph, it's a great way of making data available to Wikimedia volunteers for crosslinking statements, sources and the works they cite.
New properties
The *Crossref funder ID* property (P3153 https://www.wikidata.org/wiki/Property:P3153) can now be used to represent data on the funders of particular works (when available), which will allow novel analyses of the sources for Wikidata statements as a function of particular funders.
The *uses property* property (P3176 https://www.wikidata.org/wiki/Property:P3176), which Finn Årup Nielsen conveniently dubbed https://twitter.com/fnielsen/status/780704357335625728 the "selfie property", can now be used to identify works that mention specific Wikidata properties.
Events
WikiCite 2017
Our original plans to host the 2017 edition https://meta.wikimedia.org/wiki/WikiCite_2017 of WikiCite in San Francisco in the week of January 16, 2017 (after the Wikimedia Developer Summit) fell through because a major, Salesforce-style conference is happening that week, which will bring tens of thousands of delegates to the city. The WMF Travel team blacklisted that week for hosting events or meetings in SF, since hotel rates will go through the roof. We're now looking at alternative locations and dates in FY-Q4 (April-June 2017) in Europe, most likely Berlin (like this year https://meta.wikimedia.org/wiki/WikiCite_2016), or piggybacking on the 2017 Wikimedia Hackathon https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2017 in Vienna (May 19-21, 2017), which would give us access to a large number of volunteers as well as WMF and WMDE developers.
Documentation
WikiCite 2016 Report
A draft report https://meta.wikimedia.org/wiki/WikiCite_2016/Report from WikiCite 2016 https://meta.wikimedia.org/wiki/WikiCite_2016 is available on Meta. It will be finalized in the coming days with the additional information required by the funders of the event.
Book metadata modeling proposal
Chiara Storti https://meta.wikimedia.org/wiki/User:Nonoranonqui and Andrea Zanni https://meta.wikimedia.org/wiki/User:Aubrey posted a proposal https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Books#Wikiproject_Books_2.0 with examples to address in a pragmatic way the complex issues surrounding metadata modeling for books. If you're interested in the topic, please chime in.
Outreach
Video: *Verifiable, linked open knowledge that anyone can edit* (VIVO '16) https://meta.wikimedia.org/wiki/File:2016_VIVO_Keynote_-_Dario_Taraborelli.webm
Video: September 23, 2016 presentation at the *NIH Frontiers in Data Science* lecture series https://meta.wikimedia.org/wiki/File:Wikidata_-_Verifiable,_Linked_Open_Knowledge_that_Anyone_Can_Edit_-_frontiers092316_1840.webm
WikiCite, Wikidata and Open Access publishing
On September 21, Dario Taraborelli https://meta.wikimedia.org/wiki/User:DarTar gave an invited presentation (slides https://doi.org/10.6084/m9.figshare.3956238.v1) on WikiCite in the *Technology and Innovation* panel at the *8th Annual Conference of the Open Access Publisher Society* (COASP 2016 http://oaspa.org/conference/coasp-2016-program/) in Arlington, VA. The presentation triggered a discussion on the availability of open citation data. In collaboration with Jennifer Lin (*Crossref*) we discovered that out of 999 publishers already depositing citation data to Crossref, only 28 (3%) make this data open https://twitter.com/Wikicite/status/778981659836284929. We urged publishers https://twitter.com/ReaderMeter/status/778982308237938688, particularly Open Access publishers and OASPA members, to release this data, which is critical to initiatives such as WikiCite.
Linking sources and expert curation in Wikidata: NIH lecture
On September 23, Dario Taraborelli https://meta.wikimedia.org/wiki/User:DarTar also gave a longer presentation at the National Institutes of Health https://en.wikipedia.org/wiki/NIH (NIH) in Bethesda, MD, as part of the *NIH Frontiers in Data Science* lecture series https://datascience.nih.gov/community/datascience-at-nih/frontiers (video https://commons.wikimedia.org/wiki/File:Wikidata_-_Verifiable,_Linked_Open_Knowledge_that_Anyone_Can_Edit_-_frontiers092316_1840.webm, slides https://dx.doi.org/10.6084/m9.figshare.3850821.v1). It mostly focused on the integration of expert-curated statements (such as those created by members of the Gene Wiki project https://en.wikipedia.org/wiki/Gene_Wiki) and source metadata in Wikidata. This is a slightly modified version of the VIVO '16 closing keynote https://meta.wikimedia.org/wiki/File:2016_VIVO_Keynote_-_Dario_Taraborelli.webm, targeted at the biomedical science community.
So what can we use WikiCite for?
Finn Årup Nielsen https://meta.wikimedia.org/wiki/User:Fnielsen wrote a blog post https://finnaarupnielsen.wordpress.com/2016/09/21/so-what-can-we-use-wikicite-for/ showcasing different ways in which a repository of source metadata could be used.
WikiCite at WMF Monthly Metrics
On September 29, a short retrospective on WikiCite was presented during the September 2016 Wikimedia Monthly Activity and Metrics Meeting https://meta.wikimedia.org/wiki/Wikimedia_Foundation_metrics_and_activities_meetings/2016-09 (livestream https://www.youtube.com/watch?v=_grZDl3TJFc&t=1870).
Grant proposals
Three proposals closely related to WikiCite are applying for funding through Wikimedia Grants https://meta.wikimedia.org/wiki/Grants:Project/Browse_applications:
WikiFactMine
WikiFactMine https://meta.wikimedia.org/wiki/Grants:Project/WikiFactMine is a proposal by the ContentMine team to harvest the scientific literature for facts and recommend them for inclusion in Wikidata.
Librarybase
Librarybase https://meta.wikimedia.org/wiki/Grants:Project/Harej/Librarybase:_an_online_reference_library is a proposal to build an "online reference library" for Wikimedia contributors, leveraging Wikidata.
StrepHit
The StrepHit https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Renewal team submitted a grant renewal application to support semi-automated reference recommendation for Wikidata statements.
Data releases
First release of the Open Citation Corpus
The OpenCitations project http://opencitations.net/ announced the first release https://twitter.com/opencitations/status/780398561607442434 of the Open Citation Corpus http://opencitations.net/corpus, an "open repository of scholarly citation data made available under a Creative Commons public domain dedication (CC0), which provides accurate bibliographic references harvested from the scholarly literature that others may freely build upon, enhance and reuse for any purpose, without restriction under copyright or database law." The OpenCitations project uses provenance and SPARQL for tracking changes in the data http://rawgit.com/essepuntato/opencitations/master/paper/occ-driftalod2016.html.
Data on DOI citations in Wikipedia from Crossref
Crossref https://en.wikipedia.org/wiki/Crossref recently announced https://twitter.com/Wikicite/status/780763142737453056 the beta version of the Crossref Event Data user guide http://eventdata.crossref.org/guide/, which provides information on mentions of Digital Object Identifiers https://en.wikipedia.org/wiki/Digital_Object_Identifiers (DOIs) across non-scholarly sources. The guide includes a detailed overview http://eventdata.crossref.org/guide/#wikipedia of how the system collects and stores DOI citations from Wikimedia projects, and how this data can be programmatically retrieved via the Crossref APIs.
Code releases
Converting Wikidata entries to BibTeX
ContentMine fellow Lars Willighagen announced a tool https://larsgw.blogspot.nl/2016/09/citationjs-on-command-line.html combining Citation.js with Node.js, which can, among other things, convert a list of bibliographic entries stored as Wikidata items into a BibTeX file.
Research
Finding news citations for Wikipedia
Besnik Fetahu http://www.l3s.de/~fetahu/ (Leibniz University of Hannover) presented his research on news citation recommendations for Wikipedia at the Wikimedia Research showcase https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#September_2016 (slides http://www.slideshare.net/BesnikFetahu/finding-news-citations-for-wikipedia, video https://www.youtube.com/watch?v=fTDkVeqjw80). In his own words, "in this work we address the problem of finding and updating news citations for statements in entity pages. We propose a two-stage supervised approach for this problem. In the first step, we construct a classifier to find out whether statements need a news citation or other kinds of citations (web, book, journal, etc.). In the second step, we develop a news citation algorithm for Wikipedia statements, which recommends appropriate citations from a given news collection."
DBpedia Citation Challenge
Krzysztof Węcel http://kie.ue.poznan.pl/en/member/krzysztof-wecel (Poznań University of Economics and Business https://en.wikipedia.org/wiki/Pozna%C5%84_University_of_Economics_and_Business) presented his research (slides http://www.slideshare.net/KrzysztofWecel/dbpedia-citation-challenge-not-only-polish-citations-in-wikipedia-analysis-comparison-directions) in response to the DBpedia Citations and References Challenge http://wiki.dbpedia.org/blog/dbpedia-citations-references-challenge, analyzing content in Belarusian, English, French, German, Polish, Russian and Ukrainian, and showing how citation analysis can improve the modeling of the quality of Wikipedia articles.
Thanks a lot for your effort, Dario! I will add my contribution about the primary sources tool.
Cheers,
Marco
On 9/30/16 00:49, Dario Taraborelli wrote: