+1. Right now, we can incorporate other projects by simply running the same script on other XML dumps. We'll likely want to set up a job that tracks the creation of new historical dumps so that we can produce new, updated ID dumps ASAP.
If we drop the requirement of knowing when a citation was first added to an article, we could use the externallinks tables. That would allow us to generate these datasets much faster. I'd only like to pursue that option if processing the dumps on a monthly basis turns out to be difficult. Right now, it doesn't look like that will be the case.
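For reference, the core of the dump-based approach can be sketched in a few lines of Python. This is an illustrative sketch only, not the actual script: it assumes an article's revisions arrive in chronological order as (rev_id, timestamp, text) tuples, and the regex and function names are made up for the example.

```python
import re

# Illustrative pattern: match PMID citations like "pmid=12345" or "PMID 12345".
# The real script's matching rules may differ.
PMID_RE = re.compile(r"\bpmid\s*[=:]?\s*(\d{1,8})\b", re.IGNORECASE)

def first_pmid_occurrences(revisions):
    """Given (rev_id, timestamp, text) tuples in chronological order,
    return {pmid: (rev_id, timestamp)} for the first revision in which
    each PMID appears."""
    first_seen = {}
    for rev_id, timestamp, text in revisions:
        for match in PMID_RE.finditer(text or ""):
            pmid = match.group(1)
            if pmid not in first_seen:
                first_seen[pmid] = (rev_id, timestamp)
    return first_seen

# Toy revision history for one article (made-up data).
revs = [
    (100, "2010-01-01", "Some article text."),
    (101, "2010-02-01", "Added a citation {{cite journal |pmid=12345}}."),
    (102, "2010-03-01", "Still cites pmid=12345 and adds PMID 67890."),
]
print(first_pmid_occurrences(revs))
# → {'12345': (101, '2010-02-01'), '67890': (102, '2010-03-01')}
```

The externallinks-table shortcut would skip this per-revision scan entirely, which is why it's faster but loses the first-added timestamp.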
The realtime reporting project sounds interesting. Is there a project page or some code we could review?
-Aaron
On Tue, Feb 3, 2015 at 9:28 AM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
Hi Nemo
The dataset currently includes the first known occurrence of a PMID or PMCID citation in an English Wikipedia article and the associated revision metadata, based on the most recent complete content dump of English Wikipedia.
Do you accept patches for inclusion of other wikis? The easiest way to include all Wikimedia projects is probably to use the external links table; we can see how big a difference there is.
We definitely welcome patches and pull requests [1]. This is our current priority list (subject to other priorities unrelated to this project):
- add other identifiers (DOIs are next)
- include other languages / projects
- generate recurring reports (e.g. once a month)
Aaron, does that sound about right? Also note that other people on this list (Max, Daniel) are working on real-time reporting of DOI citations in collaboration with CrossRef.
D
[1] https://github.com/halfak/Extract-scholarly-article-citations-from-Wikipedia