Hi Nemo
The dataset currently includes the first known occurrence of a PMID or PMCID citation in an English Wikipedia article and the associated revision metadata, based on the most recent complete content dump of English Wikipedia.
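For readers unfamiliar with the approach, here is a minimal sketch of what "first known occurrence" could mean in practice. It is not the code from the repository linked below. The `Revision` tuple, the `first_occurrences` helper and both regular expressions are hypothetical, and the patterns only cover simple forms such as `pmid=12345` or `PMC 67890`. Revisions are assumed to arrive in chronological order.

```python
import re
from typing import Dict, Iterable, NamedTuple, Tuple


class Revision(NamedTuple):
    """Hypothetical container for the revision metadata we keep."""
    rev_id: int
    timestamp: str
    text: str


# Assumed patterns: they catch forms like "pmid=12345", "PMID: 12345",
# "pmc=PMC67890" and "PMC 67890". Real wikitext has more variants.
PMID_RE = re.compile(r"\bpmid\s*[=:]?\s*(\d+)", re.IGNORECASE)
PMCID_RE = re.compile(r"\bpmc\s*[=:]?\s*(?:PMC)?(\d+)", re.IGNORECASE)


def first_occurrences(
    revisions: Iterable[Revision],
) -> Dict[Tuple[str, str], Revision]:
    """Map each (kind, identifier) to the earliest revision that mentions it.

    Assumes `revisions` is ordered from oldest to newest.
    """
    first: Dict[Tuple[str, str], Revision] = {}
    for rev in revisions:
        for kind, pattern in (("pmid", PMID_RE), ("pmc", PMCID_RE)):
            for ident in pattern.findall(rev.text):
                # setdefault keeps the first revision seen for this identifier
                first.setdefault((kind, ident), rev)
    return first
```

Feeding this the revision history of one article yields, for every identifier, the revision in which it first appeared; later revisions that repeat the same citation are ignored.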
Do you accept patches for inclusion of other wikis? The easiest way to include all Wikimedia projects is probably to use the external links table; then we can see how big a difference there is.
We definitely welcome patches and pull requests [1]. This is our current priority list (subject to other priorities unrelated to this project):
1. add other identifiers (DOIs are next)
2. include other languages / projects
3. generate recurring reports (e.g. once a month)
Aaron, does that sound about right? Also note that other people on this list (Max, Daniel) are working on real-time reporting of DOI citations in collaboration with CrossRef.
D
[1] https://github.com/halfak/Extract-scholarly-article-citations-from-Wikipedia