Dario and I just released our first static dump of identifiers. For now,
it only includes PubMed identifiers, but I'm running an extraction to add
DOIs. It turns out that DOIs are non-trivial to extract with regexes[1]
alone, so I wrote an island parser to extract them from wikimarkup[2].
It seems to perform very well.
Halfaker, Aaron; Taraborelli, Dario (2015): Scholarly article citations in
Wikipedia. figshare.
http://dx.doi.org/10.6084/m9.figshare.1299540
Retrieved 22:25, Feb 05, 2015 (GMT)
1. http://stackoverflow.com/questions/27910/finding-a-doi-in-a-document-or-page
2. https://github.com/halfak/Extract-scholarly-article-citations-from-Wikipedi…
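
To illustrate why regexes alone fall short, here is a minimal sketch of the
island-parsing idea. It is not the code from the repository in [2]; the
function name extract_dois and the choice of stop characters are assumptions
made for this example. A regex finds the "10.NNNN/" prefix (the island), and
a small scanner then reads the suffix while balancing brackets, so a DOI that
contains parentheses survives while the template's closing "}}" is left out.

```python
import re

# Start of a DOI: "10.", a registrant code of 4 to 9 digits, then "/".
DOI_START = re.compile(r"\b10\.\d{4,9}/")

# Characters assumed to end a DOI in wikimarkup (an assumption of this sketch).
HARD_STOPS = set(" \t\r\n|<>\"")
CLOSERS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(CLOSERS.values())
TRAILING_PUNCT = ".,;:'"


def extract_dois(text):
    """Return DOI-like strings found in text, in order of appearance."""
    dois = []
    pos = 0
    while True:
        match = DOI_START.search(text, pos)
        if match is None:
            break
        stack = []  # brackets opened inside the DOI suffix
        end = match.end()
        while end < len(text):
            ch = text[end]
            if ch in HARD_STOPS:
                break
            if ch in OPENERS:
                stack.append(ch)
            elif ch in CLOSERS:
                # An unmatched closer belongs to the surrounding markup.
                if not stack or stack[-1] != CLOSERS[ch]:
                    break
                stack.pop()
            end += 1
        doi = text[match.start():end].rstrip(TRAILING_PUNCT)
        # Skip a bare prefix with no suffix after the slash.
        if len(doi) > len(match.group(0)):
            dois.append(doi)
        pos = max(end, match.end())
    return dois


if __name__ == "__main__":
    sample = "{{cite journal|doi=10.1016/S0140-6736(05)67728-8}}"
    print(extract_dois(sample))
```

A pattern such as 10\.\d{4,9}/\S+ applied to the same sample would also
swallow the trailing "}}", which is the kind of case the bracket stack is
meant to handle.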
-Aaron
On Thu, Feb 5, 2015 at 4:13 PM, Jake Orlowitz <jorlowitz(a)gmail.com> wrote:
Thanks, saw that! Really neat. We're working on it with Analytics :)
On 2/5/15, Pine W <wiki.pine(a)gmail.com> wrote:
FYI:
http://www.altmetric.com/blog/new-source-alert-wikipedia/
Pine
*This is an Encyclopedia* <https://www.wikipedia.org/>
*One gateway to the wide garden of knowledge, where lies
The deep rock of our past, in which we must delve
The well of our future,
The clear water we must leave untainted for those who come after us,
The fertile earth, in which truth may grow in bright places, tended by
many hands,
And the broad fall of sunshine, warming our first steps toward knowing
how much we do not know.*
*—Catherine Munro*
--
Jake Orlowitz
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l