Hey folks,
Dario and I just updated the scholarly citations dataset to include Digital Object Identifiers (DOIs). We found 742k citations (524k unique DOIs) in 172k articles. Our spot checking suggests that 98% of these DOIs resolve. The remaining 2% were extracted correctly, but they appear to be typos in the articles themselves.
Like the dataset that we released for PubMed Identifiers, this dataset records the first known occurrence of each DOI citation in an English Wikipedia article, along with the associated revision metadata. It is based on the most recent complete content dump of English Wikipedia.
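For anyone wondering what "first known occurrence" means in practice, here is a minimal sketch of the idea. It is not our actual extraction code. The DOI pattern, the function name, and the revision fields (`rev_id`, `timestamp`, `text`) are all assumptions made up for illustration, and the pattern is deliberately simplistic.

```python
import re

# Hypothetical pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
# that stops at whitespace or common wikitext delimiters. Real DOIs can be
# messier than this, so treat it as an approximation.
DOI_RE = re.compile(r'10\.\d{4,9}/[^\s|{}\[\]<>"]+')


def first_doi_occurrences(revisions):
    """Map each DOI to the earliest revision whose text contains it.

    `revisions` is an iterable of dicts with the hypothetical keys
    'rev_id', 'timestamp', and 'text', ordered oldest first.
    """
    first_seen = {}
    for rev in revisions:
        for match in DOI_RE.finditer(rev["text"]):
            # Drop trailing sentence punctuation picked up by the pattern.
            doi = match.group(0).rstrip(".,;")
            # setdefault keeps the earliest revision and ignores later ones.
            first_seen.setdefault(doi, (rev["rev_id"], rev["timestamp"]))
    return first_seen
```

Running this over one article's revision history, oldest first, would give one (revision ID, timestamp) pair per DOI, which is roughly the shape of a row in the dataset.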
-Aaron