After a few tweaks, I ended up with a list of about 180k DOIs: https://zenodo.org/record/54799
Nemo
Federico Leva (Nemo), 02/06/2016 12:21:
Today I wrote a small script https://github.com/nemobis/bots/blob/master/doi-doai-openaccess.py that finds, among existing DOI links, those which are available in open access via DOAI.io.
I'm now running the script for the ~40 most visited Wikipedias, but here is the output for the Italian Wikipedia (430 DOIs): https://it.wikipedia.org/wiki/Progetto:Coordinamento/Bibliografia_e_fonti/DO...
I've asked for those links to be added to, or to replace, the existing ones. I think the same should be done on other wikis as well: https://it.wikipedia.org/w/index.php?title=Wikipedia%3ABot%2FRichieste&t...
The next step will be to search for DOIs which are mentioned in the articles but not linked, or which are not linked via DOI.org, or which are indicated via their handle instead; and, even harder, to find DOIs corresponding to citations which don't mention the DOI at all. What's the best reusable code/tool for this? I remember https://github.com/CrossRef/baleen and https://github.com/edsu/linkypedia but those are not quite the same thing.
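For the easiest part of that step (DOIs mentioned in the text but not linked via DOI.org), a rough sketch might look like the following. It is not taken from the script above: the function name `unlinked_dois` is hypothetical, and the regular expression is an assumption modelled on the pattern commonly recommended for modern Crossref DOIs, so it will miss some older or unusual DOIs. It does nothing for handles or for citations without any DOI.

```python
import re

# Assumed pattern for modern DOIs; not exhaustive (some legacy DOIs won't match).
DOI_CHARS = r'10\.\d{4,9}/[-._;()/:A-Za-z0-9]+'
DOI_RE = re.compile(DOI_CHARS)
# DOIs that are already linked through the DOI.org resolver.
LINKED_RE = re.compile(r'https?://(?:dx\.)?doi\.org/(' + DOI_CHARS + ')')


def unlinked_dois(wikitext):
    """Return DOIs mentioned in the text but not linked via doi.org (hypothetical helper)."""
    # Trailing punctuation usually belongs to the sentence, not the DOI.
    linked = {d.rstrip('.,;') for d in LINKED_RE.findall(wikitext)}
    mentioned = {d.rstrip('.,;') for d in DOI_RE.findall(wikitext)}
    return sorted(mentioned - linked)


sample = ("See [https://dx.doi.org/10.1000/abc123 paper] and "
          "doi:10.1234/xyz.456, also 10.5555/foo(bar).")
print(unlinked_dois(sample))
```

On a real wiki the input would be the wikitext of each article (from a dump or the API), and anything found this way would still need checking by hand before a bot adds links.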
Nemo