Tabula (http://tabula.nerdpower.org/) turns tables within PDFs into CSVs.
More information is at
http://source.mozillaopennews.org/en-US/code/tabula/ .
I imagine some people on this list have access to PDFs of openly
licensed data they'd like to get into Wikidata (from corporate or
government sources that don't provide easy-to-work-with dumps or APIs).
I heard about Tabula last night and thought the following flow sounded
plausible:
1) get PDFs
2) run them through Tabula to get CSVs
3) use a pywikipediabot script to upload rows to Wikidata
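
For steps 2 and 3, here is a rough sketch of what the glue code might look
like. It is only an illustration, not a tested bot: it assumes the CSV from
Tabula has a header row, the column names ("name", "population") and the
sample data are made up, and upload_one is a hypothetical stand-in for
whatever your pywikipediabot script does with a single row. No real
pywikipediabot calls are shown.

```python
import csv
import io


def rows_from_csv(text, required):
    """Parse CSV text into dicts, skipping rows missing a required column."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        # Extracted tables often carry stray whitespace; strip keys and values.
        # Extra unnamed cells (stored under the key None) are dropped.
        cleaned = {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        if all(cleaned.get(column) for column in required):
            rows.append(cleaned)
    return rows


def upload_rows(rows, upload_one):
    """Call upload_one(row) for each row; return the rows that raised."""
    failed = []
    for row in rows:
        try:
            upload_one(row)  # hypothetical: your bot code goes here
        except Exception:
            failed.append(row)
    return failed


# Made-up sample standing in for a CSV exported by Tabula.
sample = "name, population\nSpringfield, 30000\n, 12\nShelbyville, 25000\n"
rows = rows_from_csv(sample, required=["name", "population"])

uploaded = []
failed = upload_rows(rows, uploaded.append)  # list.append stands in for the bot
```

Keeping the parsing separate from the upload means you can eyeball the
cleaned rows before anything touches Wikidata, and retry only the rows
that failed.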
Happy adding!
--
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation