Hi lists,
We are proud to announce that the data we extract from Wiktionary is now publicly hosted at wiktionary.dbpedia.org.
We offer:
- Linked Data: http://wiktionary.dbpedia.org/resource/word
- a SPARQL endpoint: http://wiktionary.dbpedia.org/sparql
- N-Triples dumps: http://downloads.dbpedia.org/wiktionary/
There is also a wiki explaining some details: http://wiki.dbpedia.org/Wiktionary/
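As a rough illustration of how the SPARQL endpoint could be queried, here is a minimal Python sketch that only builds the HTTP request for all triples about one word. It assumes that "word" in the Linked Data URL is a placeholder for the actual entry (here "house"), that the endpoint accepts the standard SPARQL protocol "query" parameter, and that it can return JSON results via the Accept header. The function name build_request is made up for this example.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

ENDPOINT = "http://wiktionary.dbpedia.org/sparql"
# Assumption: "word" in the announced URL stands for the entry name.
RESOURCE_BASE = "http://wiktionary.dbpedia.org/resource/"


def build_request(word, limit=20):
    """Build (but do not send) a SPARQL GET request for one word."""
    resource = RESOURCE_BASE + quote(word)
    # Generic query: every predicate/object pair attached to the resource.
    query = "SELECT ?p ?o WHERE { <%s> ?p ?o } LIMIT %d" % (resource, int(limit))
    url = ENDPOINT + "?" + urlencode({"query": query})
    # Assumption: the endpoint supports SPARQL JSON results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    request = build_request("house", limit=5)
    print(request.full_url)
```

Passing the returned object to urllib.request.urlopen would send the query. The sketch deliberately stops before that step, so it runs without network access.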
So far we have extracted data from the English and German Wiktionaries (28M and 3.7M triples, respectively), and we plan to cover at least the five largest Wiktionaries within the next few weeks, since our approach is designed for extensibility. The data for each word is structured hierarchically (as Wiktionary itself is) and contains information about language, part of speech, definitions, translations, synonyms, hypernyms, hyponyms, etc. There may be some quality issues, but we want to release early, so please bear with us and report any major problems.
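To get a first impression of which kinds of information a dump contains, one could count how often each predicate occurs. The following Python sketch does that with a deliberately simplified line pattern. It is not a complete N-Triples parser (it only handles lines whose subject and predicate are URIs), and the sample triples use made-up example.org predicates, since the real vocabulary is described in the wiki rather than here.

```python
import re
from collections import Counter

# Simplified pattern: <subject> <predicate> object .
# Assumption: subject and predicate are always URIs in angle brackets.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def parse_line(line):
    """Return (subject, predicate, object) or None for blank/comment/unmatched lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = TRIPLE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


def count_predicates(lines):
    """Count how often each predicate URI occurs in an iterable of lines."""
    counts = Counter()
    for line in lines:
        triple = parse_line(line)
        if triple is not None:
            counts[triple[1]] += 1
    return counts


# Hypothetical sample data; the predicates are NOT the real vocabulary.
SAMPLE = [
    '<http://wiktionary.dbpedia.org/resource/house> <http://example.org/hasTranslation> "Haus"@de .',
    '# a comment line',
    '<http://wiktionary.dbpedia.org/resource/house> <http://example.org/hasSynonym> <http://wiktionary.dbpedia.org/resource/home> .',
]

if __name__ == "__main__":
    for predicate, n in count_predicates(SAMPLE).items():
        print(n, predicate)
```

For a real dump, the list of sample lines would be replaced by an open file handle, which can be iterated line by line without loading the whole file into memory.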
Thanks go to the Wiktionary community, which does a great job creating this dataset. We hope to enable new use cases and thereby encourage contributions to the Wiktionary project.
Regards,
Jonas Brekle
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
wikitext-l@lists.wikimedia.org