Thanks for the very helpful answers.
I will look at the possibilities for uploading (and licensing) the data sets.
Meanwhile I have another question. Currently I parse only the words or expressions themselves, so gender and other language-specific information is ignored even when it appears in the translation tables. This is probably a serious problem for large Wiktionaries (e.g. I doubt that enwiktionary would accept French nouns without their gender). Adding this functionality would be very tedious, and probably impossible for languages I can't even read. Should I try it anyway, or can the data be useful without it?
2013/10/9 Federico Leva (Nemo) nemowiki@gmail.com
Judit, Ács, 08/10/2013 12:21:
Do you think there is a way to contribute this dictionary back to
Wiktionary?
Sure! You could first of all upload the dataset with a free license somewhere, for instance archive.org. Actually, it's probably better if you choose CC-0 as "license", otherwise – being EU-based – you could add database rights which would be a nightmare. (Or CC-0 for your work + CC-BY-SA for any copyrightable text from Wiktionary, if there is any.)
Then, you can build upon one of our WebAPI clients to contribute it directly to Wiktionary: https://www.mediawiki.org/wiki/API:Client_code I say "you" because you are the ones who know your own dataset best. You need local consensus of course, so you could proceed this way:
1) determine which Wiktionary editions have the biggest overlap with your entries (i.e. which would require the least page creation; adding to existing pages is less controversial than adding new ones);
2) propose to those editions, or wait for the most interested to ask you, and get local green light (ideally a not-so-huge one to start with);
3) run a bot of your own on that language and identify the kind and amount of work needed;
4) share the code and information from (3) to let others continue on other editions.
Of course someone else could do 1-3 too, but it would be a disproportionate effort for them compared to you; peer review of the code at (3) should also help make the coding of the bot a shared effort.
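To make the first step (finding the editions with the biggest overlap) a bit more concrete, here is a minimal Python sketch. It assumes you already have, for each edition, the set of page titles that exist there (for example taken from a page-title dump); how you obtain those sets is left out. The function name rank_editions_by_overlap and the sample words are hypothetical and only for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def rank_editions_by_overlap(
    dataset_entries: Iterable[str],
    existing_titles_by_edition: Dict[str, Iterable[str]],
) -> List[Tuple[str, int, int]]:
    """Rank Wiktionary editions by how many dataset entries already have a page.

    Returns (edition, pages_already_existing, pages_to_create) tuples,
    sorted so that the edition with the biggest overlap comes first.
    Ties are broken alphabetically by edition code.
    """
    entries = set(dataset_entries)
    ranking = []
    for edition, titles in existing_titles_by_edition.items():
        # Entries whose page already exists in this edition.
        existing = entries & set(titles)
        ranking.append((edition, len(existing), len(entries) - len(existing)))
    ranking.sort(key=lambda row: (-row[1], row[0]))
    return ranking


if __name__ == "__main__":
    # Hypothetical toy data, not taken from any real Wiktionary.
    entries = ["kutya", "macska", "ház", "víz"]
    titles = {
        "hu": {"kutya", "macska", "ház"},
        "en": {"kutya", "víz"},
        "fr": set(),
    }
    for edition, existing, to_create in rank_editions_by_overlap(entries, titles):
        print(edition, existing, to_create)
```

The edition at the top of the ranking needs the fewest new pages, so it would be the natural one to approach first; whether it is also "not-so-huge" still has to be judged by hand.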
Nemo