Thanks for the very helpful answers.
I will look at the possibilities for uploading (and licensing) the data
sets.
Meanwhile I have another question. Currently I don't parse any information
other than the words or expressions themselves, so gender and other
language-specific information is ignored, even though it may appear in
the translation tables. This is probably a big problem for the large
Wiktionaries (e.g. I doubt that the English Wiktionary would accept French
nouns without their gender). Adding this functionality would be very tedious,
and probably impossible for languages I can't even read. Should I try it
anyway, or can the data be useful without this information?
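
To make the question concrete, here is a minimal sketch of what keeping the
gender could look like for one edition. It assumes translation entries are
written in the English Wiktionary style, e.g. {{t+|fr|chien|m}}, with the
language code, the term, and then optional gender tags as positional
parameters. The function name parse_translations and the output layout are
made up for illustration; other editions use different templates, and nested
templates or links inside an entry are not handled.

```python
import re

# Assumption: entries look like {{t|fr|chien|m}} or {{t+|fr|chien|m}}
# (also tt / tt+). Other Wiktionary editions use different templates.
TRANSLATION_TEMPLATE = re.compile(r"\{\{tt?\+?\|([^{}]*)\}\}")


def parse_translations(wikitext):
    """Return a list of dicts with language code, term and gender tags."""
    entries = []
    for match in TRANSLATION_TEMPLATE.finditer(wikitext):
        parts = [p.strip() for p in match.group(1).split("|")]
        # Named parameters such as tr=... or sc=... are skipped here.
        positional = [p for p in parts if "=" not in p]
        if len(positional) < 2:
            continue  # no term given, nothing to record
        lang, term = positional[0], positional[1]
        # Everything after the term is kept as an uninterpreted gender tag.
        genders = [g for g in positional[2:] if g]
        entries.append({"lang": lang, "term": term, "genders": genders})
    return entries


sample = (
    "* French: {{t+|fr|chien|m}}, {{t|fr|chienne|f}}\n"
    "* Hungarian: {{t+|hu|kutya}}\n"
)
parsed = parse_translations(sample)
```

The tags are stored as they appear, without interpreting them, so this does
not require reading the language in question.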
2013/10/9 Federico Leva (Nemo) <nemowiki(a)gmail.com>
Judit, Ács, 08/10/2013 12:21:
Do you think there is a way to contribute this dictionary back to
Wiktionary?
Sure! You could first of all upload the dataset with a free license
somewhere, for instance archive.org. Actually, it's probably better if
you choose CC-0 as "license"; otherwise, being EU-based, you could add
database rights, which would be a nightmare. (Or CC-0 for your work +
CC-BY-SA for any copyrightable text from Wiktionary, if there is any.)
Then, you can build upon one of our WebAPI clients to contribute it
directly to Wiktionary:
https://www.mediawiki.org/wiki/API:Client_code
I say "you" because you are the ones knowing your own dataset better. You
need local consensus of course, so you could proceed this way:
1) determine which Wiktionary editions have the biggest overlap with your
entries (i.e. which would require less page creation; adding to existing
pages is less controversial than adding new ones);
2) propose to those editions, or wait for the most interested to ask you,
and get local green light (ideally a not-so-huge one to start with);
3) run a bot yourself on that language edition and identify the kind and
amount of work needed;
4) share the code and information from (3) to let others continue on other
editions.
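
Step 1 can be roughly sketched as follows. The sketch assumes the standard
MediaWiki query API (action=query with a titles list) in its default JSON
format, where pages that do not exist carry a "missing" key. The helper
names, the batch size constant and the sample response are illustrative only,
and the HTTP request itself is left to whichever client you pick.

```python
# Assumption: at most 50 titles per request for ordinary accounts.
BATCH_SIZE = 50


def batches(titles, size=BATCH_SIZE):
    """Split the headword list into chunks small enough for one request."""
    return [titles[i:i + size] for i in range(0, len(titles), size)]


def build_query_params(titles):
    """Parameters for one existence check against api.php."""
    return {"action": "query", "format": "json", "titles": "|".join(titles)}


def split_existing(response):
    """Separate titles that already have a page from those that do not."""
    pages = response.get("query", {}).get("pages", {})
    existing, missing = [], []
    for page in pages.values():
        # Nonexistent pages are returned with a "missing" key.
        (missing if "missing" in page else existing).append(page["title"])
    return sorted(existing), sorted(missing)


def overlap_ratio(existing, missing):
    """Share of checked headwords that already have a page."""
    total = len(existing) + len(missing)
    return len(existing) / total if total else 0.0


# Hand-written sample in the shape of an API reply (not real data).
sample_response = {
    "query": {
        "pages": {
            "-1": {"ns": 0, "title": "kutyaxyz", "missing": ""},
            "123": {"pageid": 123, "ns": 0, "title": "chien"},
            "456": {"pageid": 456, "ns": 0, "title": "kutya"},
        }
    }
}
params = build_query_params(["chien", "kutya", "kutyaxyz"])
existing, missing = split_existing(sample_response)
ratio = overlap_ratio(existing, missing)
```

Running this per edition over a sample of headwords gives a rough ranking of
editions by overlap. Note that the API may normalize titles, so returned
titles can differ slightly from the ones sent.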
Of course someone else could do 1-3 too, but it would be a
disproportionate effort for them compared to you; peer review of the code
at (3) should also help make the coding of the bot a shared effort.
Nemo