I downloaded some of the "free dictionaries" on sourceforge.net, but they come in their own special format: they are encoded and I cannot "see" the data as clear text, which is why I never really concentrated much on this format.
Well, DICT dictionaries normally consist of a gzipped text file together with an index file that allows finding entries quickly. The freedict project uses something like the TEI format, which is basically XML.
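To make the "index file plus data file" idea a bit more concrete, here is a minimal sketch in Python. It assumes the usual dictd layout, where each index line is "headword TAB offset TAB length" and the two numbers are written in dictd's base64-style notation. The function names and the tiny sample data are made up for illustration, and the data file is assumed to be already uncompressed (a real .dict.dz would have to go through gzip first).

```python
# Digits of the base64-style number notation used in dictd index files.
B64_DIGITS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64_to_int(text):
    """Decode a dictd-style base64 number (most significant digit first)."""
    value = 0
    for ch in text:
        value = value * 64 + B64_DIGITS.index(ch)
    return value


def parse_index(lines):
    """Map each headword to its (offset, length) in the data file."""
    entries = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        word, offset, length = line.split("\t")[:3]
        entries[word] = (b64_to_int(offset), b64_to_int(length))
    return entries


def lookup(word, index, data):
    """Return the article for `word`, or None if it is not in the index."""
    position = index.get(word)
    if position is None:
        return None
    offset, length = position
    return data[offset:offset + length].decode("utf-8")


# Hypothetical two-entry dictionary, only for demonstration.
SAMPLE_DATA = b"apple\n  a fruit\nbook\n  a thing to read\n"
SAMPLE_INDEX = ["apple\tA\tQ\n", "book\tQ\tX\n"]

if __name__ == "__main__":
    index = parse_index(SAMPLE_INDEX)
    print(lookup("book", index, SAMPLE_DATA))
```

The point is that the index stays small and only the requested slice of the data file needs to be read, which is why lookups are quick.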
Nokia is doing stuff in this direction ... I must see if I can find the link with the notes - these could be interesting as well. I mean: we have too many different formats ... that's not good.
Well, Nokia seems to have opted for free software for this new device that'll be sold later this year.
I believe THE format for dictionary information is TBX, since it is nothing other than XML with fixed tags, and this allows for easy data exchange.
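As a rough illustration of why fixed tags make exchange easy, here is a small Python sketch that pulls the terms out of a TBX-like document with the standard library. The sample is heavily simplified (no header, one entry) and has not been validated against the TBX DTD; the function name and the sample content are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Simplified, hypothetical TBX fragment (the header is omitted).
SAMPLE_TBX = """<martif type="TBX" xml:lang="en">
  <text><body>
    <termEntry id="e1">
      <langSet xml:lang="en"><tig><term>dictionary</term></tig></langSet>
      <langSet xml:lang="de"><tig><term>Wörterbuch</term></tig></langSet>
    </termEntry>
  </body></text>
</martif>"""


def read_terms(tbx_text):
    """Return {entry id: {language code: [terms]}} for a TBX-like string."""
    root = ET.fromstring(tbx_text)
    entries = {}
    for entry in root.iter("termEntry"):
        by_language = {}
        for lang_set in entry.iter("langSet"):
            language = lang_set.get(XML_LANG)
            by_language[language] = [t.text for t in lang_set.iter("term")]
        entries[entry.get("id")] = by_language
    return entries


if __name__ == "__main__":
    print(read_terms(SAMPLE_TBX))
```

Because the element names are fixed, any tool that knows them can read the same file without a custom parser.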
I don't think there needs to be only one format. :) On Wikimedia servers it will probably be MySQL, and then people should be able to download the UW in a bunch of formats. On my mobile phone, however, I wouldn't want to waste too much space on XML tags or uncompressed text.
I did not know you were in there. Maybe it is really time to talk about specific facts, but maybe offline first. Making data more easily available is part of what we are thinking about. The more of us there are and the more constructive the contributions, the better we will be and the less effort will be necessary.
Hmmm... :) as a sidenote: wik2dict can now create Debian packages. It would be cool if someone could offer server space and bandwidth to put these up.
Another step is then exchanging data with CAT tools like OmegaT, but that's a story of its own ... if we start to talk about this here now ... I am just imagining what would happen ;-)
I had never heard of CAT (Computer Aided Translation) or OmegaT (GPL'd Java software). But I do now. :) Well, just inform the folks from OmegaT. I guess they should be interested in getting this done.