Brian Suda wrote:
I have been on this list a while. When I originally joined, I was interested in the possibility of exporting the Wiktionary data in .dict format. Now that the newest version of OS X (10.4) has a built-in dictionary that uses dict:// to look up words, I was interested to see whether anyone on the technical side would like to explore either exporting the Wiktionary database in .dict format or running a dictionary daemon that would access the Wiktionary database server and return dict entries. It would be read-only, but it would be another interesting way to access Wiktionary besides the web interface.
Does anyone on the tech list know if this is even possible? I'm not asking you to do it (I can write the export); I was wondering whether there is some sort of database schema available for extracting the data into dict format, or whether the entries are too fragmented to even attempt an export.
-brian
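For anyone unfamiliar with the target format, here is a rough sketch of what a .dict export would have to produce. It assumes the usual dictd layout: a plain-text .dict file holding all entries back to back, plus an .index file whose lines are headword, tab, offset, tab, length, with the two numbers written in base-64 digits. The function names and the simple case-insensitive sort are illustrative assumptions; the real dictd tools apply their own sorting rules and optional compression.

```python
# Sketch only: builds the contents of a dictd-style .dict/.index pair in memory.
# Assumed layout: each index line is  headword <TAB> offset <TAB> length,
# where offset and length are numbers written with the base-64 digits below.

B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64_number(n: int) -> str:
    """Write a non-negative integer using base-64 digits (not byte-wise base64)."""
    if n == 0:
        return B64[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 64)
        digits.append(B64[rem])
    return "".join(reversed(digits))


def build_dict(entries):
    """Return (dict_file_bytes, index_file_text) for a headword -> definition map."""
    body = bytearray()
    index_lines = []
    # Simplified ordering; dictd itself uses stricter comparison rules.
    for headword in sorted(entries, key=str.lower):
        text = (headword + "\n" + entries[headword].rstrip() + "\n").encode("utf-8")
        # Offset is where this entry starts in the .dict file, length is its byte size.
        index_lines.append(
            f"{headword}\t{b64_number(len(body))}\t{b64_number(len(text))}"
        )
        body.extend(text)
    return bytes(body), "\n".join(index_lines) + "\n"
```

Writing the two return values to files such as wiktionary.dict and wiktionary.index (hypothetical names) would give a dictionary daemon something to serve, provided the per-entry text can be extracted from the wiki in the first place.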
In fact, exporting the Wiktionaries is almost impossible right now. I once tried to write a script in Python to convert an English Wiktionary entry into some common, logical format, but there are too many different ways an entry can be built up. Maybe I'll try it again one day, but my time has become very limited lately. I'm not saying it's entirely impossible, but any solution will need some manual input. It's very hard to automate it all the way.
Another thing is that the Wiktionary content is mostly not ready yet. We've done an amazing amount of work already and some entries are already quite good, but most of the content needs a few (maybe 10 or 20) more years of work before we can say it is usable by the public. We should be thinking about making it possible to export to various formats right now, though.
If all you want to do is look up a word and simply present whatever is currently on its page to the user (or grab only the part describing the English word), that should be possible and would be very easy to do. Maybe I should look into this dict format, to know whether that would work. I'll go do that right now...
Jo
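The simple approach mentioned above, grabbing only the part of a page that describes the English word, could look roughly like this. It is a minimal sketch that assumes each language section of an entry starts with a level-2 wikitext heading such as ==English== and runs until the next level-2 heading; the function name and the sample page are made up for illustration, and real entries may not follow this layout consistently.

```python
import re
from typing import Optional

# Matches a level-2 heading like "==English==" but not "===Noun===".
LEVEL2 = re.compile(r"^==[ \t]*([^=\n].*?)[ \t]*==[ \t]*$", re.MULTILINE)


def language_section(wikitext: str, language: str = "English") -> Optional[str]:
    """Return the raw text under the given language heading, or None if absent."""
    headings = list(LEVEL2.finditer(wikitext))
    for i, match in enumerate(headings):
        if match.group(1) == language:
            # The section ends where the next level-2 heading begins.
            if i + 1 < len(headings):
                end = headings[i + 1].start()
            else:
                end = len(wikitext)
            return wikitext[match.end():end].strip()
    return None


# Hypothetical page text, not taken from a real entry.
sample = (
    "==English==\n"
    "===Noun===\n"
    "# A small domesticated feline.\n"
    "\n"
    "==Dutch==\n"
    "===Noun===\n"
    "# kat\n"
)
print(language_section(sample))
```

This keeps the section's markup untouched, which matches the "present whatever is currently on the page" idea; turning that markup into structured senses and translations is the part that is hard to automate.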