I want to extract just the bilingual dictionaries from Wiktionary. Are such files already available? If not, can anyone recommend a method other than downloading a dump, scanning all articles, and parsing the sections between {{trans-top}} and {{trans-bottom}}? Thanks.
On 18.12.18 04:25, Thomas Levine wrote:
I want to extract just the bilingual dictionaries from Wiktionary. Are such files already available? If not, can anyone recommend a method other than downloading a dump, scanning all articles, and parsing the sections between {{trans-top}} and {{trans-bottom}}? Thanks.
In another life ;) we tried to build something like that with the French Wiktionary, and it worked more or less. But the approach is not really robust (lots of edge cases). See: https://github.com/kelson42/kalima
Emmanuel
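
For reference, the dump-parsing approach mentioned in the question could look roughly like the sketch below. It is not taken from kalima; it is a minimal illustration that assumes English Wiktionary conventions, where translations sit between {{trans-top}} and {{trans-bottom}} and each entry uses a template such as {{t|fr|chien|m}} or {{t+|de|Hund|m}}. The function name extract_translations and the sample text are made up for this example, and the regular expressions ignore many of the edge cases noted above (nested templates, variant templates, other language editions).

```python
import re
from collections import defaultdict

# Assumption: a translation block starts with {{trans-top}} (optionally
# carrying a gloss argument) and ends with {{trans-bottom}}.
TRANS_BLOCK = re.compile(
    r"\{\{trans-top(?:\|[^}]*)?\}\}(.*?)\{\{trans-bottom\}\}", re.DOTALL
)
# Assumption: entries use {{t|..}}, {{t+|..}}, {{tt|..}} or {{tt+|..}},
# with the language code first and the translated term second.
TRANS_ITEM = re.compile(r"\{\{t{1,2}\+?\|([^|}]+)\|([^|}]+)")


def extract_translations(wikitext):
    """Return {language code: [translations]} for one article's wikitext."""
    result = defaultdict(list)
    for block in TRANS_BLOCK.finditer(wikitext):
        for lang, word in TRANS_ITEM.findall(block.group(1)):
            result[lang.strip()].append(word.strip())
    return dict(result)


# Hypothetical sample of what an article's translation section might contain.
sample = """
====Translations====
{{trans-top|domestic animal}}
* French: {{t+|fr|chien|m}}
* German: {{t+|de|Hund|m}}
{{trans-mid}}
* Spanish: {{t|es|perro|m}}
{{trans-bottom}}
"""

print(extract_translations(sample))
```

To cover a whole dump, the same function would be applied to the wikitext of each page while streaming through the file, keeping the page title as the source word.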