On 05/22/2014 05:41 PM, Petr Bena wrote:
I was looking for a free (possibly open source)
provider of automatic
translations for my open source application I am working on and quite
had troubles finding some. Then I realized we have a project called
"wiktionary" which could possibly (I was assuming it's open
dictionary) help me here, but I was quite disappointed as I couldn't
find any simple way to perform simple queries like:
There are several open-source machine translation projects.
They are either rule-based or statistics-based. One of the
rule-based projects is Apertium.
When you start from zero, building a rule-based system
gives you a useful system quite fast, especially if the
two languages are similar. A statistics-based system (such
as Google Translate) requires enormous amounts of
data to become useful.
It's not something that you can start as a subproject
within Wiktionary, not even as a separate WMF project.
It's a very large task.
One naive approach is to base a statistics-based
machine translator (SMT) on the European Union's
freely available parallel text corpus. When you try
to translate Finnish "terve" (which means: hello!)
into English in such a system, it will say "health",
since the same word also means health, and EU
texts only talk about healthcare, never "hello".
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik -
http://aronsson.se