[Foundation-l] Is Google translation is good for Wikipedias?

Jimmy O'Regan joregan at gmail.com
Sun Aug 1 01:19:28 UTC 2010


On Sun, 25 Jul 2010 11:04:42 -0300, Fajro wrote:

> On Sun, Jul 25, 2010 at 8:33 AM, Mark Williamson
> <node.ue at gmail.com> wrote:
> 
>> about the toolkit, but I got the impression you're referring to Google
>> Translate, which I agree is always unsuitable to produce usable
>> articles.
>>
>>
> Machine translation is always unsuitable to produce usable articles, but
> can help to start new ones in smaller wikipedias.
> 

Unedited MT is always unsuitable, rather.

> If we want to use machine translation we should try with a free project
> like Apertium:
> 

Apertium *is* used to translate Wikipedia articles. The difference is, we 
concentrate on producing rule-based translators between related 
languages, where the results can be quite impressive. I wouldn't 
recommend that anyone use our English-Catalan translator for a Wikipedia 
article - there will simply be too much work involved in making it 
readable. Our Spanish-Catalan translator, on the other hand, will do 
quite a good job of it.

In theory, statistical MT should also do better with related languages 
(though I haven't seen anyone working on it). Google isn't 'pure' SMT 
though; much of their data comes from translating via English, so 
even when there's no ambiguity between two languages, Google will 
introduce some by way of English.
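
To make the pivot problem concrete, here is a toy sketch in Python. The 
three word lists are invented for illustration (they are not data from 
Google or Apertium), and real systems work on phrases with probabilities 
rather than single words, so treat this only as a picture of the effect: 
Spanish "usted" maps cleanly to Catalan "vostè", but going through 
English "you" brings "tu" back in as a candidate.

```python
# Toy lexicons, made up for this example (assumption, not real system data).
ES_TO_CA = {"tú": {"tu"}, "usted": {"vostè"}}   # direct Spanish -> Catalan
ES_TO_EN = {"tú": {"you"}, "usted": {"you"}}    # Spanish -> English pivot
EN_TO_CA = {"you": {"tu", "vostè"}}             # English pivot -> Catalan


def direct(word):
    """Candidate Catalan words using the direct lexicon only."""
    return ES_TO_CA.get(word, set())


def via_english(word):
    """Candidate Catalan words when every translation passes through English."""
    candidates = set()
    for english in ES_TO_EN.get(word, set()):
        candidates |= EN_TO_CA.get(english, set())
    return candidates


for w in ("tú", "usted"):
    # Compare the candidate sets for the direct and the pivoted route.
    print(w, sorted(direct(w)), sorted(via_english(w)))
```

The direct route gives one candidate per word; the pivoted route gives 
two for both, because the formal/informal distinction is lost in English.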

The translation quality of an SMT system depends greatly on the type 
of text it was trained on. Articles relating to computing, law, and 
medicine will translate much better than, say, articles about history, 
because those are the types of text for which translations are most 
widely available.



