[Foundation-l] Is Google translation is good for Wikipedias?

Jimmy O'Regan joregan at gmail.com
Sun Aug 1 02:30:49 UTC 2010


On Sun, 25 Jul 2010 18:10:54 +0300, Amir E. Aharoni wrote:

> 2010/7/25 Shiju Alex <shijualexonline at gmail.com>:
>> Hello All,
>>
>> Recently there has been a lot of discussion (on this list too) regarding
>> the translation project by Google for some of the big language Wikipedias.
>> The Foundation also seems to have approved Google's efforts. But I am
>> not sure whether anyone is interested in consulting the respective
>> language communities to learn their views.
> 
> At the same session at Wikimania a very sensible approach was presented
> by Mikel Iturbe from the Basque Wikipedia:
> 
> * They didn't use Google Translate, but an academically-developed tool,
> which also happened to be Free Software - which diminished the arguments
> about commercialization.
> 

Probably Matxin (http://sourceforge.net/projects/matxin/)

Matxin is somewhat related to Apertium, which I am involved with. Some 
Apertium developers tried to make it less Basque-specific, but weren't 
entirely successful.

> * The editors' community was involved throughout the whole process.
> 
> * Articles were not uploaded without correcting mistakes that the
> translation software made.
> 
> * What's also important, the corrections were reported to the
> translation software developers, so they would try to improve it.
> 
> Of course, not every language community can afford developing
> Free-as-in-speech academic translation software, but the other points
> are useful to everybody.

Depending on the languages involved and the resources available for
them, and given realistic expectations, a usable system can be built in
as little as 3-6 months by a single motivated volunteer with help from
experienced developers. Earlier this year, at the request of Crisis
Commons, three of us built a Haitian Creole to English prototype in less
than a week.

Staying motivated is *hard*. We have 2-3 times as many half-working
prototypes as we have released language pairs. Keeping expectations
realistic is hard too. People want English, and/or they want to include
*everything* (budget at least a year of full-time work for anything to
English).

If you know the difference between noun, adjective, and verb, understand 
Zipf's law, and want open source MT for a pair of languages, come find us 
on #apertium on FreeNode. We'll be happy to help.
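
For anyone who has not met Zipf's law: a word's frequency is roughly
inversely proportional to its frequency rank, so a small number of very
common words accounts for a large share of running text. Below is a rough
sketch in Python of what that implies for dictionary coverage. It assumes
an idealised Zipf distribution with exponent 1 and a made-up vocabulary
size; real corpora only approximate this, and the function names are
hypothetical, not part of Apertium or Matxin.

```python
def harmonic(n: int, s: float = 1.0) -> float:
    """Generalised harmonic number: sum of 1 / rank**s for rank = 1..n."""
    return sum(1.0 / (rank ** s) for rank in range(1, n + 1))


def zipf_coverage(top_k: int, vocab_size: int, s: float = 1.0) -> float:
    """Share of running text covered by the top_k most frequent words,
    ASSUMING frequencies follow an ideal Zipf law with exponent s."""
    if not 0 <= top_k <= vocab_size:
        raise ValueError("top_k must be between 0 and vocab_size")
    return harmonic(top_k, s) / harmonic(vocab_size, s)


if __name__ == "__main__":
    vocab = 100_000  # assumed vocabulary size, for illustration only
    for k in (100, 1_000, 10_000, vocab):
        print(f"top {k:>6} words: {zipf_coverage(k, vocab):.1%} of tokens")
```

Under these assumptions coverage grows quickly at first and then flattens,
which is one way to see why a modest dictionary can be usable while
"everything" takes far longer.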
 
> 
> Mikel Iturbe's presentation:
> * http://www.slideshare.net/janfri/wikimania2010
> 
> The academic papers related to that project:
> * http://ixa.si.ehu.es/openmt2/argitalpenak_html
> * http://ixa.si.ehu.es/Ixa/Argitalpenak/Artikuluak/index_html?Atala=Artikulua_Itzulpen_automatikoa




