[Wikimedia-l] The case for supporting open source machine translation

Mark delirium at hackish.org
Wed Apr 24 10:25:58 UTC 2013


On 4/24/13 8:29 AM, Erik Moeller wrote:
> Are there open source MT efforts that are close enough to merit
> scrutiny? In order to be able to provide high quality result, you
> would need not only a motivated, well-intentioned group of people, but
> some of the smartest people in the field working on it.  I doubt we
> could more than kickstart an effort, but perhaps financial backing at
> significant scale could at least help a non-profit, open source effort
> to develop enough critical mass to go somewhere.
>

I do think this is strategically relevant to Wikimedia. But there is 
already significant financial backing attempting to kickstart 
open-source MT, with some results. The goal is strategically relevant to 
another, much larger organization: the European Union. From 2006 through 
2012 they allocated about $10m to kickstart open-source MT, though 
focused primarily on European languages, via the EuroMatrix (2006-09) 
and EuroMatrixPlus (2009-12) research projects. One of the concrete 
results [1] of those projects was Moses, which I believe is currently 
the most actively developed open-source MT system: 
http://www.statmt.org/moses/

In light of that, I would suggest trying to see if we can adapt or join 
those efforts, rather than starting a new project or organization. One 
strategy could be to: 1) fund internal Wikimedia work to see if Moses 
can already be used for our purposes; and 2) fund improvements in cases 
where it isn't good enough yet. I haven't thought much about whether 
the latter is best done through grants to academic researchers, 
payments to contractors, hiring internal staff, or posting open 
bounties for implementing features.

Best,
Mark

[1] They have a nice list of other software and data coming out of the 
project as well: http://www.euromatrixplus.net/resources/


