On Wed, Apr 24, 2013 at 11:59 AM, Erik Moeller <erik(a)wikimedia.org> wrote:
Could open source MT be such a strategic investment? I don't know, but
I'd like to at least raise the question. I think the alternative will
be, for the foreseeable future, to accept that this piece of
technology will be proprietary, and to rely on goodwill for any
integration that concerns Wikimedia. Not the worst outcome, but also
not the best one.
There is a compelling need to assess the availability of training
corpora of significant breadth and depth for the languages concerned.
Most open-source MT implementations end up hitting this hurdle because
content at scale is not easily available. It would be appropriate to
decide whether WMF/Wikipedia is well placed to turn on a firehose-like
API that would enable MT implementations to apply statistical and other
methods to the existing content itself.
--
sankarshan mukhopadhyay
<https://twitter.com/#!/sankarshan>