[Wikimedia-l] The case for supporting open source machine translation

Ting Chen wing.philopp at gmx.de
Wed Apr 24 06:39:55 UTC 2013

Oh yes, this would really be great. Just think about the money the
Foundation now spends on translation, plus the work that many, many
volunteers invest in it. Free and open translation software is
indeed long overdue. Great idea, Erik.


On 4/24/2013 8:29 AM, Erik Moeller wrote:
> Wikimedia's mission is to make the sum of all knowledge available to
> every person on the planet. We do this by enabling communities in all
> languages to organize and collect knowledge in our projects, removing
> any barriers that we're able to remove.
> In spite of this, there are and will always be large disparities in
> the amount of locally created and curated knowledge available per
> language, as is evident from simple statistical comparison (and most
> beautifully visualized in Erik Zachte's bubble chart [1]).
> Google, Microsoft and others have made great strides in developing
> free-as-in-beer translation tools that can be used to translate from
> and to many different languages. Increasingly, it is possible to at
> least make basic sense of content in many different languages using
> these tools. Machine translation can also serve as a starting point
> for human translations.
> Although free-as-in-beer for basic usage, integration can be
> expensive. Google Translate charges $20 per 1M characters of text for
> API usage. [2] These tools get better from users using them, but I've
> seen little evidence of sharing of open datasets that would help the
> field get better over time.
> Undoubtedly, building the technology and the infrastructure for these
> translation services is a very expensive undertaking, and it's
> understandable that there are multiple commercial reasons that drive
> the major players' ambitions in this space. But if we look at it from
> the perspective of "How will billions of people learn in the coming
> decades", it seems clear that better translation tools should at least
> play some part in reducing knowledge disparities in different
> languages, and that ideally, such tools should be "free-as-in-speech"
> (since they're fundamentally related to speech itself).
> If we imagine a world where top notch open source MT is available,
> that would be a world where increasingly, language barriers to
> accessing human knowledge could be reduced. True, translation is no
> substitute for original content creation in a language -- but it could
> at least powerfully support and enable such content creation, and
> thereby help hundreds of millions of people. Beyond Wikimedia, high
> quality open source MT would likely be integrated in many contexts
> where it would do good for humanity and allow people to cross into
> cultural and linguistic spaces they would otherwise not have access
> to.
> While Wikimedia is still only a medium-sized organization, it is not
> poor. With more than 1M donors supporting our mission and a cash
> position of $40M, we do now have a greater ability to make strategic
> investments that further our mission, as communicated to our donors.
> That's a serious level of trust and not to be taken lightly, either by
> irresponsibly spending, or by ignoring our ability to do good.
> Could open source MT be such a strategic investment? I don't know, but
> I'd like to at least raise the question. I think the alternative will
> be, for the foreseeable future, to accept that this piece of
> technology will be proprietary, and to rely on goodwill for any
> integration that concerns Wikimedia. Not the worst outcome, but also
> not the best one.
> Are there open source MT efforts that are close enough to merit
> scrutiny? In order to be able to provide high quality results, you
> would need not only a motivated, well-intentioned group of people, but
> some of the smartest people in the field working on it. I doubt we
> could do more than kickstart an effort, but perhaps financial backing at
> significant scale could at least help a non-profit, open source effort
> to develop enough critical mass to go somewhere.
> All best,
> Erik
> [1] http://stats.wikimedia.org/wikimedia/animations/growth/AnimationProjectsGrowthWp.html
> [2] https://developers.google.com/translate/v2/pricing
> --
> Erik Möller
> VP of Engineering and Product Development, Wikimedia Foundation
> Wikipedia and our other projects reach more than 500 million people every
> month. The world population is estimated to be >7 billion. Still a long
> way to go. Support us. Join us. Share: https://wikimediafoundation.org/
