[Wikimedia-l] The case for supporting open source machine translation

George Herbert george.herbert at gmail.com
Thu Apr 25 02:49:16 UTC 2013


Leslie Carr wrote (personally, not officially):

> I think that while supporting open source machine translation is an
> awesome goal, it is out of scope of our budget and the engineering
> budget could be better spent elsewhere, such as with completing
> existing tools that are in development, but not
> deployed/optimized/etc.  I think that putting a bunch of money into
> possibilities isn't the right thing to do when we have a lot of
> projects that need to be finished and deployed yesterday.  Maybe once
> there's a closer actual project we could support them with text
> streams, decommissioned machines, and maybe money, but only after it's
> a pretty sure "investment"


I don't think that it's a good idea to shift resources to it immediately,
but I think that every now and then it's very healthy to step back and ask
"What is standing between our users and the information they seek?  What is
standing between our editors and the information they want to update?"
Generically, this is the customers-and-customer-goals problem, applied to
WMF's two customer sets (readers and editors).

Minor UI changes help readers.  Most of the other changes are
editor-focused: retention, ease of editing, or various other things
related to that.  A few are related to strategic data organization, which
has more of a multiplier effect.

The readers and potential readers ARE, however, clearly disadvantaged by
translation issues.

I see this discussion and consideration as strategic: not on planning
(year, six-month) or tactical (month, week) timescales, but on a
multi-year "What are our main goals for information access?" timescale.

We can't usefully help with internet access (and that's proceeding at a good
pace even in the third world), but language will remain a barrier once
people get access.  In a few situations politics / firewalling is a barrier
as well (China, primarily), which is another strategic challenge.  That,
however, is political and geopolitical, and not an easy nut for WMF to
crack.  Of the three issues (net access, firewalling, and language), one is
something we can work on.  We should think about how to work on that.  MT
seems like an obvious answer, but not the only possible one.




On Wed, Apr 24, 2013 at 12:29 PM, Leslie Carr <lcarr at wikimedia.org> wrote:

> (FYI this is me speaking with my personal hat on, none of these
> opinions are official in any way or the opinions of the foundation as
> an organization)
>
> <personal_hat>
>
> >
> > While Wikimedia is still only a medium-sized organization, it is not
> > poor. With more than 1M donors supporting our mission and a cash
> > position of $40M, we do now have a greater ability to make strategic
> > investments that further our mission, as communicated to our donors.
> > That's a serious level of trust and not to be taken lightly, either by
> > irresponsibly spending, or by ignoring our ability to do good.
> >
> > Could open source MT be such a strategic investment? I don't know, but
> > I'd like to at least raise the question. I think the alternative will
> > be, for the foreseeable future, to accept that this piece of
> > technology will be proprietary, and to rely on goodwill for any
> > integration that concerns Wikimedia. Not the worst outcome, but also
> > not the best one.
>
> I think that while supporting open source machine translation is an
> awesome goal, it is out of scope of our budget and the engineering
> budget could be better spent elsewhere, such as with completing
> existing tools that are in development, but not
> deployed/optimized/etc.  I think that putting a bunch of money into
> possibilities isn't the right thing to do when we have a lot of
> projects that need to be finished and deployed yesterday.  Maybe once
> there's a closer actual project we could support them with text
> streams, decommissioned machines, and maybe money, but only after it's
> a pretty sure "investment"
>
> </personal_hat>
>
> Leslie
>
> >
> > Are there open source MT efforts that are close enough to merit
> > scrutiny? In order to be able to provide high-quality results, you
> > would need not only a motivated, well-intentioned group of people, but
> > some of the smartest people in the field working on it.  I doubt we
> > could more than kickstart an effort, but perhaps financial backing at
> > significant scale could at least help a non-profit, open source effort
> > to develop enough critical mass to go somewhere.
> >
> > All best,
> > Erik
> >
> > [1] http://stats.wikimedia.org/wikimedia/animations/growth/AnimationProjectsGrowthWp.html
> > [2] https://developers.google.com/translate/v2/pricing
> > --
> > Erik Möller
> > VP of Engineering and Product Development, Wikimedia Foundation
> >
> > Wikipedia and our other projects reach more than 500 million people every
> > month. The world population is estimated to be >7 billion. Still a long
> > way to go. Support us. Join us. Share: https://wikimediafoundation.org/
> >
> > _______________________________________________
> > Wikimedia-l mailing list
> > Wikimedia-l at lists.wikimedia.org
> > Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
>
>
>
> --
> Leslie Carr
> Wikimedia Foundation
> AS 14907, 43821
> http://as14907.peeringdb.com/



-- 
-george william herbert
george.herbert at gmail.com

