[Wikimedia-l] The case for supporting open source machine translation

Federico Leva (Nemo) nemowiki at gmail.com
Wed Apr 24 09:08:05 UTC 2013


Erik Moeller, 24/04/2013 10:06:
> [...] Moreover, the lens of project/domain name is a very arbitrary one to
> define vertically focused efforts.

Good and interesting reasoning here. It is indeed something to keep in
mind, though it adds problems.

> There are specialized efforts
> within Wikipedia that have more scale today than some of our sister
> projects do, such as individual WikiProjects. There are efforts like
> the partnerships with cultural institutions which have led to hundreds
> of thousands of images being made available under a free license. Yet
> I don't see you complaining about lack of support for GLAM tooling, or
> WikiProject support (both of which are needed).

You're perhaps right about MZ, but surely GLAM tooling is something
often asked for; however, it arguably falls under Commons development.
I've no idea what WikiProject support you have in mind, and surely
WikiProjects are too often dangerous factions to be disbanded rather
than encouraged, but we may agree in principle.

> Why should English
> Wikinews with 15 active editors demand more collective attention than
> any other specialized efforts?
>
> Historically, we've drawn that project/domain name dividing line
> because starting a new wiki was the best way to put a flag in the
> ground and say "We will solve problem X". And we didn't know which
> efforts would immediately succeed and which ones wouldn't. But in the
> year 2013, you could just as well argue that instead of slapping the
> Wikispecies logo on the frontpage of Wikipedia, we should make more
> prominent mention of "How to contribute video on Wikipedia" or "Work
> with your local museum" or "Become a campus ambassador" or any other
> specialized effort which has shown promise but could use that extra
> visibility.

Again, "how to contribute video" is just Commons promotion; work with
museums is usually either Commons or Wikipedia (sometimes Wikisource);
campus ambassadors are a program to improve some articles on some
Wikipedias.
What I mean is that those are means rather than goals; you're not
disagreeing with MZ that we shouldn't expand our goals further.

> The idea that just because user X proposed project Y
> sometime back in the early years of Wikimedia, effort Y must forever
> be part of a first order prioritization lens, is not rationally
> defensible.
>
> So, even when our goal isn't simply to make general site improvements
> that benefit everyone but to support specialized new forms of content
> or collaboration, I wouldn't use project/domain name division as a
> tool for assessing impact, but rather frame it in terms of "What
> problem is being solved here? Who is going to be reached? How many
> people will be impacted"? And sometimes that does translate well to
> lens of a single domain name level project, and sometimes it doesn't.
>
>> There's a general trend currently within the Wikimedia Foundation to
>> "narrow focus," which includes shelling out third-party MediaWiki release
>> support to an outside contractor or group, because there are apparently
>> not enough resources within the Wikimedia Foundation's 160-plus staff to
>> support the Wikimedia software platform for anyone other than Wikimedia.
>
> It's not a question whether we have enough resources to support it,
> but how to best put a financial boundary around third party
> engagement, while also actually enabling third parties to play an
> important role in the process as well (including potentially chipping
> in financial support).
>
>> In light of this, it seems even more unreasonable and against good sense
>> to pursue a new machine translation endeavor, virtuous as it may be.
>
> To be clear: I was not proposing that WMF should undertake such an
> effort directly. But I do think that if there are ways to support an
> effort that has a reasonable probability of success, with a reasonable
> structure of accountability around such an engagement, it's worth
> assessing. And again, that position is entirely consistent with my
> view that WMF should primarily invest in technologies with broad
> horizontal impact (which open source MT could have) rather than
> narrower, vertical impact.

In other words, we wouldn't be adding another goal alongside those of
creating an encyclopedia, a media repository, a dictionary, a dictionary
of quotations etc., but only a tool, to the extent needed by one or
more of them?
Currently the only projects using machine translation or translation
memory are our backstage wikis, the MediaWiki interface translation and
some highly controversial article creation drives on a handful of small
wikis (did they continue in the last couple of years?). Many ways exist
to expand the scope of such a tool and the corpus we could provide to
it, but the rationale of your proposal is currently a bit lacking and
needs some work, that's all.

Nemo


