Let me first try to explain to the list what Google is up to: They're writing statistical translation software and they need material to train it. "Statistical" means they look for a word or a phrase in a piece of English text and try to find a match in a translated version of the text. Then they see what translation occurred most often and try to figure out what neighbouring words trigger what translation.
For example, their first attempt may reveal that "tap" can be translated into two or more Afrikaans words ("kraan" and "tik"). Then statistical analysis will reveal that the presence of the word "water" near "tap" significantly increases the probability that "kraan" is the correct translation.
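To make the "tap" example concrete, here is a minimal sketch of the counting idea. It is not Google's actual method, just an illustration: the example sentences are invented, and the function names (count_contexts, translation_probability) are made up for this sketch.

```python
from collections import Counter, defaultdict

# Toy aligned examples: (English sentence, Afrikaans word chosen for "tap").
# These pairs are invented for illustration only.
EXAMPLES = [
    ("turn off the water tap", "kraan"),
    ("the tap is dripping water", "kraan"),
    ("water runs from the tap", "kraan"),
    ("tap the screen twice", "tik"),
    ("tap the key to continue", "tik"),
]


def count_contexts(examples, target="tap"):
    """For every neighbouring word, count which translation of `target` it appeared with."""
    counts = defaultdict(Counter)
    for sentence, translation in examples:
        for word in sentence.split():
            if word != target:
                counts[word][translation] += 1
    return counts


def translation_probability(counts, context_word, translation):
    """Estimate P(translation | context_word appears near the target word)."""
    seen = counts[context_word]
    total = sum(seen.values())
    return seen[translation] / total if total else 0.0


counts = count_contexts(EXAMPLES)
print(translation_probability(counts, "water", "kraan"))
print(translation_probability(counts, "screen", "tik"))
```

In this toy data every sentence containing "water" was translated with "kraan", so the estimate for "kraan" given "water" comes out as 1.0. A real system would need far more text and some smoothing for words it has rarely seen, which is why the size of the corpus matters.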
We must assume that Google is doing it for the money, and they are spending a lot of money on R&D. But they are not the only company working on it (Babelfish is another example). So if they reach a certain level in a certain year, their nearest rival may match that level a few years later. And a decade later, someone may even publish an open source app that achieves that level. (That is a very short time in the context of a language that will exist for a millennium.) So if we create a body (or "corpus") of translated text, it will be used over and over again.
All the competition and the rapidly falling price of computing power will mean that the service will never cost more than a few cents per word. Imagine sending an SMS to your domestic, having it translated into her language with an ad at the end. If translating stuff on Wikipedia helps a little bit to speed it up, then I think it's a good thing.
-- On the link that Achal sent, there is some discussion around the fact that paying for contributions will reduce the quality. It may for instance create an incentive for someone to copy from a copyrighted source, or to start making things up. Fortunately those problems are not really present when paying for translations, as long as there is some degree of quality control, e.g. by taking a sample of the result, translating it back into English and comparing it with the source.
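The sampling check could look roughly like the sketch below. Everything here is an assumption for illustration: spot_check and similarity are made-up names, back_translate stands for whatever does the reverse translation (a second human translator or some service), and the 0.7 threshold is arbitrary.

```python
import random
from difflib import SequenceMatcher


def similarity(a, b):
    """Word-level similarity between two sentences, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def spot_check(pairs, back_translate, sample_size=2, threshold=0.7, seed=0):
    """Back-translate a random sample and flag results that drift from the source.

    pairs          -- list of (english_source, translated_text) tuples
    back_translate -- hypothetical callable: translated_text -> English text
    """
    rng = random.Random(seed)
    sample = rng.sample(pairs, min(sample_size, len(pairs)))
    flagged = []
    for source, translated in sample:
        score = similarity(source, back_translate(translated))
        if score < threshold:
            flagged.append((source, translated, score))
    return flagged
```

A low score does not prove the translation is bad (a good translation can come back worded differently), so flagged items would still need a human to look at them.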
On Sat, Sep 11, 2010 at 3:16 PM, Achal Prabhala aprabhala@gmail.com wrote:
Dear Dwayne,
I've followed the work of translate.org.za and congrats on everything accomplished so far. Since you raised the issue of translation, I wanted to point you to a robust discussion that happened recently in India around Google's translation project. You can see archives of the conversation at: http://lists.wikimedia.org/pipermail/wikimediaindia-l/2010-April/thread.html... (Subj: Philosophical view on Google translated articles).
Reactions to Google's translation project have been mixed. The upsides are clear, and the downsides (as expressed in that conversation) were:
- a dissonance between volunteer editors' contributions and the translations
- a lack of necessity or specificity to some of the translated articles (marginal western figures who are unknown in, say, Tamil Nadu, etc.)
- some suspicion as to the motives behind the project (given Google's involvement)
- some broader questions, in terms of volunteer vs 'paid' editing and what the spirit of editing Wikipedia is
In general, I think that translation, if cleverly applied in a customised way, could be useful, and when applied badly could be terrible - but, regardless, it's for the community to decide. Obviously, given the mission of translate.org.za, you would come with a degree of trust and acceptability that a corporation like Google doesn't always necessarily bring (which is not to imply that their project is necessarily not helpful - at the moment, I believe various groups of Indian Wikipedians are going ahead with talks and discussions on the trial). And when you talk about translation, have you had experience with written material that goes beyond interfaces and templates? (If you have, that experience might be useful to share.) Also wondering if your goal is to build a tool (like Google) that is constantly improved through human interaction and input, or to run the translation exercise as a collaborative, human-input-based exercise?
Perhaps a good way to think about this is to ask if a particular language community within the various South African Wikipedias is interested in taking you up on this. And then run some kind of identification exercise - perhaps an ongoing project - where community members deposit articles they'd like to have translated from X language to Y language in a box. Articles for translation don't necessarily have to come from a strong Wikipedia (like English) or even an emerging Wikipedia like Afrikaans - they could well be within several smaller Wikipedias, e.g. Sotho to Zulu, etc. Finally, in terms of how things are translated, their quality, and style, I think this is where it is key to get community members involved to minimise conflicts and maximise usefulness of the end result.
Cheers, Achal
WikimediaZA mailing list WikimediaZA@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikimediaza