Ziko, again, we are not talking about machine translation; Google doesn't have machine translation for Bangla, Malayalam, Tamil, etc. yet. This is about translation memory.
One of the things about machine-assisted translation (MAT), whose use is still debated in the professional translator community but which is most popular for time-sensitive material like news, is that the initial output is often a very rough translation that requires a _lot_ of editing. The biggest problem is not the toolkit itself (with some exceptions, such as its handling of punctuation and templates) but the translators who do not bother to use it properly, producing translations full of spelling mistakes and leaving behind a wasteland of poor-quality articles.
GTTK can be a force for good if someone puts in the appropriate time and effort; when used _properly_ by a careful, knowledgeable translator who allows ample time for proofreading, articles created with it should be virtually indistinguishable from any other article.
My view is that the huge problem here is the lack of engagement with communities. Essentially, Google swooped in and started dropping large amounts of poor-quality content on our projects without engaging the people in those communities. The participants in Google's contest didn't engage the communities either, nor did they respond to requests to improve their content.
-m.
On Wed, Jul 28, 2010 at 7:18 AM, Ziko van Dijk zvandijk@googlemail.com wrote:
2010/7/28 Nathan nawrich@gmail.com:
Just to be sure I understand...
It's good that you ask, indeed. :-)
No, it's not about free software, and the Wikimedians are not too snobby or lazy to correct poor language. That is something I frequently do on de.WP and eo.WP, and I suppose Ragib and many others do as well. The point is: the machine-translated articles are often so bad that I simply don't understand them. I *cannot* correct them, because I don't know what they are saying.
Kind regards, Ziko
What's happening here is that human beings, using a software tool, are translating articles from the English Wikipedia into a variety of other languages and posting them on the comparatively small Wikipedia projects in those languages. The articles, of unknown intrinsic quality, are usually mid- to low-quality translations.
In the projects with an active community, some have rejected these articles because they are not high quality and because the community refuses to take responsibility for fixing punctuation and other errors made by editors who are not members of the community. In the projects without an active community, Wikimedians (who may not speak any of the languages affected by the Google initiative) are objecting for a variety of other reasons: because the software used to assist translation isn't free, because the effort is managed by a commercial organization, or because the endeavor wasn't cleared with the Wikimedia community first. Some are also concerned that these new articles will somehow deter new editors from becoming involved, despite clear evidence that a larger base of content attracts more readers, and that more readers plus imperfect content leads to more editors.
What I find interesting is that few seem interested in keeping or improving the translated articles; Google's attempt to provide content in under-served languages is actually offending Wikimedians, despite our ostensible commitment to the same goal. Concerns like bureaucratic pre-approval, the use of free software, and so on are somehow more important than reaching more people with more content. It all seems strange and un-Wikimedian to me. Obviously there are things Google should have done differently. Maybe working with them to improve their process should be the focus here?
-- Ziko van Dijk, Netherlands