Tamil Wikipedians seem to have worked on this. Has any other Indian-language Wikipedia experimented with it?

July 15th, 2010

Earlier today the folks over at Google provided an update on their progress using Translation Toolkit with volunteers and translators to increase the article count in smaller language versions of Wikipedia, including Arabic, Gujarati, Hindi, Kannada, Swahili, Tamil and Telugu. Google is a passionate believer in the need to translate and bring more high-quality text to less-represented languages on the web.

Michael Galvez, a Product Manager from Google, presented the recent findings of these efforts at this year’s Wikimania in Gdańsk, which wrapped up on Sunday, July 11.

From Michael’s post:

We believe that translation is key to our mission of making information useful to everyone. For example, Wikipedia is a phenomenal source of knowledge, especially for speakers of common languages such as English, German and French where there are hundreds of thousands—or millions—of articles available. For many smaller languages, however, Wikipedia doesn’t yet have anywhere near the same amount of content available.

Google is reporting an increase of about 16 million words so far due to the efforts of local volunteers and translators using the Translation Toolkit. In Hindi Wikipedia these efforts have resulted in an increase in size of about 20 per cent. Google continues to work directly with volunteers from these language projects, and to expand the capabilities of the Translation Toolkit in new languages.

A big thanks for the ongoing efforts of the volunteers and translators, and to Google for continuing to invest time and resources in this great translation system.

Jay Walsh, Communications

4 Responses to “Update on Translation Toolkit”

  1. contractors tax Says:

    Google’s translation work is amazing, world changing to a degree. I have friends in Hungary and Russia who write entire emails to me in their own languages. I can use Google Translate and their letters appear in highly readable English. And I can reply. And in Chrome it is possible to translate webpages on the fly. Amazing. Game changing. Thanks to Google, Wikipedia etc. for bringing me closer to my friends!
    Nathan C., UK

  2. Tobias (User:Church of emacs) Says:

    Great project!

    I don’t know much about this tool, so please forgive me if my question is stupid: Most “big” Wikipedia language versions are quite strict about copyright issues, e.g. respecting the license and attributing authors properly. Large efforts, like transwikiimport or importupload, are made to ensure that translated articles contain the history of the original page. So my question is: how does author attribution work with the Google toolkit? Are the authors of the original article properly attributed? Is the history imported?

    Seeing that this is a medium/large software project, one would expect that license issues are considered as well. On the other hand, Wikipedia versions evolve as they grow, and usually they develop an understanding for copyright issues in late parts of the project – so “write articles first, care about copyright later” is a valid argument. Additionally, GFDL was much stricter than CC-BY-SA, so the strictness of policies (which sometimes require a copy of the page history) might not be needed anymore; I’m not sure about that.

  3. A. Ravishankar Says:

    Please also see

    What happened on the Google Challenge @ the Swahili Wikipedia

    A Review on Google Translation project in Tamil Wikipedia

  4. Mayooresan Says:

    Google’s Toolkit is an amazing tool, but is it acceptable to allow Google to use Wikipedia as a testing platform for their project? Obviously they encourage people to translate using Toolkit because they want more “Translation Memory” in many languages. I personally believe we should not encourage such efforts; a GNU project and its volunteers should not be used for a proprietary product.

    The problem described by the Swahili-language Wikipedia is scary… Too much of anything is not good!