[Wikimediaindia-l] [Wikimedia-BD] Updates on Google translation project in Tamil Wikipedia

BalaSundaraRaman sundarbecse at yahoo.com
Fri Dec 3 04:56:35 UTC 2010


Hi Ragib, Belayet,

Good to see you here. :)

Shall I forward your email to the Google Translation team? If they wish so, they 
can take it from there.
What do you think?

Cheers,
Sundar
 "That language is an instrument of human reason, and not merely a medium for 
the expression of thought, is a truth generally admitted."
- George Boole, quoted in Iverson's Turing Award Lecture



----- Original Message ----
> From: Ragib Hasan <ragibhasan at gmail.com>
> To: Discussion list for Bangladeshi Wikimedians <wikimedia-bd at lists.wikimedia.org>
> Cc: wikimediaindia-l at lists.wikimedia.org
> Sent: Fri, December 3, 2010 10:01:15 AM
> Subject: Re: [Wikimediaindia-l] [Wikimedia-BD] Updates on Google translation project in Tamil Wikipedia
> 
> I agree with what Belayet mentioned about the Google-translated
> articles on Bengali Wikipedia. So far, we have not been contacted by
> Google directly. Rather, we have dealt with the paid contractors who
> were hired by Google. As Belayet has said, the paid translators do not
> follow up on their translations (except in a single case), which in
> turn makes it a lot of work for us to fix the articles.
> 
> Technically, we have not "banned" Google-translated articles or
> contributors who use GTT. Rather, what we have discouraged is the
> dump-and-run translators who just dump their malformed translations and
> never respond to our messages or make a second edit to the article
> to fix it. In most of the cases we dealt with recently, we haven't
> even deleted the articles ... rather, we moved them to user space,
> giving the user a chance to fix the article into readable Bengali.
> 
> So, basically, here are the issues:
> 
> 1. We'd love to work with Google if they collaborate with us and take
> responsibility for producing readable content.
> 
> 2. Which means translations can't be dump-and-run jobs. Since the
> translator toolkit is still horrible at English-to-Bengali
> translation, the Google team or their translators need to fix the
> pages to make them readable and grammatically correct.
> 
> 3. We can use the current system ... i.e., translators are free to do
> all of their sandboxing/translation experiments in their user space. We
> are very picky about the content that goes into the article space and
> don't want half-done, incorrect-language articles to go there. So,
> after a translated article has been approved by the community, it can
> be moved to the main space.
> 
> But to do any of the above, the Google team needs to contact us. We
> don't know who is or who isn't working for Google. Almost all the
> vendors/contractors we have dealt with so far used throwaway accounts that
> are used only once, and never again.
> 
> Bottom line: we are happy to work with Google, but only if Google does
> not bypass the existing Bengali Wikipedian community.
> 
> Thanks,
> 
> Ragib
> 
> User:Ragib on bn and en
> 
> 
> --
> Ragib Hasan, Ph.D
> NSF Computing Innovation Fellow and
> Assistant Research Scientist
> 
> Dept of Computer Science
> Johns Hopkins University
> 3400 N Charles Street
> Baltimore, MD 21218
> 
> Website:
> http://www.ragibhasan.com
> 
> 
> 
> On Thu, Dec 2, 2010 at 9:19 PM, Belayet Hossain <bellayet at gmail.com> wrote:
> > Ravi,
> > That's a nice process for dealing with the Google translation project. In Bengali
> > Wikipedia, if a translation is not of acceptable quality, the community also
> > shifts the content to the user namespace of that translator and asks them to
> > improve it. But there are very few cases where a translator rewrites or
> > retouches the article to improve it. Translators do not come back to take
> > care of their articles. So a lot of untouched, badly translated articles sit in
> > the user namespace at Bengali Wikipedia.
> >
> > And from my experience in Bengali Wikipedia, the translators are not
> > consistent in their translations. If you rate someone on
> > his first translation, you cannot be sure the second translation will be
> > of the same quality or better than the first one. So the community has to re-rate
> > him every time he posts an article.
> >
> > The translators are not regulars at Wikipedia and they are not
> > responsive, and there is no other contact point available for the community to
> > communicate with them. We can create a translation coordination page at the
> > local Wikipedia, but there is no way to inform the existing or new
> > translators to follow the page.
> >
> > I am very much interested to know how the Tamil community is communicating with
> > the translators at Google.
> >
> > Belayet
> >
> > On 3 December 2010 07:12, Shiju Alex <shijualexonline at gmail.com> wrote:
> >>
> >> Congrats to the Tamil community for trying to bring out a process for Google's
> >> Translation project.
> >>
> >> I really wonder how other language communities are handling this. Apart
> >> from Tamil, the Google Translation project is going on at least in Hindi,
> >> Kannada, and Telugu. It is banned in Bengali wiki.
> >>
> >> I can see that many articles are loaded onto the wikis each day. And for many of
> >> them the only contributor is the Google employee who translated it.
> >>
> >> Shiju
> >>
> >>
> >>
> >> On Thu, Dec 2, 2010 at 7:55 PM, Arjuna Rao Chavala
> >> <arjunaraoc at googlemail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Thanks a lot for the update.
> >>>
> >>> I think the updated process is similar to the open source community
> >>> philosophy when commercial companies (like IBM, Sun etc) contribute source
> >>> code. Tamil Wiki has that kind of rigor in quality checking and is able
> >>> to do a good job. Other Wikipedias may not be in a position to engage in a
> >>> similar way, due to policies and/or level of active wikipedians. One more
> >>> comment below.
> >>>
> >>> On Thu, Dec 2, 2010 at 7:19 PM, Ravishankar <ravidreams at gmail.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> Some updates on the Google translation project in Tamil Wikipedia.
> >>>> --snip--
> >>>>
> >>>> We did a quality review of these articles and found that only around 50%
> >>>> of them have an acceptable minimum quality regarding translation (we just
> >>>> rated the style of the language and accuracy in translation. We did not do a
> >>>> full review on the merit of the article).
> >>>> --snip--
> >>>
> >>> Can you elaborate more on the style? How did you measure the accuracy of
> >>> translation? It may be desirable to adapt the English articles for the
> >>> target wikipedia, rather than verbatim translation.
> >>> How much effort was spent to arrive at the above conclusions? # of
> >>> articles, # of reviewers, time frame etc would help.
> >>>
> >>> Thanks
> >>> Arjun
> >>>
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Wikimediaindia-l mailing list
> >>> Wikimediaindia-l at lists.wikimedia.org
> >>> https://lists.wikimedia.org/mailman/listinfo/wikimediaindia-l
> >>>
> >>
> >>
> >>
> >
> >
> >
> > --
> > Belayet Hossain
> > http://www.facebook.com/bellayet
> > http://twitter.com/bellayet
> > http://bellayet.wordpress.com (Bangla)
> > Knowledge is universal
> >               ...so share it.
> >
> > Hillel____
> > If I am not for myself, who will be for me?
> > If I am only for myself, what am I?
> > If not now, when?
> >
> > _______________________________________________
> > Wikimedia-BD mailing list
> > Wikimedia-BD at lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikimedia-bd
> >
> >
> 
> 
