On Fri, Jul 26, 2013 at 3:25 PM, David Cuenca dacuetu@gmail.com wrote:
This is the preliminary draft:
https://meta.wikimedia.org/wiki/Collaborative_Machine_Translation_for_Wikipe...
The linked page says:
For this kind of project it is preferred to use a rule-based machine translation system (https://en.wikipedia.org/wiki/Rule-based_machine_translation), because total control is wanted over the whole process and minority languages should be accounted for (not that easy with statistical MT (https://en.wikipedia.org/wiki/Statistical_machine_translation), where parallel corpora may be non-existent).
This statement seems rather defeatist to me. Step one of a machine translation effort should be to provide tools for annotating parallel texts in the various wikis, and for editing and maintaining their parallelism. Once this is done, you have a substantial parallel corpus, which can then be used to grow the set of translated articles. That is, minority languages ought to be accounted for by progressively expanding the number of translated articles in their encyclopedias, as we do now; as that happens, the machine translation improves incrementally. If there is not enough of an editor community to translate articles, I don't see how you will succeed at the much more technically demanding task of writing rules for a rule-based translation system. The beauty of the statistical approach is that little special ability is needed. --scott
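To make the point concrete, here is a minimal sketch of how translation knowledge falls out of a parallel corpus with no hand-written rules: a few EM iterations of IBM Model 1 word alignment over a toy corpus. The sentence pairs below are made-up illustrations, not actual Wikipedia data.

```python
from collections import defaultdict

# Toy parallel corpus (hypothetical English-Spanish pairs, for illustration only).
corpus = [
    (["the", "house"], ["la", "casa"]),
    (["the", "book"], ["el", "libro"]),
    (["a", "book"], ["un", "libro"]),
]

# IBM Model 1: estimate word-translation probabilities t(f|e) purely from
# co-occurrence in aligned sentence pairs, via expectation-maximization.
src_vocab = {e for es, _ in corpus for e in es}
tgt_vocab = {f for _, fs in corpus for f in fs}
# Start from a uniform distribution; the data does all the work from here.
t = {f: {e: 1.0 / len(src_vocab) for e in src_vocab} for f in tgt_vocab}

for _ in range(20):                  # EM iterations
    count = defaultdict(float)       # expected counts c(f, e)
    total = defaultdict(float)       # expected counts c(e)
    for es, fs in corpus:
        for f in fs:
            norm = sum(t[f][e] for e in es)
            for e in es:             # E-step: fractional alignment counts
                delta = t[f][e] / norm
                count[(f, e)] += delta
                total[e] += delta
    for f in tgt_vocab:              # M-step: re-normalize the counts
        for e in src_vocab:
            t[f][e] = count[(f, e)] / total[e]

# "book" co-occurs with "libro" in every pair containing it, so EM
# concentrates probability mass there.
best = max(t["libro"], key=t["libro"].get)
print(best)  # → book
```

Scaling this beyond a toy requires real alignment models and far more data, which is exactly why the annotation tooling for parallel wiki texts should come first.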