Generally machine translation creates nothing but a mess.
Swwiki has a good community, even if a small one: five active
contributors, and they are making progress. They also have a good
relationship with local media, as I heard from one of them.
They know how to write about what they want to describe. They may
translate some articles from elsewhere, but the idea that "English
Wikipedia can be transferred anywhere without the local community's
consent" is plainly nonsense.
If you disturb them without any negotiation with them, you will find
out how the global Wikimedia community can react to a public enemy,
or at least what the rage and fury of Britomartis will be.
Sincerely,
On 8/29/06, Jeffrey V. Merkey <jmerkey(a)wolfmountaingroup.com> wrote:
The first pass machine translation run of the English Wikipedia into the
Swahili Language has completed and is posted.
The translated XML dumps are being posted to:
http://sw.wikigadugi.org
They will continue to post throughout the night.
Lexicons can be downloaded from:
ftp://www.wikigadugi.org/africa/lexicon/swlexicon.public.bz2 - public Swahili lexicon
ftp://www.wikigadugi.org/africa/lexicon/swlexicon.kamusi.bz2 - Kamusi project lexicon
ftp://www.wikigadugi.org/africa/lexicon/sw.thesaurus.bz2 - Roget's thesaurus in Swahili
MediaWiki Messages Files:
ftp://www.wikigadugi.org/africa/MediWiki/MessagesSW.php.bz2
Machine Translated XML Dumps against the ewiki-20060817 XML Dumps from
the English Wikipedia:
ftp://www.wikigadugi.org/africa/xml/swphwiki-20060816-pages-articles.xml.bz2
This first run does NOT employ the verb stem decomposer and conjugator,
does NOT employ the grammar parser or sentence composer, does NOT
employ the AI inference engine, and does not perform verb or noun
disambiguation as the other machine translations do, because I have not
yet constructed a decomposition rule set or grammar rule set for the
translator. This first run uses simple word-by-word translation and
phrase matching with hierarchical thesaurus lookups and substitution.
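To make the idea concrete, here is a minimal sketch of what a
word-by-word pass with phrase matching and thesaurus fallback might
look like. It is not the actual translator code. The names LEXICON,
THESAURUS, lookup_word and translate are hypothetical, the tiny
dictionaries are toy entries chosen only for illustration, and real
lexicon files would have to be parsed into this shape first.

```python
# Toy lexicon (hypothetical entries): English phrase, as a tuple of
# words, mapped to a Swahili string.
LEXICON = {
    ("good", "morning"): "habari za asubuhi",
    ("water",): "maji",
    ("book",): "kitabu",
    ("big",): "kubwa",
}

# Toy hierarchical thesaurus (hypothetical): each word points to a
# more common synonym, which may itself point further up the chain.
THESAURUS = {"huge": "large", "large": "big"}

# Longest phrase, in words, that the lexicon contains.
MAX_PHRASE = max(len(key) for key in LEXICON)


def lookup_word(word):
    """Walk up the thesaurus chain until a lexicon entry is found."""
    seen = set()
    while word not in seen:
        if (word,) in LEXICON:
            return LEXICON[(word,)]
        seen.add(word)
        # Unknown words map to themselves, which ends the loop.
        word = THESAURUS.get(word, word)
    return None


def translate(text):
    """Greedy longest-phrase match, then single-word substitution."""
    tokens = text.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_PHRASE, len(tokens) - i), 0, -1):
            phrase = tuple(tokens[i:i + n])
            if phrase in LEXICON:
                out.append(LEXICON[phrase])
                i += n
                break
        else:
            # No phrase matched: fall back to the thesaurus chain and
            # leave the word untranslated if nothing is found.
            word = tokens[i]
            out.append(lookup_word(word) or word)
            i += 1
    return " ".join(out)


print(translate("good morning huge book"))
```

Note that a pass like this keeps the English word order and does no
agreement or conjugation, which is the gap the grammar rules are
meant to fill.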
This first pass is provided as an illustration of just how rapidly
Wikipedia can be translated into a target language. A Swahili grammar
manual has been overnighted to me, and later this week I will perform
another run with grammar and sentence parsing rules. Since I am not a
native speaker of Swahili, I request a native speaker to select 20 or
more very long articles and correct them. Once I have completed the
disambiguator and the grammar rule set for sentence construction, I
will use the corrected articles to teach the AI engine how to reorder
and retense the translations. This should get the translations over
90% accuracy. Unlike Cherokee, Swahili appears to be a much simpler
language for this task.
The machine translation of Swahili is a VERY early first run and is a
work in progress.
Jeffrey V. Merkey
_______________________________________________
foundation-l mailing list
foundation-l(a)wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/foundation-l
--
Kizu Naoko
Wikiquote:
http://wikiquote.org
* vivemus, mea Lesbia, amemus *