[Foundation-l] Meeting with Navajo (Dine) Tribal Members on Navajo Machine Translation Project

Jeffrey V. Merkey jmerkey at wolfmountaingroup.com
Mon Sep 18 07:33:37 UTC 2006


I hosted a meeting at my home all day Saturday with the Dine Linguists 
and Tribal members on the Dine Machine Translation Project.

Similar to the Cherokee language, Dine, while it has over 100,000 
native speakers, has very little written material available as tribal
resources. The group was led by Ronnie Greymountain of the Dine Nation 
and several linguists. The following items are in process
and being completed by myself and the Dine team:

1. Unicode templates for Dine Unicode characters:

TEXT

Łł Ńń

Áá Éé Íí Óó Ąą Ęę Įį Ǫǫ Ą́ą́ Ę́ę́ Į́į́ Ǫ́ǫ́

Aa Ee Ii Oo

Bb Cc Dd Gg Hh Jj Kk Ll Mm Nn Ss Tt Ww Yy Zz

Unicode Hex Values for non-ASCII Navajo Letters
(c = capital, s = small)

Letter              Precomposed  Composed

glottal stop        02BC         none

c unvoiced l        0141         none
s unvoiced l        0142         none

c high n            0143         004E+0301
s high n            0144         006E+0301

------------------------------------------------------------------------------

c high a            00C1         0041+0301
s high a            00E1         0061+0301

c high e            00C9         0045+0301
s high e            00E9         0065+0301

c high i            00CD         0049+0301
s high i            00ED         0069+0301

c high o            00D3         004F+0301
s high o            00F3         006F+0301

------------------------------------------------------------------------------

c nasal a           0104         0041+0328
s nasal a           0105         0061+0328

c nasal e           0118         0045+0328
s nasal e           0119         0065+0328

c nasal i           012E         0049+0328
s nasal i           012F         0069+0328

c nasal o           01EA         004F+0328
s nasal o           01EB         006F+0328

------------------------------------------------------------------------------

c high nasal a      none         0104+0301  00C1+0328  0041+0301+0328
s high nasal a      none         0105+0301  00E1+0328  0061+0301+0328

c high nasal e      none         0118+0301  00C9+0328  0045+0301+0328
s high nasal e      none         0119+0301  00E9+0328  0065+0301+0328

c high nasal i      none         012E+0301  00CD+0328  0049+0301+0328
s high nasal i      none         012F+0301  00ED+0328  0069+0301+0328

c high nasal o      none         01EA+0301  00D3+0328  004F+0301+0328
s high nasal o      none         01EB+0301  00F3+0328  006F+0301+0328
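
Because the high nasal vowels have no precomposed code point and can be
spelled three different ways, any software comparing or searching Dine
text will probably want to normalize first. The following is a minimal
sketch in Python using the standard unicodedata module; the helper names
(nfc, codepoints) are only illustrative and are not part of any planned
extension.

```python
import unicodedata


def nfc(s):
    """Return the NFC (canonical composed) form of a string."""
    return unicodedata.normalize("NFC", s)


def codepoints(s):
    """Show a string as space-separated hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in s)


# The three spellings of small high nasal a listed in the table above.
high_nasal_a = [
    "\u0105\u0301",        # nasal a + combining acute
    "\u00e1\u0328",        # high a + combining ogonek
    "\u0061\u0301\u0328",  # a + combining acute + combining ogonek
]

# After normalization all three spellings become the same sequence.
normalized = {nfc(s) for s in high_nasal_a}

for s in high_nasal_a:
    print(codepoints(s), "->", codepoints(nfc(s)))
```

Normalizing on input keeps one spelling per letter in stored text, so
searches and sorting do not depend on how the character was typed.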


We are writing a MediaWiki extension with <dine2text> and <text2dine> 
tags to allow simple text entry that renders as Navajo Unicode characters
(which are much simpler than the Cherokee Syllabary). The specification 
agreed to is:

[char] = high unicode
(char) = nasal unicode
{char} = high nasal
(L/l) = unvoiced L/l
' = glottal stop

The "A" with a circle on top of the character is not part of the formal 
Unicode set for Dine and is not generally used in all dialects, so there 
is debate and continued research on whether it will be adopted. The 
Dine2Text MediaWiki extension will be completed next week and posted to 
meta.
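
To illustrate the entry specification above, here is a rough sketch of
the text-to-Dine direction in Python. It is not the extension itself
(which has not been posted yet); the function name text2dine, the regular
expression, and the choice of which letters are accepted inside each
bracket type are assumptions made for the example.

```python
import re
import unicodedata

ACUTE = "\u0301"    # combining acute accent (high tone)
OGONEK = "\u0328"   # combining ogonek (nasal)
GLOTTAL = "\u02bc"  # modifier letter apostrophe (glottal stop)

# One pattern for every token in the specification:
#   [char] high, (char) nasal or unvoiced L/l, {char} high nasal, ' glottal stop
_TOKEN = re.compile(
    r"\[([AaEeIiOoNn])\]|\(([AaEeIiOoLl])\)|\{([AaEeIiOo])\}|'"
)


def _convert(match):
    high, paren, high_nasal = match.groups()
    if high:
        return unicodedata.normalize("NFC", high + ACUTE)
    if paren:
        if paren == "L":
            return "\u0141"  # capital unvoiced l
        if paren == "l":
            return "\u0142"  # small unvoiced l
        return unicodedata.normalize("NFC", paren + OGONEK)
    if high_nasal:
        return unicodedata.normalize("NFC", high_nasal + OGONEK + ACUTE)
    return GLOTTAL  # the only remaining alternative is the apostrophe


def text2dine(text):
    """Replace the ASCII entry tokens with Navajo Unicode characters."""
    return _TOKEN.sub(_convert, text)


print(text2dine("Din[e]"))
print(text2dine("(l)[i][i]'"))
```

Since wiki markup already uses brackets and apostrophes for links and
emphasis, a real extension would presumably apply this conversion only to
text inside its own tags rather than to the whole page.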

2. Wikipedia nv.wikipedia.org

This site was reviewed and the tribal members all said they liked the 
attempt. As with Cherokee, the Dine name for the site drew some chuckles 
as it is not a real word in Dine. The machine translation site is being 
set up at nv.wikigadugi.org and Ronnie will be meeting with the tribal 
leaders after we get several good runs of the translator.

3. Lexicons and Grammar parser

The Dine team is working on the lexicons and grammar parser for 
Wikitrans and has committed to a first run in 30 days with completed 
lexicons and thesaurus. Dine is similar to Cherokee in word and 
sentence structure, though it is less structured than Cherokee.

We should have first runs of a completed Dine Wikipedia machine 
translation by mid to late October. The Dine people will be publishing these
runs at Wikigadugi, then after they receive tribal acceptance, we will 
start moving proofread content to the main Wikipedia site for the Dine 
language.

There are three main dialects of Dine, but the language drift is minimal 
in comparison to Cherokee and several other native languages due to the 
stability of the large number of native speakers in NavajoLand.

Jeff





