I hosted a meeting at my home all day Saturday with the Dine Linguists and Tribal members on the Dine Machine Translation Project.
Similar to the Cherokee language, Dine, while it has over 100,000 native speakers, has very little written material available as tribal resources. The group was led by Ronnie Greymountain of the Dine Nation and several linguists. The following items are in progress and being completed by the Dine team and me:
1. Unicode templates for Dine Unicode characters:
Łł Ńń
Áá Éé Íí Óó Ąą Ęę Įį Ǫǫ Ą́ą́ Ę́ę́ Į́į́ Ǫ́ǫ́
Aa Ee Ii Oo
Bb Cc Dd Gg Hh Jj Kk Ll Mm Nn Ss Tt Ww Yy Zz

Unicode Hex Values for non-ASCII Navajo Letters

Letter            Precomposed  Composed
glottal stop      02BC         none
c unvoiced l      0141         none
s unvoiced l      0142         none
c high n          0143         004E+0301
s high n          0144         006E+0301
------------------------------------------------------------------------------
c high a          00C1         0041+0301
s high a          00E1         0061+0301
c high e          00C9         0045+0301
s high e          00E9         0065+0301
c high i          00CD         0049+0301
s high i          00ED         0069+0301
c high o          00D3         004F+0301
s high o          00F3         006F+0301
------------------------------------------------------------------------------
c nasal a         0104         0041+0328
s nasal a         0105         0061+0328
c nasal e         0118         0045+0328
s nasal e         0119         0065+0328
c nasal i         012E         0049+0328
s nasal i         012F         0069+0328
c nasal o         01EA         004F+0328
s nasal o         01EB         006F+0328
------------------------------------------------------------------------------
c high nasal a    none         0104+0301  00C1+0328  0041+0301+0328
s high nasal a    none         0105+0301  00E1+0328  0061+0301+0328
c high nasal e    none         0118+0301  00C9+0328  0045+0301+0328
s high nasal e    none         0119+0301  00E9+0328  0065+0301+0328
c high nasal i    none         012E+0301  00CD+0328  0049+0301+0328
s high nasal i    none         012F+0301  00ED+0328  0069+0301+0328
c high nasal o    none         01EA+0301  00D3+0328  004F+0301+0328
s high nasal o    none         01EB+0301  00F3+0328  006F+0301+0328
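Since the high nasal vowels have no precomposed code point and can be spelled three ways, it may help to check that the alternatives are canonically equivalent. The following is a minimal sketch using Python's standard unicodedata module; the helper name nfc_hex is made up for illustration and is not part of any project code.

```python
import unicodedata

def nfc_hex(sequence: str) -> str:
    """Return the NFC form of a string as '+'-joined 4-digit hex code points."""
    normalized = unicodedata.normalize("NFC", sequence)
    return "+".join(f"{ord(ch):04X}" for ch in normalized)

# The three spellings of capital high nasal A listed in the table
high_nasal_a = ["\u0104\u0301", "\u00C1\u0328", "\u0041\u0301\u0328"]

# All three collapse to a single normalized form
forms = {nfc_hex(s) for s in high_nasal_a}
print(forms)  # {'0104+0301'}

# A composed sequence that does have a precomposed code point
print(nfc_hex("\u004E\u0301"))  # 0143 (capital high N)
```

Normalizing stored text to one form (NFC here, as an assumption) would keep searches and comparisons from missing words that only differ in how the accents were typed.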
We are writing a <dine2text> and <text2dine> MediaWiki extension to allow simple text entry that renders Navajo Unicode characters (which are much simpler than the Cherokee syllabary). The specification agreed to is:
[char] = high unicode
(char) = nasal unicode
{char} = high nasal
(L/l) = unvoiced L/l
' = glottal stop
The "A" with a circle on top of the character is not part of the formal Unicode set for Dine and is not used in all dialects, so there is debate and continued research on whether it will be adopted. The Dine2Text MediaWiki extension will be completed next week and posted to meta.
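The bracket notation in the specification could be sketched roughly as follows. This is not the MediaWiki extension itself (which is not shown here); it is a Python illustration only. The function names text2dine and dine2text are borrowed from the tag names, reading "(L/l)" as "(L)" and "(l)" is an assumption, and so is the choice to emit NFC-normalized output.

```python
import re
import unicodedata

ACUTE = "\u0301"    # combining acute accent (high tone)
OGONEK = "\u0328"   # combining ogonek (nasal)
GLOTTAL = "\u02BC"  # glottal stop, per the hex table
VOWELS = "AEIOaeio"

def text2dine(text: str) -> str:
    """Convert the ASCII bracket notation to Navajo Unicode (assumed NFC output)."""
    # Unvoiced L/l first, so the generic nasal rule never sees "(L)" or "(l)"
    text = text.replace("(L)", "\u0141").replace("(l)", "\u0142")
    # {char} = high nasal, [char] = high (vowels and N/n), (char) = nasal
    text = re.sub(r"\{([%s])\}" % VOWELS, lambda m: m.group(1) + OGONEK + ACUTE, text)
    text = re.sub(r"\[([%sNn])\]" % VOWELS, lambda m: m.group(1) + ACUTE, text)
    text = re.sub(r"\(([%s])\)" % VOWELS, lambda m: m.group(1) + OGONEK, text)
    text = text.replace("'", GLOTTAL)
    return unicodedata.normalize("NFC", text)

def dine2text(text: str) -> str:
    """Convert Navajo Unicode back to the ASCII bracket notation."""
    # Decompose so every accented letter becomes base letter + combining marks
    text = unicodedata.normalize("NFD", text)
    text = re.sub(r"([%s])%s%s" % (VOWELS, OGONEK, ACUTE), r"{\1}", text)
    text = re.sub(r"([%sNn])%s" % (VOWELS, ACUTE), r"[\1]", text)
    text = re.sub(r"([%s])%s" % (VOWELS, OGONEK), r"(\1)", text)
    text = text.replace("\u0141", "(L)").replace("\u0142", "(l)")
    return text.replace(GLOTTAL, "'")

print(text2dine("Din[e] bizaad"))             # Diné bizaad
print(dine2text(text2dine("Din[e] bizaad")))  # Din[e] bizaad
```

This sketch treats every apostrophe as a glottal stop and every matching bracket pair as markup, so ordinary parentheses around a single vowel would be misread; a real extension would need some way to escape literal brackets.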
2. Wikipedia nv.wikipedia.org
This site was reviewed, and the tribal members all said they liked the attempt. As with Cherokee, the Dine name for the site drew some chuckles, as it is not a real word in Dine. The machine translation site is being set up at nv.wikigadugi.org, and Ronnie will be meeting with the tribal leaders after we get several good runs of the translator.
3. Lexicons and Grammar parser
The Dine team is working on the lexicons and grammar parser for Wikitrans and has committed to a first run in 30 days with completed lexicons and thesaurus. Dine is similar to Cherokee in word and sentence structure, though it is less structured than Cherokee.
We should have first runs of a completed Dine Wikipedia machine translation by mid to late October. The Dine people will be publishing these runs at Wikigadugi; then, after they receive tribal acceptance, we will start moving proofread content to the main Wikipedia site for the Dine language.
There are three main dialects of Dine, but the language drift is minimal in comparison to Cherokee and several other native languages, due to the stability of the large number of native speakers in NavajoLand.
Jeff
wikimedia-l@lists.wikimedia.org