Hi Mark,
What I wonder about TM is, how does it work with languages with different structures?
It's quite obvious TM works well for Russian, Italian, Spanish, French, German, and other languages of similar structure. I heard it also works for Chinese, Japanese, Korean, Arabic, Farsi and Hebrew.
So my main questions are:
- Can it handle languages which don't separate words in writing?
Examples are Thai, Lao, Japanese, Chinese, and a number of smaller languages.
Yes - there are translators using Thai, Japanese and Chinese within OmegaT, and we also have people on the development team who work with at least one of these languages.
- Can it handle languages of all typological classifications? So far
I have seen it works well for isolating (such as Chinese, Vietnamese) and inflecting languages (such as Russian, Polish, Latin), but what about polysynthetic languages (such as Inuktitut, Turkish, Georgian, Adyghe, Abkhaz, Mohawk)? I would imagine it would be more difficult for these languages. For example, Western Greenlandic "Aliikusersuillammassuaanerartassagaluarpaalli." means "However, they will say that he is a great entertainer, but..." (for other long words like this, just look at the Greenlandic Wikipedia, kl.wp).
Well, OmegaT works with UTF-8, so most languages are supported. Some we would have to try out, and others might require special solutions. Basically, anything that can be written in UTF-8 should not create problems.
- Can it mass-process huge amounts of content quickly, to be reviewed
later by humans?
No - when talking about OmegaT we are not talking about machine translation, but about computer-assisted translation. That means a human translator re-uses translation memories from other projects, exchanged TMs, etc. While you translate, the glossary is checked and OmegaT shows you the matching entries in a separate window. If a sentence is identical or similar to a previously translated one, then, depending on your settings, OmegaT either just proposes the match in a separate window or overwrites the sentence to be translated with the full or partial match.
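To give a rough idea of what a full or partial match means, here is a minimal sketch in Python. It is not OmegaT's actual matching code: the sample memory, the `best_match` helper and the 0.7 threshold are assumptions made up for illustration, and the similarity score comes from Python's standard `difflib` module.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> stored translation.
TM = {
    "He was born in Rome.": "Er wurde in Rom geboren.",
    "She died in Paris.": "Sie starb in Paris.",
}

def best_match(segment, memory, threshold=0.7):
    """Return (score, source, target) for the most similar stored segment,
    or None if nothing reaches the threshold (an assumed setting)."""
    best = None
    for source, target in memory.items():
        # Character-based similarity between 0.0 and 1.0.
        score = SequenceMatcher(None, segment, source).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    if best is not None and best[0] >= threshold:
        return best
    return None

# A full match scores 1.0; a similar sentence gives a partial match
# that the translator still has to edit.
print(best_match("He was born in Rome.", TM))
print(best_match("He was born in Milan.", TM))
```

Because this sketch compares characters rather than words, it does not depend on spaces between words, but how well it serves a particular language would still have to be tried out.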
One feature I would very much like to see is assembling translations from portions, but this will only be up for discussion after it has been connected to Wiktionaryz, that is, when there is TBX support - it does not make sense to talk about this very specific and helpful feature before then.
The translation memory you are working with is only as good as you make it. The more you work with it, the better it becomes. That's basically it.
One thing that I also find very helpful: people who speak a language but are not native speakers can easily check how a word was translated before, in which context, etc. This can help a lot during the work and gives better results, so the proofreading effort for native speakers will be smaller.
With properly set up segmentation rules, for example, you can go through the people listed as born or died in the calendar quite fast, since the descriptions are quite repetitive.
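As an illustration of what a segmentation rule does, here is a small Python sketch. It does not use OmegaT's own rule format: the `BREAK` pattern, the `segment` helper and the sample text are assumptions for illustration only. The rule breaks after a full stop, exclamation mark or question mark when whitespace and then an uppercase letter or a digit follow.

```python
import re

# Hypothetical break rule: sentence-ending punctuation, then whitespace,
# then an uppercase letter or a digit starting the next segment.
BREAK = re.compile(r"(?<=[.!?])\s+(?=[A-Z0-9])")

def segment(text):
    """Split text into segments at the break rule, dropping empty pieces."""
    return [part.strip() for part in BREAK.split(text) if part.strip()]

# Illustrative calendar-style entries with a repetitive structure.
entries = (
    "1452 Leonardo da Vinci, Italian painter. "
    "1564 Galileo Galilei, Italian astronomer."
)

for seg in segment(entries):
    print(seg)
```

Each entry becomes its own segment, so entries that differ only in a name or a year are likely to come up as partial matches against segments translated earlier.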
Please note: I am having a meeting with a group of colleagues this week-end, and next week I am at the University of Pisa to give a presentation and a workshop - so if you write and need answers from me directly, please note it in the subject, since it could well be that I cannot read all posts then.
Have a great week-end!
Best, Sabine