I am crossposting this blog text, and also sending it to some people in
blind copy, since it involves various projects. As for Wikipedia: it is
about content creation for small wikipedias (that's why I changed the
original title above).
The original post is at:
http://sabinecretella.blogspot.com/2006/08/omegat-wiktionaryz-betawiki-some…
Your comments and thoughts are very much appreciated.
Best, Sabine
*****
OmegaT, WiktionaryZ, Betawiki ... some questions that need an
answer ...
In the Wiktionary IRC channel, Connel asked the following questions: "...
considers
omegat.org. Is the intent for it to just auto-upload stuff to
WZ? to/from ZW? Or betawiki, or both betawiki and WZ? Or is betawiki
just for WikiMedia total localization?"
That is a lot ... so let me go step by step.
The intent of OmegaT <http://en.wikipedia.org/wiki/OmegaT> is not to
auto-upload stuff to WiktionaryZ <http://wiktionaryz.org> or download it
from there. Nor is it only there for Betawiki
<http://nike.users.idler.fi/betawiki/Etusivu> and WiktionaryZ, even if
it will probably be used for both sooner or later. OmegaT is a CAT
<http://en.wikipedia.org/wiki/Computer-assisted_translation>-Tool that
helps translators to do their work.
What does this mean? Imagine that for all of your translations you use a
tool that creates a translation memory: a file containing the
translations you did, segmented into sentences, with each source
sentence paired with its target sentence. Then you do further
translations and let the CAT tool access these already translated files.
Now if your translation is on a subject you have already translated,
chances are high that most of the terminology you need is already in
there, and you can even see in which context it was used. So with OmegaT
you search your project and the available translation memories to see if
and how a term was already translated. This can help a lot.
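
To make the idea concrete, here is a minimal sketch in Python of such a
search over a translation memory. It is only an illustration and not how
OmegaT stores or searches its memories; the Segment class, the
search_memory function and the sample sentences are all invented for this
example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    source: str  # sentence in the source language
    target: str  # its stored translation


def search_memory(memory, term):
    """Return every stored segment whose source or target contains the term."""
    needle = term.lower()
    return [
        seg for seg in memory
        if needle in seg.source.lower() or needle in seg.target.lower()
    ]


# Hypothetical English -> Italian memory, invented for illustration.
memory = [
    Segment("Press the power button.", "Premere il pulsante di accensione."),
    Segment("The power supply is built in.", "L'alimentatore è integrato."),
    Segment("Close the cover.", "Chiudere il coperchio."),
]

# Show each hit together with its translation, i.e. the term in context.
for hit in search_memory(memory, "power"):
    print(hit.source, "->", hit.target)
```

Because each hit is a whole sentence pair, you see not only that a term
was translated before but also the context it was used in.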
Now consider a manual - of a machine, a computer, whatever. These
manuals need updates once a new version of that machine or computer is
produced. Normally companies then just update the description, and
parts of it remain the same as before (simply because the functionality
of these parts is still the same). When you then translate, you will
find these unchanged parts in your translation memory and, depending on
how you set your options, OmegaT either proposes the 100% match or
overwrites the translation part of your project with the already
existing translation. In this way you can save loads of time.
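
The two behaviours for 100% matches described above can be sketched
roughly as follows. This is a simplified illustration in Python, not
OmegaT's actual code: the pretranslate function, its overwrite option and
the sample sentences are assumptions made up for the example, and only
exact matches are handled.

```python
def pretranslate(segments, memory, overwrite=False):
    """Pair each source segment with a translation or a proposal.

    memory maps source sentences to their stored translations.
    Returns a list of (source, target, proposal) tuples.
    """
    result = []
    for source in segments:
        match = memory.get(source)  # exact (100%) match only
        if match is not None and overwrite:
            # Fill in the stored translation automatically.
            result.append((source, match, None))
        else:
            # Leave the target empty; offer the match (if any) as a proposal.
            result.append((source, "", match))
    return result


# Hypothetical memory built while translating the previous manual version.
memory = {
    "Press the power button.": "Premere il pulsante di accensione.",
    "Close the cover.": "Chiudere il coperchio.",
}

# The new manual version: one sentence is new, two are unchanged.
new_version = [
    "Press the power button.",
    "Connect the USB cable.",
    "Close the cover.",
]

for source, target, proposal in pretranslate(new_version, memory, overwrite=True):
    print(source, "|", target or "(to translate)")
```

With overwrite=True only the new sentence is left for the translator;
with the default setting the unchanged sentences come back as proposals
that still have to be accepted.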
With the right parser, the MediaWiki <http://mediawiki.org> UI could
also be translated in this way. Now, we will always have people who
translate things manually online and who will not use a CAT tool. This
means that OmegaT should be able to access the individual pages
containing the messages on Betawiki: you translate them on your computer
and store them to the page in the correct language version. This is
feasible.
Another use will be content creation for small wikipedias. Once we get
our wiki read/wiki write option within OmegaT, it will be possible to
start a translation of an article, let's say from the English wikipedia
<http://en.wikipedia.org>, and translate it into any language, let's say
for the Neapolitan wikipedia <http://nap.wikipedia.org>. This means you
tell OmegaT which page to get on en.wikipedia <http://en.wikipedia.org>
and which page to write on nap.wikipedia <http://nap.wikipedia.org>. The
same is valid for any African language. The advantage of this is that
people can work offline on translations when there is no online
connection. The translation memories resulting from these translations
should be stored somewhere (WiktionaryZ is already enabled to upload
translation memories) so that others can access and use them to
translate faster and with higher quality. Another aspect of doing things
this way is that proofreading a translation is easier, since you see the
source text above the translation for each sentence. This makes the job
a lot easier and raises the quality of the translated article.
Now to WiktionaryZ and OmegaT: OmegaT for now has quite a simple
glossary function - you create a tab-separated text file and put it into
your glossary directory. While you translate, OmegaT shows you the
translation proposals for the words that are present both in the current
sentence and in the glossary. Now imagine what it means if you connect
the glossary function to WiktionaryZ: the whole repository of data at
your fingertips. Of course, considering the mass of data that is online
in WiktionaryZ, it becomes very important to attribute domains to
terminology. Often a word can be translated in 20 ways or even more into
another language ... well, if you are doing a translation about medical
equipment, it does not make sense to get proposals from another domain,
let's say machinery. The possibilities from other domains should only be
proposed (showing that other domain) when there is no entry for medical
equipment.
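
The domain rule described above (prefer entries from the domain of the
current translation, fall back to other domains only when there is none)
can be sketched like this in Python. Note the assumptions: the third
"domain" column of the tab-separated file, the function names and the
sample terms are invented for this illustration; this is neither
OmegaT's glossary format nor an existing WiktionaryZ interface.

```python
import csv
import io

# Hypothetical tab-separated glossary: source term, target term, domain.
GLOSSARY_TSV = (
    "probe\tsonda\tmedical equipment\n"
    "probe\ttastatore\tmachinery\n"
    "bearing\tcuscinetto\tmachinery\n"
)


def load_glossary(text):
    """Map each source term to a list of (target, domain) pairs."""
    entries = {}
    for source, target, domain in csv.reader(io.StringIO(text), delimiter="\t"):
        entries.setdefault(source.lower(), []).append((target, domain))
    return entries


def proposals(sentence, glossary, domain):
    """Return glossary proposals for the words of one sentence.

    Entries from the requested domain win; entries from other domains
    are returned (with their domain shown) only if there is no such entry.
    """
    found = {}
    for word in sentence.lower().split():
        word = word.strip(".,;:!?")
        candidates = glossary.get(word)
        if not candidates:
            continue
        in_domain = [c for c in candidates if c[1] == domain]
        found[word] = in_domain or candidates
    return found


glossary = load_glossary(GLOSSARY_TSV)
print(proposals("Check the bearing and the probe.", glossary, "medical equipment"))
```

In this example "probe" yields only the medical equipment entry, while
"bearing" has no medical entry, so its machinery translation is proposed
together with the name of that domain.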
At this stage we don't have this domain structure for terminology on
WiktionaryZ and therefore the data, once we have loads of it online,
cannot be used - it would just create a huge mess and would be very time
consuming. So one of the things we really need asap is a domain
structure that the individual terms can be connected to - the sooner we
have it the better ... otherwise we will have to do loads of work two
or three times over, or WiktionaryZ could become completely useless
within OmegaT and as such would not be of any advantage for translators.
Not even for scientists, really ... imagine a biologist searching for
terminology and getting any kind of result ... including those from
machinery or whatever other domain.
Back to the use within OmegaT:
The next step is then: what if the term you search for is not in
WiktionaryZ? I already noticed this during my last translation - for now
it is too time consuming to add terms to WiktionaryZ and also Wiktionary
while you are translating - but it would make so much sense. So what is
planned in the reference implementation for a translation glossary
<http://meta.wikimedia.org/wiki/Reference_implementation_for_a_translation_glossary>
is that when working with OmegaT you get the possibility to add such a
term directly from there. You simply tell OmegaT to add it to
WiktionaryZ with your user ID, and you can attribute all the necessary
domains etc. without problems as well as tag the term as "definition
needs to be added". In that way WiktionaryZ will get quite a bunch of
very specific terminology over time.
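
Just to illustrate what such a submission would need to carry, here is a
small Python sketch. Everything in it is hypothetical: the
TermSubmission class, its field names and the JSON form are assumptions
for this example, since the actual interface is still to be defined in
the reference implementation.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TermSubmission:
    expression: str                   # the term found during translation
    language: str                     # language code of the term
    user_id: str                      # WiktionaryZ user adding the term
    domains: list = field(default_factory=list)
    needs_definition: bool = True     # tag: "definition needs to be added"


def to_payload(term):
    """Serialize a submission so it can be stored and sent later."""
    return json.dumps(asdict(term), ensure_ascii=False, sort_keys=True)


# Invented sample values, for illustration only.
term = TermSubmission(
    expression="sonda",
    language="it",
    user_id="ExampleUser",
    domains=["medical equipment"],
)
print(to_payload(term))
```

Keeping the submission as plain data also fits offline work: the terms
can be collected during the translation and uploaded once there is a
connection.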
Another use is OmegaT for language lessons - Connel, from en.wiktionary
<http://en.wiktionary.org>, thought about it and he is right: OmegaT
could be used for language learning as well ... what if we have a huge
sentence repository and people start to translate texts to study that
language? They would not need a paper dictionary - OmegaT would help
them to see the use of a word in various sentences and they would get
the terminology proposals just like translators do. Once back at school
or university (or maybe also online with a language teacher) they can
understand their errors and update WiktionaryZ and the online sentence
repository.
For exams teachers would have a mass of proposals and they could
determine which glossary group shall be included in the exams ... that
is to be thought about ... it was not considered up to now even if there
are already thoughts on how to use WiktionaryZ for language learning.
Did I miss something? Hmmm ... not sure. Well if you have questions:
just ask :-)