On August 9, John Vandenberg wrote:
> On Sun, Aug 8, 2010 at 2:10 PM, Lars Aronsson <lars(a)aronsson.se> wrote:
>> Is there any good free software for aligning parallel texts and
>> extracting translations? Looking around, I found NAtools,
>> TagAligner, and Bitextor, but they require texts to be marked
>> up already. Are these the best and most modern tools available?
>
> there is a Mediawiki extension which is supposed to provide this:
> http://wikisource.org/wiki/Wikisource:DoubleWiki_Extension
> It is enabled on all wikisource subdomains.
> http://en.wikisource.org/wiki/Crito?match=el

This is a wonderful feature that I didn't know about until now,
but it is not what I'm looking for. In computational
linguistics and natural language processing (NLP), a "text
aligner" is a piece of software that identifies which words
and phrases in a text correspond to which words and phrases
in its translation. The input is a text together with its
translation, and the output is a bilingual dictionary.
It's like a more advanced "diff" tool.
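
To make the idea concrete, here is a toy sketch of that input/output
relationship. It is only an illustration, not how NAtools, TagAligner,
or Bitextor work internally. It uses a simplified IBM Model 1 style
expectation-maximization loop (no NULL word), and it assumes the text
has already been split into matching sentence pairs and tokens, which
is itself a nontrivial step. The names train_ibm1 and
extract_dictionary are made up for this example.

```python
from collections import defaultdict


def train_ibm1(pairs, iterations=20):
    """Estimate t[(f, e)] = P(source word f | target word e) with EM.

    pairs is a list of (source_tokens, target_tokens) sentence pairs.
    This is a simplified IBM Model 1: no NULL word, no smoothing.
    """
    src_vocab = {f for src, _ in pairs for f in src}
    # Start from a uniform distribution over source words.
    t = defaultdict(lambda: 1.0 / len(src_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of links into e
        for src, tgt in pairs:
            for f in src:
                # E-step: share one unit of f over the target words.
                norm = sum(t[(f, e)] for e in tgt)
                for e in tgt:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize the expected counts per target word.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t


def extract_dictionary(t):
    """For each source word, keep the target word with the highest score."""
    best = {}
    for (f, e), p in t.items():
        if f not in best or p > best[f][1]:
            best[f] = (e, p)
    return {f: e for f, (e, _) in best.items()}


# A tiny hand-made parallel corpus (German / English), for illustration only.
corpus = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]
dictionary = extract_dictionary(train_ibm1(corpus))
print(sorted(dictionary.items()))
```

On this three-sentence corpus the extracted dictionary should pair
each German word with its English counterpart, although no single
sentence pair says which word goes with which. A real corpus would
need far more data, plus handling for phrases, word order, and words
with no counterpart.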
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se