This is a follow-up to the discussions about Google Translate and Translator Toolkit.
One of the problems that quickly arises in discussions about it is that this software is not Free-as-in-Freedom. The Translator Toolkit website is not too complicated, so it's not very important whether it's Free or not, but the stored translations belong to Google and are used by Google to improve their non-Free services. I don't mind Google making money out of my translation efforts, but i am less happy about the fact that, unless i am missing something, the stored translated strings can only be read by Google. Sometimes i will actually want to give up my privacy and publish the sentence pairs to make them useful to researchers. (And if it is possible to require them to use it only in Free software, all the better.)
Is there a Free competitor to the Google Translator Toolkit in terms of online storage and sharing? I heard about OmegaT, but if i understand correctly, it is a local application that doesn't offer online storage and sharing - but correct me if i'm wrong. Are there any other Free-minded translation memory services?
-- אָמִיר אֱלִישָׁע אַהֲרוֹנִי Amir Elisha Aharoni
"We're living in pieces, I want to live in peace." - T. Moore
2010/7/29 Amir E. Aharoni amir.aharoni@mail.huji.ac.il:
Is there a Free competitor to the Google Translator Toolkit in terms of online storage and sharing? I heard about OmegaT, but if i understand correctly, it is a local application that doesn't offer online storage and sharing - but correct me if i'm wrong. Are there any other Free-minded translation memory services?
... Thinking out loud / replying to myself - translatewiki.net comes very close, but people are used to thinking of it as a tool for translating software messages and not for translating general texts. Maybe it can be adapted for that.
There is definitely a "free TM" project waiting to happen. It would be nice to see translatewiki [for instance] incorporate such a tool, but it may be a nontrivial amount of work.
I know of no general free translation memory that supports working online and sharing your own TM data, for more than very simple short strings.
Also nice would be the ability to track and categorize documents, so as to have different (combinable) TMs for different sources, and to categorize users, so that you can have different TMs for different groups of translators...
SJ
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
On Thursday 29 July 2010 10:38:20 Samuel Klein wrote:
There is definitely a "free TM" project waiting to happen. It would be nice to see translatewiki [for instance] incorporate such a tool, but it may be a nontrivial amount of work.
At Project Rastko there has for years been the idea of building something called the Global Translation Project, where volunteers could collaboratively translate texts in a manner somewhat similar to Distributed Proofreaders.
To give some detail: the idea is to first parse the original text with a rule-based machine translation engine (of course this should be free software with a free dictionary). The basic problem with these engines is that they are unable to resolve ambiguities in the text. A classic example is the sentence "Time flies like an arrow": does it mean that time is flying as fast as an arrow, or that there exist some insects called time flies (as there are fruit flies) which like some arrow? This often ends in a mistranslation.
The crux of the idea is that humans would resolve the ambiguities in this step. For example, the two possible meanings of the sentence above would be translated into another language as two completely different sentences. A human could then simply pick the correct one. After several people have done this for several independent languages, and their translations agree, the system would know the correct parsing of the original text. This parsing could then be translated fully automatically into a large number of languages, and the translations would very likely be close to correct.
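As a rough illustration of the agreement step just described, here is a minimal Python sketch. Everything in it is an assumption made for the example: the names `ParseVote` and `agreed_parse`, the parse identifiers, and the rule that at least three languages must have voted are not part of any existing system.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ParseVote:
    """One volunteer's choice of reading, made while checking one target language."""
    language: str   # target language the volunteer was working in (hypothetical codes)
    parse_id: str   # identifier of the reading the volunteer picked (hypothetical)


def agreed_parse(votes, min_languages=3):
    """Return the reading that every language settled on, or None.

    For each language the most frequently chosen reading is taken. The
    reading is accepted only when at least `min_languages` languages have
    voted and all of them arrive at the same reading.
    """
    per_language = {}
    for vote in votes:
        per_language.setdefault(vote.language, Counter())[vote.parse_id] += 1

    if len(per_language) < min_languages:
        return None  # not enough independent languages yet

    winners = {counts.most_common(1)[0][0] for counts in per_language.values()}
    return winners.pop() if len(winners) == 1 else None


votes = [
    ParseVote("sr", "time-passes-quickly"),
    ParseVote("sr", "time-passes-quickly"),
    ParseVote("ca", "time-passes-quickly"),
    ParseVote("pl", "time-passes-quickly"),
    ParseVote("pl", "insects-like-arrow"),
    ParseVote("pl", "time-passes-quickly"),
]
print(agreed_parse(votes))
```

A real system would also have to decide how to break ties within a language and how independent two languages need to be; the sketch ignores both questions.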
An offshoot of this is a crowdsourced dictionary project in GalaxyZoo style. Instead of doing battle with Wiktionary's interface or a similar one, volunteers could build a dictionary by solving various simple tasks (say, picking a word's gender, or verifying that a word is correctly declined); if a supermajority of the volunteers give the same answer, the word enters the dictionary.
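The supermajority rule could be sketched roughly as follows. The two-thirds threshold, the minimum of five answers and the function name `accepted_answer` are assumptions chosen for the example, not values anyone has proposed.

```python
from collections import Counter
from fractions import Fraction


def accepted_answer(answers, threshold=Fraction(2, 3), min_answers=5):
    """Return the answer given by a supermajority of volunteers, or None.

    `threshold` and `min_answers` are illustrative assumptions.
    """
    if len(answers) < min_answers:
        return None  # too few volunteers have answered so far
    answer, count = Counter(answers).most_common(1)[0]
    # Exact fractions avoid floating-point surprises right at the threshold.
    return answer if Fraction(count, len(answers)) >= threshold else None


# Five of six volunteers agree on the gender of a word: accepted.
print(accepted_answer(["feminine"] * 5 + ["masculine"]))
# An even split: no supermajority, so the word stays out of the dictionary.
print(accepted_answer(["feminine"] * 3 + ["masculine"] * 3))
```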
On Fri, 30 Jul 2010 23:22:00 +0200, Nikola Smolenski wrote:
On Thursday 29 July 2010 10:38:20 Samuel Klein wrote:
There is definitely a "free TM" project waiting to happen. It would be nice to see translatewiki [for instance] incorporate such a tool, but it may be a nontrivial amount of work.
At Project Rastko there has for years been the idea of building something called the Global Translation Project, where volunteers could collaboratively translate texts in a manner somewhat similar to Distributed Proofreaders.
To give some detail: the idea is to first parse the original text with a rule-based machine translation engine (of course this should be free software with a free dictionary).
Hi. I'm a contributor to Apertium (http://apertium.org), a Free Software RBMT system which... is exactly what you describe.
The basic problem with these engines is that they are unable to resolve ambiguities in the text. A classic example is the sentence "Time flies like an arrow": does it mean that time is flying as fast as an arrow, or that there exist some insects called time flies (as there are fruit flies) which like some arrow? This often ends in a mistranslation.
The crux of the idea is that humans would resolve the ambiguities in this step. For example, the two possible meanings of the sentence above would be translated into another language as two completely different sentences. A human could then simply pick the correct one. After several people have done this for several independent languages, and their translations agree, the system would know the correct parsing of the original text. This parsing could then be translated fully automatically into a large number of languages, and the translations would very likely be close to correct.
Apertium has a sister project, Tradubi (http://tradubi.com), which is developing exactly this.
Also, wikipedia.org comes very close, but it has been polluted by all those people who want to adapt the content a bit to their local situation to make the text easier to understand...
lodewijk
On Thu, Jul 29, 2010 at 10:29 AM, Amir E. Aharoni amir.aharoni@mail.huji.ac.il wrote:
2010/7/29 Amir E. Aharoni amir.aharoni@mail.huji.ac.il:
Is there a Free competitor to the Google Translator Toolkit in terms of online storage and sharing? I heard about OmegaT, but if i understand correctly, it is a local application that doesn't offer online storage and sharing - but correct me if i'm wrong. Are there any other Free-minded translation memory services?
... Thinking out loud / replying to myself - translatewiki.net comes very close, but people are used to thinking of it as a tool for translating software messages and not for translating general texts. Maybe it can be adapted for that.
Apertium: http://www.apertium.org/
Also, a project connected with OmegaWiki (Gerard, please, help! -- Omega<something> ), as well as OmegaWiki itself.
2010/7/29 Milos Rancic millosh@gmail.com:
On Thu, Jul 29, 2010 at 10:29 AM, Amir E. Aharoni amir.aharoni@mail.huji.ac.il wrote:
2010/7/29 Amir E. Aharoni amir.aharoni@mail.huji.ac.il:
Is there a Free competitor to the Google Translator Toolkit in terms of online storage and sharing? I heard about OmegaT, but if i understand correctly, it is a local application that doesn't offer online storage and sharing - but correct me if i'm wrong. Are there any other Free-minded translation memory services?
... Thinking out loud / replying to myself - translatewiki.net comes very close, but people are used to thinking of it as a tool for translating software messages and not for translating general texts. Maybe it can be adapted for that.
Apertium: http://www.apertium.org/
I know that Apertium is a Free translation engine originally centered around Catalan and Spanish and later extended to other languages. I tried to look for a translation memory storage service at its website and didn't find anything. So, unless i am missing something, this project is probably using translation memory internally, but i can't find a way to upload my pairs of translated texts there.
(Having studied Catalan pretty well, i really should take a better look at Apertium in any case.)
Hoi, Apertium provides machine translation. It does not have a translation memory. Thanks, GerardM
Gerard Meijssen <gerard.meijssen@...> writes:
Hoi, Apertium provides machine translation. It does not have a translation memory. Thanks, GerardM
However, see http://ur1.ca/0ykh1 on how you can combine the two.
best regards, Kevin Brubeck Unhammer
On Fri, 30 Jul 2010 13:49:21 +0300, Amir E. Aharoni wrote:
2010/7/29 Milos Rancic millosh@gmail.com:
On Thu, Jul 29, 2010 at 10:29 AM, Amir E. Aharoni amir.aharoni@mail.huji.ac.il wrote:
2010/7/29 Amir E. Aharoni amir.aharoni@mail.huji.ac.il:
Is there a Free competitor to the Google Translator Toolkit in terms of online storage and sharing? I heard about OmegaT, but if i understand correctly, it is a local application that doesn't offer online storage and sharing - but correct me if i'm wrong. Are there any other Free-minded translation memory services?
... Thinking out loud / replying to myself - translatewiki.net comes very close, but people are used to thinking of it as a tool for translating software messages and not for translating general texts. Maybe it can be adapted for that.
Apertium: http://www.apertium.org/
I know that Apertium is a Free translation engine originally centered around Catalan and Spanish and later extended to other languages.
Correct, though our range of languages is a lot larger and more diverse :)
FWIW, one of our most heavily-used language pairs is Norwegian Nynorsk-Bokmål, and a large portion of that use is by Wikipedia contributors.
I tried to look for a translation memory storage service at its website and didn't find anything. So, unless i am missing something, this project is probably using translation memory internally, but i can't find a way to upload my pairs of translated texts there.
Correct. We don't currently provide a way to add your own translation memory via the website -- the feature is available if you install the software locally (apt-get install apertium on Debian and Ubuntu), or via Tradubi (http://www.tradubi.com). A GSoC student is working on a web-based post-editing environment, so the feature may become available from the Apertium site in the future.
For the moment, if you want a web-based environment, with the ability to add and create your own translation memory, use Tradubi.
(Having studied Catalan pretty well, i really should take a better look at Apertium in any case.)
All contributions welcome :)
On Thu, 29 Jul 2010 11:29:41 +0300, Amir E. Aharoni wrote:
2010/7/29 Amir E. Aharoni amir.aharoni@mail.huji.ac.il:
Is there a Free competitor to the Google Translator Toolkit in terms of online storage and sharing? I heard about OmegaT, but if i understand correctly, it is a local application that doesn't offer online storage and sharing - but correct me if i'm wrong. Are there any other Free-minded translation memory services?
... Thinking out loud / replying to myself - translatewiki.net comes very close, but people are used to thinking of it as a tool for translating software messages and not for translating general texts. Maybe it can be adapted for that.
Open-Tran: http://open-tran.eu/ Is something like translatewiki. Software here: http://code.google.com/p/open-tran/ They also provide their databases for download.
For running your own server:
TinyTM: http://tinytm.sourceforge.net/
Translate Toolkit includes an XML-RPC based translation memory server.
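For readers who have not used a translation memory, the following toy Python sketch shows roughly what such a server does at its core: it stores sentence pairs and returns the closest earlier translations for a new sentence. It is not the API of TinyTM, Open-Tran or Translate Toolkit; the class `TranslationMemory`, its methods and the 0.7 similarity cut-off are all made up for the illustration, and the fuzzy matching simply uses `difflib` from the standard library.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """A toy in-memory translation memory (hypothetical, for illustration only)."""

    def __init__(self):
        # (source_lang, target_lang) -> list of (source_text, target_text)
        self._units = {}

    def add(self, source_lang, target_lang, source, target):
        """Store one translated sentence pair."""
        self._units.setdefault((source_lang, target_lang), []).append((source, target))

    def lookup(self, source_lang, target_lang, text, min_score=0.7):
        """Return (score, source, target) tuples for similar sentences, best first."""
        matches = []
        for source, target in self._units.get((source_lang, target_lang), []):
            score = SequenceMatcher(None, text.lower(), source.lower()).ratio()
            if score >= min_score:
                matches.append((score, source, target))
        return sorted(matches, key=lambda m: m[0], reverse=True)


tm = TranslationMemory()
tm.add("en", "ca", "The cat sleeps on the sofa.", "El gat dorm al sofà.")
tm.add("en", "ca", "Good morning.", "Bon dia.")

for score, source, target in tm.lookup("en", "ca", "The cat sleeps on the chair."):
    print(f"{score:.2f}  {source} -> {target}")
```

Sharing a memory like this between translators would additionally need persistent storage and an exchange format; none of that is attempted here.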
On Sat, Jul 31, 2010 at 7:47 PM, Jimmy O'Regan joregan@gmail.com wrote:
Open-Tran: http://open-tran.eu/ Is something like translatewiki. Software here: http://code.google.com/p/open-tran/ They also provide their databases for download. For running your own server: TinyTM: http://tinytm.sourceforge.net/ Translate Toolkit includes an XML-RPC based translation memory server.
The idea here is interesting as Google is using Wikipedia articles to improve its translation tools, which remain proprietary. If Wikimedia ran its own translation toolkit, would that change the paradigm a bit? Would that somehow compel Google to open up its translation data to us, as an exchange for using our content for its proprietary tools?
-SC
On 1 August 2010 04:08, stevertigo stvrtg@gmail.com wrote:
On Sat, Jul 31, 2010 at 7:47 PM, Jimmy O'Regan joregan@gmail.com wrote:
Open-Tran: http://open-tran.eu/ Is something like translatewiki. Software here: http://code.google.com/p/open-tran/ They also provide their databases for download. For running your own server: TinyTM: http://tinytm.sourceforge.net/ Translate Toolkit includes an XML-RPC based translation memory server.
The idea here is interesting as Google is using Wikipedia articles to improve its translation tools, which remain proprietary. If Wikimedia ran its own translation toolkit, would that change the paradigm a bit?
Probably. Same reason cloning reCaptcha would be a good idea.
Would that somehow compel Google to open up its translation data to us, as an exchange for using our content for its proprietary tools?
No.
- d.
Is there a Free competitor to the Google Translator Toolkit in terms of online storage and sharing? I heard about OmegaT, but if i understand correctly, it is a local application that doesn't offer online storage and sharing - but correct me if i'm wrong. Are there any other Free-minded translation memory services?
I found something here:
http://meta.wikimedia.org/wiki/Wikipedia_Machine_Translation_Project#Existin...
Add methods, products and projects
przykuta
Amir E. Aharoni, 29/07/2010 10:17:
Is there a Free competitor to the Google Translator Toolkit in terms of online storage and sharing?
I've added an entry to the http://strategy.wikimedia.org/wiki/List_of_things_that_need_to_be_free table; you could write a paragraph to elaborate a bit.
Nemo
2010/7/31 Federico Leva (Nemo) nemowiki@gmail.com:
Amir E. Aharoni, 29/07/2010 10:17:
Is there a Free competitor to the Google Translator Toolkit in terms of online storage and sharing?
I've added an entry to the http://strategy.wikimedia.org/wiki/List_of_things_that_need_to_be_free table; you could write a paragraph to elaborate a bit.
Thanks. I expanded it into a whole page: http://strategy.wikimedia.org/wiki/Proposal:Free_Translation_Memory .
wikimedia-l@lists.wikimedia.org