Hoi, An error rate is expected when people add content; this is true for any project. Given that the information is based on what other Wiktionaries offer, from a Wikimedia point of view, no new errors are introduced. What we do not have at the moment is a process where translations are shared in one database. This is possible at OmegaWiki, but that is outside of WMF.
From my perspective, it is a good effort; as with any project, it has its flaws.
Thanks, GerardM
On Thu, 17 Sep 2020 at 20:54, MF-Warburg mfwarburg@googlemail.com wrote:
FYI:
https://meta.wikimedia.org/wiki/Small_wiki_audit/Malagasy_Wiktionary
https://meta.wikimedia.org/wiki/Talk:Small_wiki_audit/Malagasy_Wiktionary
_______________________________________________
Langcom mailing list
Langcom@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/langcom
I am surprised you consider it "people add content", Gerard. It is explicitly *not* people adding content, but a bot using machine translation. Machine translation is problematic enough for just reading some text, but a machine-translated *dictionary* is literally worse than nothing. It is a travesty, and it is better to *not* offer dictionary entries in Malagasy than to offer machine-translated ones with zero human supervision.
I encourage this committee to consider whether it is beneficial to the mission to allow this to continue.
A.
Asaf Bartov (he/him/his)
Senior Program Officer, Emerging Wikimedia Communities
Wikimedia Foundation https://wikimediafoundation.org/
Imagine a world in which every single human being can freely share in the sum of all knowledge. Help us make it a reality! https://donate.wikimedia.org
Asaf, That is not how I understand it. First, I do not mind bots. When Wiktionaries have information on words in Malagasy, I am perfectly happy for the translations to be copied from one Wiktionary to another. When the descriptions are translated using machine translation, the question becomes only slightly different.
The question becomes about the quality of the machine translation. Now I do not mind key words in Malagasy without definitions. With dodgy translations it is ok because it is still better than providing nothing. When the quality of the machine translation is such that it is understandable but not quite there, I am of the opinion that it is much better than providing nothing.
The biggest problem I have with the notion of perfection is that it is the enemy of the good. The good is to provide the best we can offer. When it needs work, it is acceptable because it is a wiki.
The biggest problem with language support is that products that are perfectly functional, like Special:MediaSearch, are not promoted because "the next iteration will be even better". It also shows the extent to which we have moved away from our Wiki roots.
The notion that a bot operator is not people... really... Thanks, GerardM
"Perfect" is not good's only enemy, Gerard. "Terrible" is one as well. And this definitely seems on the terrible side.
Hoi, Wiktionary's true strength is not so much in its definitions; it is in the translations that exist for a concept. When you take a concept in any language, you have to have something to base it on. The words for a concept in another language are always, at best, similar to what was originally defined for a concept in the original language. So when you take the translations as used for a concept with a translated definition, you have something that is useful because it has its value against the labels, the words for that concept in all the other languages.
The point is that a decent machine translation gets there most of the time. Without it we offer nothing at all.
As a movement we are terrible at supporting other languages. We don't really. What we have is mostly a stamp collection and our support is that you can translate from English. When we want dictionary services in all our languages, we have to be smart about it. We are not smart about it; that has been our choice. When we have a tool like Commons with 64,236,643 freely usable media files, we have hidden it in English. Now that we can open up its use in other languages, we don't.
That is what is terrible. Thanks, GerardM
On Fri, 18 Sep 2020 at 21:17, Jon Harald Søby jhsoby@gmail.com wrote:
"Perfect" is not good's only enemy, Gerard. "Terrible" is one as well. And this definitely seems on the terrible side.
Gerard and others, hello and greetings from the new guy.
I have to object to an "anything is better than nothing" argument. Let's just say the bot accesses an article "fork" and takes the first definition. With some luck, something like "a tool for eating" will then be translated as the definition. That leaves out all other meanings of "fork", like when a road splits into two, but ok, that is a completely different thread of discussion. But if "a tool for eating" becomes the new lemma instead of the translated definition, that's when the entries start becoming unusable, especially if translated again and again.
The bot programmer's fallacy is that there are 1-on-1 equivalents in translation. Sometimes there are; more often there are not. Automated "translations" like the ones used in this case cannot pick up on one-to-many relations and cannot adequately post them. Another thing is the register of synonyms. We certainly do not want any curse words to be listed as the general term for certain body parts etc. This needs to be verified by people who speak both languages or at least can make sure the entry makes sense in the metalanguage (here Malagasy).
The review has shown that the output of these bot "translations" in Malagasy Wiktionary is not good. Some of them might be acceptable (by chance), but the majority must be considered questionable. The least that should be done is to mark them as unpatrolled bot translations and hope that some speaker can check the accuracy.
Greetings from Depok, Jan (Janwo)
Hoi, First, a specific spelling stands for an article. It can be in any language. Each lemma in Wiktionary has its own translations. So you can do without descriptions and still have meaningful information. Yes, that only works when you are at least bilingual.
When a bot moves data between Wiktionaries, these translations are valid because they have been moved from one Wiktionary to another.
What is sad is that this is not understood or considered as a valid resource. Thanks, GerardM
How are bad translations a valid resource?
https://en.wiktionary.org/wiki/valid#Translations
On Sun, 20 Sep 2020 at 17:39, MF-Warburg mfwarburg@googlemail.com wrote:
How are bad translations a valid resource?
Perhaps at this point the rest of the committee could share thoughts and move towards a decision.
I submit further attempts to point out the problem to Gerard would have diminishing returns.
A.
On Mon, 21 Sep 2020, 08:18 Gerard Meijssen gerard.meijssen@gmail.com wrote:
https://en.wiktionary.org/wiki/valid#Translations
On Sun, 20 Sep 2020 at 17:39, MF-Warburg mfwarburg@googlemail.com wrote:
How are bad translations a valid resource?
Am Sa., 19. Sept. 2020 um 12:01 Uhr schrieb Gerard Meijssen < gerard.meijssen@gmail.com>:
Hoi, First, a specific spelling stands for an article. It can be in any language. Each lemma in Wiktionary has its own translations. So you can do without descriptions and still have meaningful information. Yes, that only works when you are at least bilingual.
When a bot moves data between Wiktionaries, the validity of these translations exists because of it being moved from one Wiktionary to another.
What is sad is that this is not understood or considered as a valid resource. Thanks, GerardM
On Sat, 19 Sep 2020 at 11:53, Jan Wohlgemuth linguist@spamcop.net wrote:
Gerard and others, hello greetings from the new guy.
I have to object an "anything is better than nothing" argument. Let's just say the bot accesses an article "fork" and takes the first definition. With some luck, something like "a tool for eating" will then be translated as definition. That leaves out all other meanings of "fork", like when a road splits up into two, but ok, that is a completely different thread of discussion. But if "a tool for eating" becomes the new lemma instead of the translated definition, that's when the entries start becoming unusable, especially if translated again and again. The bot programmer's fallacy is that there are 1-on-1 equivalents in translation. Sometimes there are, more often there are not. Automated "translations" liek the ones used in this case can not pick up on one-to-many relations and can not adequately post them. Another thing is register of synonyms. We certainly do not want any curse words to be listed as the general term for certain body parts etc. This needs to be verified by people who speak both languages or at least can make sure the entry makes sense in the metalanguage (here Malagasy). The review has shown that the output of these bot "translations" in Malagasy Wiktionary are not good. Some of them might be acceptable (by chance), but the majority must be considered questionable. The least that should be done is mark them as unpatrolled bot translations and hope that some speaker can check the accuracy.
Greetings from Depok, Jan (Janwo)
Am 18.09.2020 22:12, schrieb Gerard Meijssen:
Asaf, That is not how I understand it. First, I do not mind bots. When Wiktionaries have information on words in Malagasy, I am perfectly happy for the translations to be copied from one Wiktionary to another. When the descriptions are translated using machine translation, the question becomes only slightly different.
The question becomes about the quality of the machine translation. Now I do not mind key words in Malagasy without definitions. With dodgy translations it is ok because it is still better than providing nothing. When the quality of the machine translation is such that it is understandable but not quite there, I am of the opinion that it is much better than providing nothing.
The biggest problem I have with the notion of perfection is that it is the enemy of the good. The good is to provide the best we can offer. When it needs work, it is acceptable because it is a wiki.
The biggest problem with language support is that products that are perfectly functional like Special:MediaSearch are not promoted because "the next iteration will be even better". It also shows the extend we have moved away from our Wiki roots.
The notion that a bot operator is not people... really... Thanks, GerardM
Langcom mailing list Langcom@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/langcom
Hoi, I understand the problem. The definitions are generated by machine translations. This is presented as an unredeemable issue and I disagree with that. I have provided arguments why this is not great but not something that makes it unusable. You do not need to agree with me and you are invited to convince me. But so far there has been a lack of arguments because it is thought to be obvious.
As you may know, I have quite a lot of experience with Wiktionary and with lexical work in a digital setting. Thanks, GerardM
On Mon, 21 Sep 2020 at 15:08, Asaf Bartov abartov@wikimedia.org wrote:
Perhaps at this point the rest of the committee could share thoughts and move towards a decision.
I submit further attempts to point out the problem to Gerard would have diminishing returns.
A.
On Mon, 21 Sep 2020, 08:18 Gerard Meijssen gerard.meijssen@gmail.com wrote:
https://en.wiktionary.org/wiki/valid#Translations
On Sun, 20 Sep 2020 at 17:39, MF-Warburg mfwarburg@googlemail.com wrote:
How are bad translations a valid resource?
On Sat, 19 Sep 2020 at 12:01, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, First, a specific spelling stands for an article. It can be in any language. Each lemma in Wiktionary has its own translations. So you can do without descriptions and still have meaningful information. Yes, that only works when you are at least bilingual.
When a bot moves data between Wiktionaries, the validity of these translations exists because of it being moved from one Wiktionary to another.
What is sad is that this is not understood or considered as a valid resource. Thanks, GerardM
My recommendation to Langcom is to participate in the laudable community discussion which aims at removing the erroneous pages.