Many of the volunteers I work with really like the "Content Translation" tool. Machine translation is only available for some languages; for many it is not an option at all.
Yes, peer review processes are needed, and yes, doing follow-up and inviting new people to our movement is a lot of work (I do a fair bit of it on EN WP with respect to educational efforts). Doing so, however, is important for our long-term existence.
James
On Tue, May 2, 2017 at 9:20 AM, John Erling Blad jeblad@gmail.com wrote:
Yes, I wonder if the extension for content translation should be turned off. Not because it is really bad, but because it allows creating translations that aren't quite good enough, and those translations create fierce internal fighting between contributors.
Some people use CT and make fairly good translations. Some are even excellent, especially some of those based on machine translations through the Apertium engine. Some are done manually and are usually fairly good, but those done with the Yandex engine are usually very poor. Sometimes it seems like the Yandex engine produces so many weird constructs that the translators simply give up, but sometimes it also seems like the most common errors simply pass through. I guess people simply get used to seeing those errors and no longer view them as "errors".
Brute-force solution: turn ContentTranslation off. A really stupid solution. The next option: turn the Yandex engine off. That would solve part of the problem, but it is still a rather lousy solution.
What about adding a language model that warns when the language constructs get too weird? It would act as a "test" for the translation: CT is used for creating the translation, and the language model is used for verifying whether the translation is good enough. If it does not validate against the language model, it should simply not be published to the main namespace. It would still be possible to create a draft, but then the user is fully aware that the translation isn't good enough.
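To make this concrete, here is a minimal sketch of what such a gate could look like, assuming nothing fancier than a bigram model with add-one smoothing trained on existing articles in the target language. All names and the threshold value are illustrative; this is not an existing ContentTranslation or MediaWiki API.

import math
from collections import Counter

# Hypothetical sketch, not an existing ContentTranslation API: gate publishing
# on the perplexity of a simple bigram language model trained on existing
# articles in the target language.

def train_bigram_model(corpus_sentences):
    # corpus_sentences: list of tokenised sentences (lists of strings)
    unigrams, bigrams = Counter(), Counter()
    for tokens in corpus_sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def perplexity(tokens, unigrams, bigrams):
    # Per-token perplexity with add-one smoothing; lower means more "ordinary"
    # language according to the reference corpus.
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(padded) - 1))

def may_publish(draft_tokens, unigrams, bigrams, threshold=500.0):
    # The threshold is purely illustrative: below it the draft may go to the
    # main namespace, above it the text stays a draft.
    return perplexity(draft_tokens, unigrams, bigrams) < threshold

A real deployment would of course use a much better model, but the shape of the check stays the same: score the draft, compare it against a threshold, and only then allow publishing.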
Such a language model should be available as a test for any article, as it can be used as a quality measure for the article. Strictly speaking it is a quantitative measure of how fluently the article reads rather than of its overall quality, but that distinction isn't quite so intuitive.
The measure could simply color-code the language constructs according to how common they are, for example with a white background for common constructs and a yellow background for really awful ones.
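As an illustration of that color coding, reusing the bigram counts from the sketch above (the cutoff of five occurrences is an arbitrary example value):

# Illustrative only: classify each bigram by how common it is in the reference
# corpus, so the interface could paint rare constructs yellow.

def colour_constructs(tokens, bigrams, rare_cutoff=5):
    padded = ["<s>"] + tokens + ["</s>"]
    coloured = []
    for prev, cur in zip(padded, padded[1:]):
        colour = "white" if bigrams[(prev, cur)] >= rare_cutoff else "yellow"
        coloured.append(((prev, cur), colour))
    return coloured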
It could also use hints from other measurements, like readability, confusion, and perplexity, and perhaps even such things as punctuation and markup.
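Readability, at least, can be approximated without any language model. As one example (a sketch, not something the tool provides today), the LIX readability index needs only word and sentence counts, so it works for many languages:

import re

# Sketch of one readability hint: the LIX index.
# LIX = words/sentences + 100 * long_words/words, long word = more than 6 letters.

def lix(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    if not sentences or not words:
        return 0.0
    long_words = [w for w in words if len(w) > 6]
    return len(words) / len(sentences) + 100.0 * len(long_words) / len(words)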
I believe users will get the idea pretty fast: only publish texts that are "white". It is a bit like tests for developers; they don't publish code that goes "red".