Yes, I wonder if the extension for content translation should be turned
off. Not because it is really bad, but because it allows creating
translations that aren't quite good enough, and those translations create
fierce internal fighting between contributors.
Some people use CT and make fairly good translations. Some are even
excellent, especially some of those based on machine translations through
the Apertium engine. Some are done manually and are usually fairly good,
but those done with the Yandex engine are usually very poor. Sometimes it
seems like the Yandex engine produces so many weird constructs that the
translators simply give up, but sometimes it also seems like the most
common errors simply pass through. I guess people simply get used to seeing
those errors and do not view them as "errors" anymore.
Brute-force solution: turn ContentTranslation off. Really stupid
solution. The next solution: turn the Yandex engine off. That would solve
part of the problem. Kind of a lousy solution though.
What about adding a language model that warns when the language constructs
get too weird? It is like a "test" for the translation. CT is used for
creating a translation, but the language model is used for verifying that
the translation is good enough. If it does not validate against the
language model, it should simply not be published to the main namespace. It
will still be possible to create a draft, but then the user is fully aware
that the translation isn't good enough.
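To make the idea of validating against a language model a bit more
concrete, here is a minimal sketch. It assumes a tiny bigram model with
add-one smoothing trained on a reference corpus of known-good text. The
names BigramModel, avg_log_prob, may_publish and the threshold value are
all hypothetical and only for illustration; nothing like this exists in
ContentTranslation as far as I know, and a real model would need far more
data and a tuned threshold.

```python
from collections import Counter
import math
import re


def tokenize(text):
    # Crude word tokenizer; a real system would be language-aware.
    return re.findall(r"\w+", text.lower())


class BigramModel:
    """Hypothetical bigram model with add-one (Laplace) smoothing."""

    def __init__(self, reference_sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in reference_sentences:
            tokens = ["<s>"] + tokenize(sentence)
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        # +1 leaves room for words never seen in the reference text.
        self.vocab_size = len(self.unigrams) + 1

    def avg_log_prob(self, text):
        # Average smoothed log-probability per bigram; higher is "more normal".
        tokens = ["<s>"] + tokenize(text)
        pairs = list(zip(tokens, tokens[1:]))
        if not pairs:
            return 0.0
        total = 0.0
        for prev, word in pairs:
            numerator = self.bigrams[(prev, word)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total / len(pairs)


def may_publish(text, model, threshold):
    # Below the threshold the text can still be kept as a draft.
    return model.avg_log_prob(text) >= threshold
```

With a model built from a few ordinary sentences, a scrambled sentence
scores clearly lower than a normal one, so may_publish would hold it back
as a draft while letting the normal one through.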
Such a language model should be available as a test for any article, as it
can be used as a quality measure for the article. It is really a
quantitative measure of the well-spokenness of the article, but that isn't
quite so intuitive.
The measure could simply be to color code the language constructs by how
common they are, with a white background for common constructs and a
yellow one for really awful constructs.
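As a rough sketch of the color coding, assume a "construct" is just a word
pair (bigram) and "common" means it was seen at least min_count times in
some reference text. Both of those are my simplifying assumptions, and the
function names are made up. A real version would probably use a gradient
between white and yellow rather than two fixed colors.

```python
from collections import Counter
import re


def tokenize(text):
    # Crude word tokenizer; a real system would be language-aware.
    return re.findall(r"\w+", text.lower())


def bigram_counts(reference_sentences):
    # Count how often each word pair occurs in the reference text.
    counts = Counter()
    for sentence in reference_sentences:
        tokens = tokenize(sentence)
        counts.update(zip(tokens, tokens[1:]))
    return counts


def color_constructs(text, counts, min_count=1):
    # Label each word pair "white" (seen often enough) or "yellow" (rare).
    tokens = tokenize(text)
    return [
        (f"{a} {b}", "white" if counts[(a, b)] >= min_count else "yellow")
        for a, b in zip(tokens, tokens[1:])
    ]
```

An editor interface could then map each label to a background color for
the corresponding span of text.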
It could also use hints from other measurements, like readability,
confusion and perplexity. Perhaps even such things as punctuation and
markup.
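The extra hints could be very cheap to compute. The sketch below is only
an illustration under my own assumptions: average sentence length stands
in as a crude readability proxy, and a simple count of opening versus
closing brackets stands in for a markup check. It does not check bracket
order, and extra_hints is a made-up name.

```python
import re


def extra_hints(text):
    # Split on sentence-ending punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    avg_len = len(words) / len(sentences) if sentences else 0.0
    # Count-based check only: same number of opening and closing brackets.
    pairs = (("(", ")"), ("[", "]"), ("{", "}"))
    balanced = all(text.count(o) == text.count(c) for o, c in pairs)
    return {"avg_sentence_length": avg_len, "balanced_markup": balanced}
```

Such hints would not replace the language model, but they could be shown
next to it as additional warnings.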
I believe users will get the idea pretty fast; only publish texts that are
"white". It is a bit like tests for developers; they don't publish code
that goes "red".