I don't think it is useful to discuss projects and people; let's discuss processes and fixes instead.
On Wed, May 3, 2017 at 12:15 AM, Lodewijk lodewijk@effeietsanders.org wrote:
Hi John,
Could you provide a bit more context? From which language are you drawing these experiences? Did you consider filing a phabricator request for the technical component that can be improved (if so, could you link to it)? Could you also provide some links to these discussions that are causing the internal fighting you refer to?
I'd be curious to understand better what you're talking about before taking a position. Thanks!
Best, Lodewijk
2017-05-02 17:20 GMT+02:00 John Erling Blad jeblad@gmail.com:
Yes, I wonder if the extension for content translation should be turned off. Not because it is really bad, but because it allows creating translations that aren't quite good enough, and those translations create fierce internal fighting between contributors.
Some people use CT and make fairly good translations. Some are even excellent, especially some of those based on machine translations through the Apertium engine. Some are done manually and are usually fairly good, but those done with the Yandex engine are usually very poor. Sometimes it seems like the Yandex engine produces so many weird constructs that the translators simply give up, but sometimes it also seems like the most common errors simply pass through. I guess people simply get used to seeing those errors and no longer view them as "errors".
The brute-force solution: turn ContentTranslation off. Really stupid solution. The next solution: turn the Yandex engine off. That would solve part of the problem. Kind of a lousy solution though.
What about adding a language model that warns when the language constructs get too weird? It is like a "test" for the translation. CT is used for creating a translation, but the language model is used for verifying whether the translation is good enough. If it does not validate against the language model it should simply not be published to the main namespace. It will still be possible to create a draft, but then the user is completely aware that the translation isn't good enough.
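A rough sketch of what such a gate could look like (assuming a reference corpus of good prose in the target language; the bigram model, the smoothing and the threshold below are placeholders, not a real implementation):

import math
from collections import Counter

def train_bigram_model(corpus_sentences):
    # corpus_sentences: a list of token lists taken from known-good articles
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus_sentences:
        unigrams.update(sent)
        bigrams.update(zip(sent, sent[1:]))
    return unigrams, bigrams

def perplexity(tokens, unigrams, bigrams):
    # per-token perplexity with add-one smoothing
    vocab = len(unigrams)
    pairs = list(zip(tokens, tokens[1:]))
    log_prob = 0.0
    for w1, w2 in pairs:
        p = (bigrams[(w1, w2)] + 1) / (unigrams[w1] + vocab)
        log_prob += math.log(p)
    return math.exp(-log_prob / max(len(pairs), 1))

def may_publish(translation_tokens, unigrams, bigrams, threshold=500.0):
    # gate: publish to the main namespace only if the text looks "normal enough",
    # otherwise it stays a draft
    return perplexity(translation_tokens, unigrams, bigrams) < threshold

The threshold would obviously have to be tuned per language, and a real model would be better than a bigram counter, but the publish-or-draft decision itself is that simple.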
Such a language model should be available as a test for any article, as it can be used as a quality measure for the article. It is really a quantity measure for the well-spokenness of the article, but that isn't quite so intuitive.
The measure could simply be to color-code the language constructs by how common they are, with a white background for common constructs and yellow for really awful ones.
It could also use hints from other measurements, like readability, confusion and perplexity. Perhaps even such things as punctuation and markup.
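As a rough illustration of that color coding (reusing the bigram counts from the sketch above; the cut-off values are made up):

def color_for(bigram, bigrams, common_cutoff=50, rare_cutoff=5):
    count = bigrams[bigram]
    if count >= common_cutoff:
        return "white"        # common construct, no highlighting
    if count >= rare_cutoff:
        return "lightyellow"  # unusual, worth a second look
    return "yellow"           # very rare construct, probably awkward or wrong

def highlight(tokens, bigrams):
    # returns (construct, color) pairs the editor could render as background colors
    return [((w1, w2), color_for((w1, w2), bigrams))
            for w1, w2 in zip(tokens, tokens[1:])]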
I believe users will get the idea pretty fast: only publish texts that are "white". It is a bit like tests for developers; they don't publish code that goes "red".

_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe