Hi,
Going into rant mode...
CX (Content Translation) has been deployed on production wikis for at least 6 months, and after all this time most of the articles created with this tool still contain many syntax problems. Phabricator tasks were created months ago, and almost nothing seems to have been done to fix them.
So, is there any plan to deactivate CX until the major bugs are fixed (to stop the creation of damaged articles), or any plan to fix the bugs quickly? I tried posting this kind of question on the CX talk page months ago: the answer was that the bugs were almost fixed. Several months later, the situation is the same: the bugs are still here, along with some new ones...
Examples, taking the last 5 CX edits on frwiki:
- David Borwein
https://fr.wikipedia.org/w/index.php?title=David_Borwein&action=edit&oldid=119968179 : almost no problems, but because the editor translated only one sentence, the article is otherwise completely in English. Even with no edits, there is the basic problem of templates called with the {{Modèle:...}} prefix (equivalent to {{Template:...}} in English)
- Grande Riviere
https://fr.wikipedia.org/w/index.php?title=Grande_Riviere&action=edit&oldid=119965431 : many problems: template prefix, nowiki tags in bad places, several references with the same name and the same content duplicated (the goal of the name is to have the content once, not in every reference), whitespace included at the end of internal links (the reason for some of the nowiki tags), coordinates so badly handled that the result is several lines of span tags and complex code
- Jozef Gregor-Tajovsky
https://fr.wikipedia.org/w/index.php?title=Jozef_Gregor-Tajovsk%C3%BD&action=edit&oldid=119961320 : less than 500 bytes, but with the stub categories added directly instead of the templates that should add them
- Silva Semadeni
https://fr.wikipedia.org/w/index.php?title=Silva_Semadeni&action=edit&oldid=119959038 : trailing punctuation included in internal links, unnecessary div tags, internal links with only nowiki tags as the displayed text (so invisible links), unnecessary span tags, preceding whitespace included in internal links
- David Steel
https://fr.wikipedia.org/w/index.php?title=David_Steel&action=edit&oldid=119957633 : almost empty, but with internal CX data added
5 articles checked, not one correct.
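To make three of the problems listed above concrete (the redundant template prefix, whitespace inside internal links, and named references whose content is repeated), here is a minimal Python sketch of the kind of cleanup that currently has to be done by hand. It is not part of CX or Parsoid: the function names, the prefix list and the regular expressions are all assumptions made for illustration, and regexes like these will not cope with nested templates, nowiki sections or other wikitext edge cases.

```python
import re

# Assumed list of namespace prefixes that are redundant inside a template call.
TEMPLATE_PREFIXES = ("Modèle", "Template")

PREFIX_RE = re.compile(
    r"\{\{\s*(?:%s)\s*:\s*" % "|".join(TEMPLATE_PREFIXES), re.IGNORECASE
)
# Simple internal link: [[target]] or [[target|label]], no nesting.
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")
# Named reference with content: <ref name="x">...</ref>
REF_RE = re.compile(r'<ref\s+name\s*=\s*"([^"]+)"\s*>(.*?)</ref>', re.DOTALL)


def strip_template_prefix(wikitext):
    """Turn {{Modèle:Foo}} or {{Template:Foo}} into {{Foo}}."""
    return PREFIX_RE.sub("{{", wikitext)


def move_link_whitespace(wikitext):
    """Move whitespace at the edges of the displayed link text outside the link."""
    def fix(match):
        target, label = match.group(1), match.group(2)
        text = label if label is not None else target
        core = text.strip()
        if not core:
            return match.group(0)  # nothing visible: leave the link untouched
        lead = text[: len(text) - len(text.lstrip())]
        trail = text[len(text.rstrip()):]
        if label is None:
            return f"{lead}[[{core}]]{trail}"
        return f"{lead}[[{target.strip()}|{core}]]{trail}"
    return LINK_RE.sub(fix, wikitext)


def dedupe_named_refs(wikitext):
    """Keep the first full <ref name="x">, shorten identical repeats to <ref name="x" />."""
    seen = {}
    def fix(match):
        name, body = match.group(1), match.group(2)
        if seen.get(name) == body:
            return f'<ref name="{name}" />'
        seen.setdefault(name, body)
        return match.group(0)
    return REF_RE.sub(fix, wikitext)


if __name__ == "__main__":
    sample = 'La [[ville ]]est citée<ref name="a">Source</ref><ref name="a">Source</ref> {{Modèle:Ébauche}}'
    print(dedupe_named_refs(move_link_whitespace(strip_template_prefix(sample))))
```

References that share a name but have different content are left alone on purpose, since deciding which copy is correct needs a human.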
Nico
2015-10-29 10:47 GMT+02:00 Nicolas Vervelle nvervelle@gmail.com:
- David Borwein
https://fr.wikipedia.org/w/index.php?title=David_Borwein&action=edit&oldid=119968179 : almost no problems, but because the editor translated only one sentence, the article is otherwise completely in English. Even with no edits, there is the basic problem of templates called with the {{Modèle:...}} prefix (equivalent to {{Template:...}} in English)
Nicolas, AFAIK this does not happen by default. The user must click on the paragraph to have it moved to the translation. So this is hardly a CX issue.
Can't comment on the rest of the issues, but I can add one more: I've had several reports for ro.wp of "lost" translations. Unfortunately I don't have enough data to log a bug. They just said: "my translations were lost". Perhaps some performance issues with the server or some patchy Internet connections and not enough caching?
Strainu
On Thu, Oct 29, 2015 at 11:09 AM, Strainu strainu10@gmail.com wrote:
Can't comment on the rest of the issues, but I can add one more: I've had several reports for ro.wp of "lost" translations. Unfortunately I don't have enough data to log a bug. They just said: "my translations were lost". Perhaps some performance issues with the server or some patchy Internet connections and not enough caching?
Here is one example that may fall into this category: in this edit https://fr.wikipedia.org/w/index.php?title=Magnitude_AB&type=revision&diff=119959386&oldid=119958774 , made to fix some of the CX errors, the user's edit summary was "What's the point of displaying this in CX if it's not even included in the final result"...
Nico
On Thu, Oct 29, 2015 at 11:09 AM, Strainu strainu10@gmail.com wrote:
2015-10-29 10:47 GMT+02:00 Nicolas Vervelle nvervelle@gmail.com:
- David Borwein
https://fr.wikipedia.org/w/index.php?title=David_Borwein&action=edit&... : almost no problems, but because the editor translated only one sentence, the article is otherwise completely in English. Even with no edits, basic problem of templates called with the {{Modèle:...}} prefix (equivalent to {{Template:...}} in English)
Nicolas, AFAIK this does not happen by default. The user must click on the paragraph to have it moved to the translation. So this is hardly a CX issue.
Well, it seems to happen a lot: there has been only one more CX edit on frwiki since my first post, https://fr.wikipedia.org/w/index.php?title=Aradjamough&action=edit&o... and it's not translated at all... An ergonomic problem, then? But even without any translation it contains problems: a big mess with the coordinates, and the template prefix.
Nico
2015-10-29 12:28 GMT+02:00 Nicolas Vervelle nvervelle@gmail.com:
Well, it seems to happen a lot: there has been only one more CX edit on frwiki since my first post, https://fr.wikipedia.org/w/index.php?title=Aradjamough&action=edit&o... and it's not translated at all... An ergonomic problem, then?
Very likely: I see these are users with a bit of Wikipedia experience, so they might be used to the old way of translating, where one would take the whole text and then translate it.
Perhaps a warning on clicking "Publish translation" would help? Something like "This will publish your translation for everyone to see. If you simply plan to work on it later, the translation has been automatically saved."
A "Save draft" button would also be good.
Strainu
On Thu, Oct 29, 2015 at 3:58 PM, Nicolas Vervelle nvervelle@gmail.com wrote:
Well, it seems to happen a lot: there has been only one more CX edit on frwiki since my first post, https://fr.wikipedia.org/w/index.php?title=Aradjamough&action=edit&o... and it's not translated at all... An ergonomic problem, then? But even without any translation it contains problems: a big mess with the coordinates, and the template prefix.
Hello Nico,
Thanks again for bringing this up. Most of the tag- and syntax-related issues are already in Phabricator [1] and we are actively monitoring the remaining ones. We recently adjusted the configuration of our parsing engine to produce much cleaner wikitext [2]. This is a general improvement in this area, but there are still specific issues to be solved.
For our upcoming release cycle, stability and reliability will be the key areas we plan to work on. We have been collecting from the community the list of blocking bugs whose resolution will move the tool to its next level of maturity [3], and producing clean wikitext is also an important area as the tool gets exposed to a wider audience. For this reason, we highly appreciate all reports of tag-related failures, which make us aware of possible anomalies.
The tool is in active development and, despite the issues reported, we think that the overall balance of the content contributed is still positive. From our observation, users translating content with Content Translation normally edit the articles after creation to improve them, and the deletion ratio is much lower than that of articles created from scratch. The purpose of the tool is to facilitate the creation of those first versions, which are then improved over time. Disabling the tool would also impact the workflow of the many users who chose to use Content Translation because it saves them time.
[1] https://phabricator.wikimedia.org/T111155
[2] https://www.mediawiki.org/wiki/Parsoid/Normalizations#scrubWikitext
[3] https://phabricator.wikimedia.org/T102107
On Thu, Oct 29, 2015 at 3:39 PM, Strainu strainu10@gmail.com wrote:
Can't comment on the rest of the issues, but I can add one more: I've had several reports for ro.wp of "lost" translations. Unfortunately I don't have enough data to log a bug. They just said: "my translations were lost". Perhaps some performance issues with the server or some patchy Internet connections and not enough caching?
Strainu,
Thanks for mentioning the saving failures. They are currently one of the most intriguing problems that we are investigating. Please feel free to report the cases from Romanian Wikipedia in this ticket: https://phabricator.wikimedia.org/T116908 . The causes of the failures have been rather inconsistent, and an appropriate solution remains elusive at the moment. In the coming weeks, we hope to narrow things down through extensive logging and analysis.
Thanks,
Runa
On Fri, Oct 30, 2015 at 2:58 PM, Runa Bhattacharjee <rbhattacharjee@wikimedia.org> wrote:
The tool is in active development and despite the issues reported, we still think that the overall balance of the content contributed is still positive. From our observation, users translating content with Content Translation are normally editing the articles after creation to improve them, and the deletion ratio is much lower compared to that of articles created from scratch. The purpose of the tool is to facilitate the creation of those first versions to be later evolved. Disabling the tool would also impact the workflow of many users who chose to use Content Translation because it saves them time.
As has become usual with tools developed by WMF (VE, Flow, ...), I think you disregard, every time, the damage these tools do on the wikis, preferring to deploy them in alpha/beta versions to a wide audience before making them stable. I really think that's a big mistake. "The tool is in active development": I reported most of the problems in the early days of the CX release and none of them seem to have been fixed, so clearly the damage is not among your primary concerns. You prefer to deploy widely rather than take a more cautious approach and fix the major bugs before expanding the audience.
Why do some users now have a workflow built on a really buggy tool? Simply because of the approach you took... Given the number of articles I end up fixing myself, I can assure you that most CX users never fix the problems created by CX... So yes, it saves them time, at the expense of the time of other people who try to keep the encyclopedia clean. So, thanks for the extra work...
Nico