[ cross-posted to MediaWiki-i18n, Wikimedia-L and Wikitech-L ]
Dear Wikimedians,
The 2000th article that was written using the ContentTranslation extension was published today.
Article #2000 was translated from English to Greek, and it's about Škocjan Caves, a UNESCO World Heritage site in Slovenia.
Original: https://en.wikipedia.org/wiki/%C5%A0kocjan_Caves
Translated: https://el.wikipedia.org/wiki/%CE%A3%CF%80%CE%AE%CE%BB%CE%B1%CE%B9%CE%B1_%CF...
In case you're wondering what ContentTranslation is, here's a brief summary: ContentTranslation is an extension that helps Wikipedia editors to create articles quickly and easily by translating them from other languages. It's being developed by the Language Engineering team. Its design started in the summer of 2013 and its coding started in early 2014. You can find more info at https://www.mediawiki.org/wiki/CX as well as in the following blog posts:
* http://blog.wikimedia.org/2015/01/10/content-translation-beta-coming-soon/
* http://blog.wikimedia.org/2015/01/20/try-content-translation/
* http://blog.wikimedia.org/2015/04/06/content-translation-improved-my-edits/
* http://blog.wikimedia.org/2015/04/08/the-new-content-translation-tool/
Some more data about ContentTranslation:
* Our first deployment was in mid-January to Catalan, Spanish, Portuguese, Esperanto, Norwegian Bokmål, Danish, Indonesian and Malay. Now we support 43 languages, and this number is growing every week as we extend the deployment (a special thank-you to the Ops and Release Engineering people, who continuously and tirelessly support our deployment effort).
* In all the Wikipedias in which ContentTranslation is deployed, it is currently defined as a Beta feature, which means that it is only available to logged-in users who opted into it in the preferences.
* The 1000th article was written on April 10th, so it took much less time to get from 1000 to 2000 than to reach the first 1000.
* The language into which the most articles were translated is Catalan: 762. The Catalan Wikipedia community has always had a strong inclination to translation. It was the first one that volunteered to test the tool in labs in the summer of 2014 and provided a lot of useful feedback, and it also has good machine translation support thanks to the freely licensed Apertium engine.
* The second most popular target language is Spanish. It started slowly in the first couple of months, but it has been growing quickly since March.
* Other target languages that have been growing quickly lately are French, Portuguese and Ukrainian.
* The language from which the largest number of articles is translated is English. It is followed by Spanish, from which a lot of articles are translated to the closely related Portuguese and Catalan.
* The total number of people who published at least one translated article into any language is 663.
* Of more than 2000 articles that were created, about 60 were deleted, so we have reason to think that the quality of the created articles is pretty OK.
* In Catalan we see that ContentTranslation has some influence on the number of articles created per day - it was usually between 60 and 90 before 2015, and in January and February it was over 100. It's too early to say how it influences other languages, but we are optimistic ;)
* A community discussion about enabling the tool in the French Wikipedia ended with 50 "votes" in support of the tool and 0 "votes" against it ;)
Some of our plans for the coming months are:
* Enabling more languages, including big ones like English, Russian and Italian, as well as right-to-left languages.
* Improving the support for links.
* Creating support for smart suggestions of articles to translate, as well as "task lists" for translation projects.
* Starting to get the tool out of beta status :)
I'd like to thank all the Wikimedia volunteers around the planet who are participating in this effort by translating articles, translating the extension's user interface, testing the tool, assisting other Wikipedians to translate, organizing translation workshops, reporting useful bugs, submitting patches, and generally proving day after day what an incredible community they are - hard-working, massively multilingual, helpful, patient, creative and talented.
Thank you - we have a lot more to achieve together \o/
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
-Wikimedia-l
Amir, this is awesome. Glad to see it's taking off.
On Thu, Apr 30, 2015 at 6:24 AM Amir E. Aharoni <amir.aharoni@mail.huji.ac.il> wrote:
- In all the Wikipedias in which ContentTranslation is deployed, it is currently defined as a Beta feature, which means that it is only available to logged-in users who opted into it in the preferences.
Regarding Beta feature status: what would it take to enable this as a default? You mentioned this in plans for the coming months.
That deletion rate (60 out of 2000 = 3%?) looks actually a lot better than "pretty OK". According to historical stats, it's basically equivalent to deletion rates for article creators with more than a month of experience.[1]
It seems like the only risk in taking this out of beta status and moving to a default is UI clutter for monolingual users who can't ever make use of the feature? Maybe it's unobtrusive enough that you don't need to do this, but perhaps you could enable it as a default for only those users who have substantively edited more than one language edition of Wikipedia? Either that, or we could consider adding a "languages I edit in" section to the Internationalisation section of user preferences?
I'm sure you've thought about this before, but I'd love to hear more about the rollout plan.
1. http://meta.wikimedia.org/wiki/Research:Wikipedia_article_creation
Amir,
First of all, a big thank you, as a speaker of Catalan and a fervent advocate of minority languages.
OTOH, I would like to bring awareness to the topic of translation engines. Apertium is no longer supported as a GSoC project. I guess the project will stay alive, but it worries me that downstream we reap the benefits without considering that the upstream projects might need support too.
I wish that the conversation thread that was started long ago to support upstream projects would now also include open-source translation tools because, as your numbers show, they seem very relevant.
Best regards, Micru
On Thu, Apr 30, 2015 at 8:04 PM, Steven Walling steven.walling@gmail.com wrote:
-Wikimedia-l
Amir, this is awesome. Glad to see it's taking off.
On Thu, Apr 30, 2015 at 6:24 AM Amir E. Aharoni <amir.aharoni@mail.huji.ac.il> wrote:
- In all the Wikipedias in which ContentTranslation is deployed, it is currently defined as a Beta feature, which means that it is only available to logged-in users who opted into it in the preferences.
Regarding Beta feature status: what would it take to enable this as a default? You mentioned this in plans for the coming months.
That deletion rate (60 out of 2000 = 3%?) looks actually a lot better than "pretty OK". According to historical stats, it's basically equivalent to deletion rates for article creators with more than a month of experience.[1]
It seems like the only risk in taking this out of beta status and moving to a default is UI clutter for monolingual users who can't ever make use of the feature? Maybe it's unobtrusive enough that you don't need to do this, but perhaps you could enable it as a default for only those users who have substantively edited more than one language edition of Wikipedia? Either that, or we could consider adding a "languages I edit in" section to the Internationalisation section of user preferences?
I'm sure you've thought about this before, but I'd love to hear more about the rollout plan.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Hoi,
What does it take for the WMF to provide continued support for important tools like Apertium? If there is one movement that will benefit from the development of Apertium, it is ours.
Thanks,
GerardM
On Fri, May 1, 2015 at 9:57 AM, David Cuenca dacuetu@gmail.com wrote:
It worries me that downstream we reap the benefits without considering that the upstream projects might need support too.
I wish that the conversation thread that was started long ago to support upstream projects would now also include open-source translation tools
Bumped by the recent discussion of ContentTranslation: Is there an overview / umbrella for current "upstream support" issues? (Ditto for downstream support, for that matter?)
SJ
On Sat, Jun 6, 2015 at 10:08 AM, Samuel Klein meta.sj@gmail.com wrote:
Bumped by the recent discussion of ContentTranslation: Is there an overview / umbrella for current "upstream support" issues? (Ditto for downstream support, for that matter?)
AFAIK, the main lists are:
* https://www.mediawiki.org/wiki/Upstream_projects
* https://www.mediawiki.org/wiki/Developers/Maintainers
* https://wikitech.wikimedia.org/wiki/Key_Wikimedia_software_projects
If I understand the question correctly, it's a matter of the support that we should and can give to our machine translation providers. Currently there's only one, Apertium, but there may be more in the future.
The Language Engineering team members recently met Apertium developers at a machine translation conference in Turkey. We discussed what would be most useful for them, and clearly it is the resolution of https://phabricator.wikimedia.org/T95886 . This should happen very soon in any case. Some other ideas are around https://phabricator.wikimedia.org/T91748 .
If I'm allowed to dream for a moment, then if we're working with a Free Machine Translation provider, then we could send the MT engine developers feedback about the translations continuously, so that they would release new versions of the engine *daily* rather than every few months as it happens with Apertium now. It's technically conceivable, but will require a bit of engineering work from both sides.
And of course there's the elephant in the room, which is developing machine translation for more languages than what Apertium and other engines support today. It's a bit of a circular argument, but one of the best things to do to that end is simply to translate a lot of articles manually, and provide MT developers with parallel texts, the most important resource for MT engine development - and this really goes back to the aforementioned https://phabricator.wikimedia.org/T95886 .
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
Mediawiki-i18n mailing list Mediawiki-i18n@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-i18n