[ cross-posted to MediaWiki-i18n, Wikimedia-L and Wikitech-L ]
The 2000th article that was written using the ContentTranslation extension
was published today.
Article #2000 was translated from English to Greek, and it's about Škocjan
Caves, a UNESCO World Heritage site in Slovenia.
In case you're wondering what ContentTranslation is, here's a brief
summary: ContentTranslation is an extension that helps Wikipedia editors
create articles quickly and easily by translating them from other
languages. It's being developed by the Language Engineering team. Its
design started in the summer of 2013 and its coding started in early 2014.
You can find more info at https://www.mediawiki.org/wiki/CX as well as in
the following blog posts:
Some more data about ContentTranslation:
* Our first deployment was in mid-January to Catalan, Spanish, Portuguese,
Esperanto, Norwegian Bokmål, Danish, Indonesian and Malay. Now we support
43 languages, and this number is growing every week as we extend the
deployment (a special thank-you to the Ops and Release Engineering people,
who continuously and tirelessly support our deployment effort).
* In all the Wikipedias in which ContentTranslation is deployed, it is
currently defined as a Beta feature, which means that it is only available
to logged-in users who opted into it in the preferences.
* The 1000th article was written on April 10th, so it took much less time
to get to 2000 than to 1000.
* The language into which the most articles were translated is Catalan:
762. The Catalan Wikipedia community has always had a strong inclination
toward translation. It was the first one that volunteered to test the tool
in labs in the summer of 2014, and it provided a lot of useful feedback. It
also has good machine translation support thanks to the freely licensed Apertium
* The second most popular target language is Spanish. It started slowly in
the first couple of months, but it has been growing quickly since March.
* Other target languages that have been growing quickly lately are French,
Portuguese and Ukrainian.
* The language from which the largest number of articles is translated is
English. It is followed by Spanish, from which a lot of articles are
translated to the closely related Portuguese and Catalan.
* The total number of people who published at least one translated article
into any language is 663.
* Of more than 2000 articles that were created, about 60 were deleted, so
we have a reason to think that the quality of the created articles is
* In Catalan we see that ContentTranslation has some influence on the
number of articles created per day - it was usually between 60 and 90
before 2015, and in January and February it was over 100. It's too early
to say how it influences other languages, but we are optimistic ;)
* A community discussion about enabling the tool in the French Wikipedia
ended with 50 "votes" in support of the tool and 0 "votes" against it ;)
Some of our plans for the coming months are:
* Enabling more languages, including big ones like English, Russian and
Italian, as well as right-to-left languages.
* Improving the support for links.
* Creating support for smart suggestions of articles to translate, as well
as "task lists" for translation projects.
* Starting to get the tool out of beta status :)
I'd like to thank all the Wikimedia volunteers around the planet who are
participating in this effort by translating articles, translating the
extension's user interface, testing the tool, assisting other Wikipedians
to translate, organizing translation workshops, reporting useful bugs,
submitting patches, and generally proving day after day what an incredible
community they are - hard-working, massively multilingual, helpful,
patient, creative and talented.
Thank you - we have a lot more to achieve together \o/
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
“We're living in pieces,
I want to live in peace.” – T. Moore
CLDR 28 translations seem to be going well: some Wikimedians have already
submitted more than 1000 translations (and one more than 5000...).
Please join: http://st.unicode.org/cldr-apps/v
See the instructions to get an account and the screenshots/tutorial for
, I'm wondering what items to focus our translation on and how to reach
translators for them. Most reports from users are for the MediaWiki
timestamps and for missing names of the languages they care about. Where
are they, and how do we track progress?
are a bit too generic for us, comprehensive language names are not
covered by any column. Does "minimal" cover what we need in most cases?
«Contains names for the languages, scripts, and territories associated
with the language, numbering systems used in those languages, date and
The stats for "minimal" and "posix" coverage level for the MediaWiki
locales in CLDR 27, excluding those over 95%, are as follows:
Code     Native Name           Minimal%  Posix%
uz_Cyrl  Ўзбек (Кирил)         95%       96%
ksh      Kölsch                94%       100%
ks       کٲشُر                 91%       96%
fo       føroyskt              91%       98%
fur      furlan                91%       98%
bs_Cyrl  босански (Ћирилица)   91%       98%
gsw      Schwiizertüütsch      90%       96%
rm       rumantsch             89%       95%
be       беларуская            89%       98%
mt       Malti                 88%       95%
chr      ᏣᎳᎩ                   87%       98%
se       davvisámegiella       82%       87%
or       ଓଡ଼ିଆ                 81%       98%
ln       lingála               81%       98%
nn       nynorsk               81%       98%
sg       Sängö                 80%       96%
br       brezhoneg             80%       87%
ses      Koyraboro senni       79%       94%
ha       Hausa                 79%       98%
kab      Taqbaylit             79%       96%
ff       Pulaar                79%       96%
sah      саха тыла             79%       95%
ki       Gikuyu                78%       92%
tzm      Tamaziɣt              78%       96%
om       Oromoo                78%       87%
bm       bamanakan             77%       92%
lg       Luganda               77%       92%
sn       chiShona              77%       92%
mg       Malagasy              77%       92%
ti       ትግርኛ                  74%       89%
shi      ⵜⴰⵎⴰⵣⵉⵖⵜ              69%       94%
so       Soomaali              67%       92%
ak       Akan                  65%       96%
bo       བོད་སྐད་              62%       84%
qu       Runasimi              61%       87%
az_Cyrl  Азәрбајҹан (Кирил)    61%       91%
rn       Ikirundi              59%       96%
yo       Èdè Yorùbá            58%       92%
ps       پښتو                  57%       91%
ig       Igbo                  56%       91%
rw       Kinyarwanda           54%       91%
eo       esperanto             54%       87%
haw      ʻŌlelo Hawaiʻi        53%       87%
yi       ייִדיש                53%       87%
pa_Arab  پنجابی (عربی)         52%       91%
as       অসমীয়া               49%       78%
kok      कोंकणी                49%       78%
ii       ꆈꌠꉙ                   48%       58%
gv       Gaelg                 47%       74%
kw       kernewek              47%       74%
kl       kalaallisut           46%       79%
uz_Arab  اوزبیک (عربی)         45%       80%
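
In case it helps with picking locales to focus on, here is a minimal
Python sketch of how rows like the ones above could be parsed and
filtered by "minimal" coverage. The function names (parse_row,
lowest_minimal), the handful of sample rows and the threshold of 50 are
illustrative assumptions only; they are not part of CLDR or of any
existing tool.

```python
# Sample rows copied from the table above (a subset, for illustration).
ROWS = """\
uz_Cyrl Ўзбек (Кирил) 95% 96%
ksh Kölsch 94% 100%
se davvisámegiella 82% 87%
ii ꆈꌠꉙ 48% 58%
uz_Arab اوزبیک (عربی) 45% 80%
"""


def parse_row(line):
    """Split one row into (code, native name, minimal %, posix %)."""
    # The native name may contain spaces, so peel the two percentage
    # columns off the right first, then the locale code off the left.
    rest, minimal, posix = line.rsplit(None, 2)
    code, name = rest.split(None, 1)
    return code, name, int(minimal.rstrip("%")), int(posix.rstrip("%"))


def lowest_minimal(text, threshold):
    """Return locale codes whose minimal coverage is below `threshold`,
    ordered from lowest coverage to highest."""
    rows = [parse_row(line) for line in text.splitlines() if line.strip()]
    selected = [row for row in rows if row[2] < threshold]
    selected.sort(key=lambda row: row[2])
    return [row[0] for row in selected]


if __name__ == "__main__":
    # Hypothetical threshold: list locales under 50% minimal coverage.
    for code in lowest_minimal(ROWS, 50):
        print(code)
```

Splitting from the right keeps multi-word native names such as
"Ўзбек (Кирил)" intact, and it works whether the columns are separated
by one space or padded with several.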
A quick reminder about the Wikimedia Language Engineering team's IRC office
hour later today at 1430 UTC on #wikimedia-office. Please see below for
the original announcement, local time, and agenda. We will post logs on
metawiki after the event.
---------- Forwarded message ----------
From: Runa Bhattacharjee <rbhattacharjee(a)wikimedia.org>
Date: Thu, Apr 30, 2015 at 7:29 PM
Subject: [x-post] Next Language Engineering IRC Office Hour is on 5th May
2015 (Tuesday) at 1430 UTC
To: MediaWiki internationalisation <mediawiki-i18n(a)lists.wikimedia.org>,
Wikimedia developers <wikitech-l(a)lists.wikimedia.org>, Wikimedia Mailing
List <wikimedia-l(a)lists.wikimedia.org>, "Wikimedia & GLAM collaboration
The next IRC office hour of the Language Engineering team of the Wikimedia
Foundation will be on May 5, 2015 (Tuesday) at 1430 UTC on
#wikimedia-office. We missed a few of our regular monthly office hours, but
from May onwards we will be back on schedule.
There has been significant progress around Content Translation and it is
now available as a beta feature on several Wikipedias. We’d love to hear
comments, suggestions and any feedback that will help us make this tool
Please see below to check local time and event details. Questions can also
be sent to me ahead of the event.
Monthly IRC Office Hour:
# Date: May 5, 2015 (Tuesday)
# Time: 1430 UTC (Check local time:
# IRC channel: #wikimedia-office
Language Engineering - Outreach and QA Coordinator