As some of you might know, I'm a software developer at Wikimedia
Deutschland, working on Wikidata. I'm currently focusing on improving
Wikidata's support for languages we as a team are not using on a daily
basis. As part of my work I stumbled upon a shortcoming in MediaWiki's
message system that – as far as I can see – prevents me from doing the
right thing(tm). I'm asking you to verify that the issue I see is indeed
an issue and that we want to fix it. Subsequently, I'm interested in
hearing your plans or goals for MediaWiki's message system so that I can
align my implementation with them. Finally, I am hoping to find someone
who is willing to help me fix it.
== The issue ==
On Wikidata, we regularly have content in different languages on the
same page. We use the HTML lang and dir attributes accordingly. For
example, we have a table with terms for an entity in different
languages. For missing terms, we would display a message in the UI
language within this table. The corresponding HTML (simplified) might
look like this:
<div id="mw-content-text" lang="UILANG" dir="UILANG_DIR">
  <tr class="entity-terms-for-OTHERLANG1" lang="OTHERLANG1" dir="OTHERLANG1_DIR">
    <div class="wb-empty" lang="UILANG" dir="UILANG_DIR">
      <!-- missing label message -->
    </div>
  </tr>
</div>
This works great as long as the missing label message is available in
the UI language. If that is not the case, though, the message is
translated according to the defined language fallbacks. In that case, we
might end up with something like this:
<div class="wb-empty" lang="arc" dir="rtl">No label defined</div>
That's obviously wrong, and I'd like to fix it.
== Fixing it ==
To fix this, I tried to make MessageCache provide the language a
message was taken from. That's not too straightforward to begin
with, but while working on it I realized that MessageCache is only
responsible for following the language fallback chain for database
translations. For file-based translations, the fallbacks are directly
merged in by LocalisationCache, so the information is not there anymore
at the time of translating a message. I see some ways to fix this:
* Don't merge messages in LocalisationCache, but perform the fallback on
request (possibly caching the result)
* Tag message strings in LocalisationCache with the language they are in
(sounds expensive to me)
* Tag message strings as being a fallback in LocalisationCache (that way
we could follow the fallback until we find a language in which the
message string is not tagged as being a fallback)
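
To make the first option more concrete, here is a minimal sketch of fallback on request. It is written in Python rather than MediaWiki's PHP and does not use any MediaWiki API. Everything in it is an assumption made for illustration: the MESSAGES, FALLBACKS and DIRECTIONS tables, the "label-empty" key, and the get_message and render_empty_label functions are hypothetical names. The point is only that an unmerged lookup can return the language a string actually came from, so the caller can set lang and dir correctly.

```python
from functools import lru_cache
from html import escape

# Hypothetical, unmerged per-language message tables (illustrative data only).
MESSAGES = {
    "en": {"label-empty": "No label defined"},
    "arc": {},  # no translation for "label-empty" in this language
}

# Hypothetical fallback chains and text directions.
FALLBACKS = {"arc": ["en"], "en": []}
DIRECTIONS = {"arc": "rtl", "en": "ltr"}


@lru_cache(maxsize=None)  # "possibly caching the result"
def get_message(key, lang):
    """Walk the fallback chain on request.

    Returns (text, source_lang), where source_lang is the language the
    string was actually found in, not the language that was asked for.
    """
    for candidate in [lang, *FALLBACKS.get(lang, [])]:
        text = MESSAGES.get(candidate, {}).get(key)
        if text is not None:
            return text, candidate
    raise KeyError(key)


def render_empty_label(ui_lang):
    """Render the placeholder with the lang/dir of the string's real language."""
    text, source_lang = get_message("label-empty", ui_lang)
    return '<div class="wb-empty" lang="{}" dir="{}">{}</div>'.format(
        source_lang, DIRECTIONS[source_lang], escape(text)
    )


print(render_empty_label("arc"))
# prints: <div class="wb-empty" lang="en" dir="ltr">No label defined</div>
```

The third option could reuse the same loop: instead of storing unmerged tables, each merged entry would carry a "this is a fallback" flag, and the lookup would continue along the chain until it reaches an entry without that flag.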
What do you think?
Adrian Heine né Lang
The Wikimedia Language team has been assembling monthly reports about
language support activities for one year. You can read the latest
report. Highlights for May include an edit summary field for
Special:Translate and the modernization of web font formats: woff2 is
in, eot is out.
Due to the nature of our work, the Language team (Amir, Kartik,
Pau, Runa, Santhosh, and myself) alone cannot adequately support all
the languages of the Wikimedia movement. That is why the report
includes work by volunteers. We have bolded the names of those who we
believe are contributing as volunteers.
This report focuses on technical activities. You won't find future
plans or high-level roadmap items in it. There is currently a major
omission: the i18n work of MediaWiki core itself. That is lacking
because it is more difficult to filter those activities and also
because we have not had much time for MediaWiki core i18n work.
To acknowledge the work of volunteers and to support them better, the
Language team released a statement of intent for code review about
six months ago. To summarize: we attempt to review patches not written
by us within a week, and patches that stall for three months without
updates after review will be abandoned, unless we feel they are worth
fixing.
When we released the statement, we also agreed to reduce the existing
backlog of open patches. The results so far are positive, even though
it is easy to find examples where we have not been able to follow our
intent. The Translate extension had 35 open patches when we started in
February; at the end of May it had only 12 open patches. Universal
Language Selector had gone from 10 to 6, with fewer of them unreviewed.
Content Translation had gone from 15 to zero. Our jQuery repositories
on GitHub have not fared as well, but we hope to achieve similar
results there in the future.
We excluded many repositories from the statement of intent in the fear
that we would add too much of a burden to ourselves. To our delight,
except for MediaWiki core i18n, all those repositories have had swift
reviews and I count only two open patches in them.
- Niklas (on behalf of the Language team)
A note on the open patch counts: the numbers change constantly. As of
2016-06-17 Translate has 23 open patches, but only 10 of them are not
from our team. Universal
Language Selector has 13 patches, 5 of them not from our team. Content
Translation currently has 6, one of them not from our team.