As some of you might know, I'm a software developer at Wikimedia
Deutschland, working on Wikidata. I'm currently focusing on improving
Wikidata's support for languages we as a team are not using on a daily
basis. As part of my work I stumbled over a shortcoming in MediaWiki's
message system that – as far as I see it – prevents me from doing the
right thing(tm). I'm asking you to verify that the issue I see is indeed
an issue and that we want to fix it. Beyond that, I'm interested in
hearing your plans or goals for MediaWiki's message system so that I can
align my implementation with them. Finally, I am hoping to find someone
who is willing to help me fix it.
== The issue ==
On Wikidata, we regularly have content in different languages on the
same page. We use the HTML lang and dir attributes accordingly. For
example, we have a table with terms for an entity in different
languages. For missing terms, we would display a message in the UI
language within this table. The corresponding HTML (simplified) might
look like this:
<div id="mw-content-text" lang="UILANG" dir="UILANG_DIR">
  <tr class="entity-terms-for-OTHERLANG1" lang="OTHERLANG1">
    <div class="wb-empty" lang="UILANG" dir="UILANG_DIR">
      <!-- missing label message -->
    </div>
  </tr>
</div>
This works great as long as the missing label message is available in
the UI language. If that is not the case, though, the message is
translated according to the defined language fallbacks. In that case, we
might end up with something like this:
<div class="wb-empty" lang="arc" dir="rtl">No label defined</div>
That's obviously wrong, and I'd like to fix it.
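
To make the desired behaviour concrete, here is a minimal Python sketch (not MediaWiki code). All names in it (MESSAGES, FALLBACKS, RTL_LANGUAGES, resolve_message, render_empty_label, the message key "missing-label") are hypothetical, and the fallback chain for "arc" is assumed to be just "en" for illustration. The point is only that the lookup reports the language the string was actually found in, so the wrapper can carry the matching lang and dir:

```python
from html import escape

# Hypothetical per-language message stores; only English defines the message.
MESSAGES = {
    "en": {"missing-label": "No label defined"},
    "arc": {},
}
# Assumed fallback chains and RTL set, for illustration only.
FALLBACKS = {"arc": ["en"], "en": []}
RTL_LANGUAGES = {"arc", "ar", "he"}


def resolve_message(key, ui_lang):
    """Return (text, source_lang) by walking the fallback chain."""
    for lang in [ui_lang, *FALLBACKS.get(ui_lang, [])]:
        text = MESSAGES.get(lang, {}).get(key)
        if text is not None:
            return text, lang
    raise KeyError(key)


def render_empty_label(key, ui_lang):
    """Mark up the message with the language it is really written in."""
    text, source_lang = resolve_message(key, ui_lang)
    direction = "rtl" if source_lang in RTL_LANGUAGES else "ltr"
    return (
        f'<div class="wb-empty" lang="{source_lang}" dir="{direction}">'
        f"{escape(text)}</div>"
    )


print(render_empty_label("missing-label", "arc"))
# <div class="wb-empty" lang="en" dir="ltr">No label defined</div>
```

With the UI language set to "arc" and no Aramaic translation available, the element is tagged as English and left-to-right instead of inheriting the UI language's attributes.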
== Fixing it ==
To fix this, I tried to make MessageCache provide the language a
message was taken from. That's not entirely straightforward to begin
with, but while working on it I realized that MessageCache is only
responsible for following the language fallback chain for database
translations. For file-based translations, the fallbacks are directly
merged in by LocalisationCache, so the information is not there anymore
at the time of translating a message. I see some ways to fix this:
* Don't merge messages in LocalisationCache, but perform the fallback on
request (possibly caching the result)
* Tag message strings in LocalisationCache with the language they are in
(sounds expensive to me)
* Tag message strings as being a fallback in LocalisationCache (that way
we could follow the fallback until we find a language in which the
message string is not tagged as being a fallback)
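
The first and third options can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of how LocalisationCache is implemented: OWN, FALLBACKS, build_merged, source_language and lookup_on_request are hypothetical names, and the chain from "arc" to "en" is assumed.

```python
from functools import lru_cache

# Hypothetical strings each language defines itself (before any merging).
OWN = {
    "en": {"missing-label": "No label defined", "save": "Save"},
    "arc": {"save": "(Aramaic translation)"},
}
FALLBACKS = {"arc": ("en",), "en": ()}  # assumed chains


def build_merged(lang):
    """Option 3: merge fallbacks in, but tag inherited strings with a flag."""
    merged = {}
    for source in (lang, *FALLBACKS[lang]):
        for key, text in OWN[source].items():
            # setdefault keeps the first, i.e. most specific, entry.
            merged.setdefault(key, (text, source != lang))
    return merged


MERGED = {lang: build_merged(lang) for lang in OWN}


def source_language(key, lang):
    """Follow the chain until the string is not tagged as a fallback."""
    for candidate in (lang, *FALLBACKS[lang]):
        entry = MERGED[candidate].get(key)
        if entry is not None and not entry[1]:
            return candidate
    return None


@lru_cache(maxsize=None)
def lookup_on_request(key, lang):
    """Option 1: no merging; walk the chain per request and cache the result."""
    for candidate in (lang, *FALLBACKS[lang]):
        if key in OWN[candidate]:
            return OWN[candidate][key], candidate
    return None


print(source_language("missing-label", "arc"))    # en
print(lookup_on_request("missing-label", "arc"))  # ('No label defined', 'en')
```

In this sketch the third option stores one extra boolean per merged string and needs at most one lookup per language in the chain, whereas the first option keeps the per-language data unmerged and pays for the chain walk on the first request for each message.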
What do you think?
Adrian Heine né Lang
-------- Forwarded message --------
Subject: CLDR SurveyTool message from admin
Date: Wed, 4 May 2016 16:45:44 +0000 (UTC)
This message is being sent to you on behalf of "admin" <admin@> (Survey
Tool) - user #1
SurveyTool Message ---
CLDR DATA SUBMISSION TO GO LIVE TODAY....
The CLDR TC is pleased to announce that we will begin live data
submission using the survey tool starting at 18:00 GMT today. The first
week or so we have designated as a "shakedown" period in order for us to
deal with any issues that come up as people begin to do data submission.
Please realize that once we flip the switch, then any votes cast will
count as real votes into the CLDR voting procedure.
If you have been designated as a "shakedown" vetter by your
organization, we want to encourage you to go about your normal data
submission work so that we can prioritize any issues that occur. For
the rest of the vetting community, you are welcome to go ahead and
start, but please be aware that you may encounter things that don't yet
Any issues you find should be reported via a ticket to
Thanks for your participation in CLDR release 30...
Survey Tool: http://st.unicode.org/cldr-apps/survey