Hi Amir,
thanks for the bug report! We will implement it along the lines you
suggest.
I have one question still, maybe you can help me:
About 10% of the titles in the Hebrew Wikipedia are in the Latin alphabet
(a very rough estimate that may be completely off, based on a glance at
Special:Allpages). So an article like
http://he.wikipedia.org/wiki/Yesterday, where the title would be LTR,
would be declared as RTL. Is there a way to avoid that?
I guess the answer is no, but I wanted to ask.
Cheers,
Denny
2012/8/11 Amir E. Aharoni <amir.aharoni(a)mail.huji.ac.il>:
Hallo,
It's my first email on this list, so in case you don't know me: I am
Amir, I'm from Israel, I've been a Wikipedian since 2004, I write mostly in
Hebrew and English, I care strongly about language issues in software
in general and about right-to-left support in particular, and I work
in the WMF's localization team.
Now, about the subject: you probably know that "i18n" is
"internationalization" and "l10n" is "localization". "m17n" is a less
common term, which means "multilingualization": making software able
to work in many languages at once. This email is about one of the
easiest and most important ways to make Wikidata support many
languages on one page everywhere.
I've been testing the Wikidata demo for a few days now, with the aim
of getting it deployed in the Hebrew Wikipedia very soon. The first
thing that I noticed is that even though everybody understands that
Wikidata is supposed to be massively multilingual, little or no use is
made of the lang and dir attributes in the HTML that Wikidata
generates. The most immediate example is
http://wikidata-test-repo.wikimedia.de/wiki/Data:Q2?uselang=en
It basically lists the word "Helium" in many languages, but as far as
the browser is concerned, almost all of it is written in English,
because the root <html> element says lang="en". The only exceptions
are the interlanguage links in the sidebar, where the lang attributes
are used properly, but that's a regular MediaWiki feature.
The lang attribute, and also the dir attribute (direction: "ltr" or
"rtl"), should be specified explicitly on every element whose content
language is known to be different from the content language of the
enclosing element. Many developers may think that the lang attribute
doesn't do anything, but actually it does a lot:
* correct text-to-speech and speech-to-text handling
* correct font rendering (relevant for Serbian [1], for some languages
of India etc.)
* selecting the correct spell checking dictionary
* selecting the right language for machine translation
* adjusting the line-height
* selecting the web font (in MediaWiki's WebFonts extension)
* etc.
So please, use it whenever you can.
Always use the dir attribute in these circumstances, too. It must be
specified explicitly even though "ltr" is the default, because if the
user interface is right-to-left, it will propagate to elements in
other languages, too, so you would get right-to-left English. (I consider
this a bug in the HTML standard... but it's a topic for a different
email).
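
To illustrate the two points above, here is a minimal sketch in Python
(not MediaWiki code) of wrapping a label in an element that carries both
attributes. The function names wrap_label and direction_for and the small
set of right-to-left language codes are assumptions made up for this
example; in MediaWiki the direction would come from the Language class
instead.

```python
from html import escape

# Assumption: a tiny, incomplete set of right-to-left language codes,
# hard-coded only for this illustration.
RTL_LANGUAGES = {"ar", "fa", "he", "ur", "yi"}


def direction_for(lang_code: str) -> str:
    """Return "rtl" or "ltr" for a language code, using the set above."""
    primary = lang_code.split("-")[0].lower()
    return "rtl" if primary in RTL_LANGUAGES else "ltr"


def wrap_label(text: str, lang_code: str) -> str:
    """Wrap text in a <span> with explicit lang and dir attributes.

    dir is always written out, even when it is "ltr", so that an
    enclosing right-to-left page cannot propagate its direction.
    """
    return '<span lang="{}" dir="{}">{}</span>'.format(
        escape(lang_code, quote=True),
        direction_for(lang_code),
        escape(text),
    )


print(wrap_label("Helium", "en"))
print(wrap_label("הליום", "he"))
```

The English label gets dir="ltr" spelled out, which is exactly the case
that is easy to forget when the surrounding interface is right-to-left.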
In the case of the page that I mentioned above, it should be quite
trivial to fix, because MediaWiki's Language class provides very easy
functions for this. I also opened bug 39257 [2] about it. I am
repeating it here on the mailing list, just to say to the developers
to do it everywhere. If you are a developer and you run into any
problems with using these attributes, please contact me in any way that
is convenient to you.
Thank you!
[1] See https://sr.wikipedia.org/wiki/User:Amire80
[2] https://bugzilla.wikimedia.org/show_bug.cgi?id=39257
--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
“We're living in pieces,
I want to live in peace.” – T. Moore
_______________________________________________
Wikidata-l mailing list
Wikidata-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.