Hi list,
We all know that multilingual program interfaces and multilingual document presentation are very difficult tasks, so please take the following points as constructive criticism and not as a rant against your work (and yes, it is worth reading this veeery long text ;-).
Current state: So far MediaWiki has achieved quite a lot in supporting monolingual wikis in any native language and any character set. As long as your wiki holds content in one single language and all users use that language as their interface language, MediaWiki does a great job. However, when a wiki contains content in many languages and users choose many different interface languages (as on Wikimedia Commons, Meta-Wiki and some third-party wikis, including one created by myself), things get very problematic.
Missing tests: In the past MediaWiki was changed several times without checking whether the change affects language presentation in multilingual wikis:
* One sudden change, for example, was that "MediaWiki:Monobook.js" stopped being localisable via MediaWiki:Monobook.js/$ISO-Language-code. This old behaviour was used in every wiki for localising the tooltips. Additionally, Meta and Wikimedia Commons used it to localise some integral global JavaScript routines. After noticing the breakage, at least the Wikimedia Commons people fixed it themselves inside the wiki (see below "Fix for i18n localization not loading." at http://commons.wikimedia.org/wiki/MediaWiki:Monobook.js). In the end, however, the tooltip issue was solved using localisable MediaWiki:Tooltip-$name pages, and MediaWiki:Monobook.js is declared deprecated in favour of MediaWiki:Common.js (which is a good thing). So this issue is finally in the past.
* A recent change was a MediaWiki namespace cleanup performed when running maintenance/update.php. Every message that the same script had previously copied automatically into the MediaWiki namespace was removed. In principle this is a good thing, as it allows for some other fixes (for example in multilingual wikis) and reduces problems with hidden message strings. However, one important thing was forgotten. The interface points to several internal pages (for example the pages linked by MediaWiki:Sidebar). By default these interface links are not localisable. You have to explicitly whitelist link target i18n for every affected page in LocalSettings.php using "$wgForceUIMsgAsContentMsg = array();" (even this switch alone is a maintenance problem for multilingual wikis trying to achieve true multilinguality).
The old maintenance/update.php did one good thing: for every whitelisted link target it copied the default-language link target into the language sub pages of every language, unless one had been created by hand before (for example, the content of MediaWiki:Mainpage was copied into MediaWiki:Mainpage/de if that was empty). So if the link target existed in the default language, it was not possible to send people using other languages to non-existent "localised" pages after running maintenance/update.php. Now this workaround no longer works and people get linked into nowhere. How to solve that (I don't want the old copy behaviour back, as it is a hackish solution)? At least for link target strings (all those that now need whitelisting), change the interface string resolution order from Mediawiki:$Message/$language-code -> UI-String-File-$Message to Mediawiki:$Message/$language-code -> Mediawiki:$Message (default language). That way people don't get linked into nowhere on whitelisted link targets, and of course you could also entirely remove the need for the $wgForceUIMsgAsContentMsg switch in MediaWiki (and you would get rid of one Wikimedia server maintenance issue that fills your Bugzilla).
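The proposed lookup order for whitelisted link targets can be sketched as follows. This is only an illustration of the idea, not MediaWiki code: the function name, the dictionary-based page store, and the final fall-through to the shipped UI strings are assumptions made for the example.

```python
from typing import Mapping, Optional, Tuple


def resolve_link_target(
    name: str,
    user_lang: str,
    wiki_pages: Mapping[str, str],
    ui_strings: Mapping[Tuple[str, str], str],
    default_lang: str = "en",
) -> Optional[str]:
    """Resolve a link-target message in the proposed order (hypothetical helper).

    wiki_pages maps page titles in the MediaWiki namespace to their text.
    ui_strings maps (language, message name) to the strings shipped with the software.
    """
    # 1. A localised page created by hand, e.g. MediaWiki:Mainpage/de.
    localised = wiki_pages.get(f"MediaWiki:{name}/{user_lang}")
    if localised:
        return localised
    # 2. Proposed step: the default-language page in the wiki, e.g. MediaWiki:Mainpage.
    site_default = wiki_pages.get(f"MediaWiki:{name}")
    if site_default:
        return site_default
    # 3. Assumption: only then fall back to the shipped UI strings.
    return ui_strings.get((user_lang, name), ui_strings.get((default_lang, name)))


pages = {"MediaWiki:Mainpage": "Main Page"}
shipped = {("de", "Mainpage"): "Hauptseite", ("en", "Mainpage"): "Main Page"}
# A German-interface user lands on the existing page instead of a missing "Hauptseite".
print(resolve_link_target("Mainpage", "de", pages, shipped))
```

With this order, a localised sub page still wins whenever someone has created it by hand, so no copying by update.php is needed.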
So in future please also consider the implications of message handling changes for multilingual wikis. At least two important Wikimedia wikis are multilingual and deserve some thought.
Skin design error: The Monobook skin hardcodes a lower-case first letter for certain UI messages (CSS: .portlet h5, .portlet h6). This is a really bad idea for languages that rely heavily on upper case to distinguish words and meanings. German is probably the most common example of such a language. As long as you run a monolingual (German) wiki, a hackish solution works; see MediaWiki:Monobook.css/de in your wiki of choice for the CSS code. Since CSS files can't be (and shouldn't need to be) localised using the sub page style, this solution doesn't work in multilingual wikis: now people of all other languages complain that their interface strings are upper case, since it seems to be "good style" to start every interface string affected by the ".portlet h5, .portlet h6" rule with upper case and later force it back to lower case. So the solution is: drop the forced lower case in Monobook, and if you don't like UI strings starting with upper case in your language of choice, make these strings lower case right from the start in vanilla MediaWiki. There is no other good solution.
Interface string problems:
* Interface strings were already covered a bit above. There is a second problem with link targets pointing into nowhere. Vanilla MediaWiki strings contain hardcoded localised link targets. For example, have a look at MediaWiki:Blockedtext (if you want to find all (?) affected messages, grep for {{ns:project}}). Again, in monolingual wikis this is no problem: these linked pages are supposed to exist. But now consider a multilingual wiki... People using $non-default-language get pointed into nowhere, and as MediaWiki supports quite a lot of languages, this means either creating a lot of pages in advance or, if you don't want that, touching a huge number of interface strings. So in vanilla MediaWiki please do not hardcode any wiki page in message strings. How, then, to localise embedded link targets in the future? Use a mechanism that points to MediaWiki namespace pages which define the link target (like the MediaWiki namespace pages defined by MediaWiki:Sidebar). One possible syntax would be "{{subst:Mediawiki:PAGENAME}}" templates (however, this syntax has some problems, as it would need to expand to a given sub page, check whether that sub page exists, and fall back to the default if not). Another possible syntax would be named variables such as $MediaWiki:PAGENAME in interface strings.
* Furthermore, there is a big inherent communication problem with interface-related changes in multilingual wikis. If you change, let us say, a legal message string in the default language, people using another language won't notice it. Currently you'd need to overwrite every language-code sub page by hand in order to make people aware of the change (more than 80 edits for one single message in order to cover every supported language). This specific problem is also covered by http://bugzilla.wikimedia.org/show_bug.cgi?id=8188 and is a severe showstopper for the success of projects like Wikimedia Commons. Several solution ideas and their side effects have been discussed in this bug (my proposed solution with the changed message string resolution order would now work much better, as default messages are now deleted from the MediaWiki namespace).
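The named-variable idea for embedded link targets could look roughly like the sketch below. It assumes a `$MediaWiki:PAGENAME` placeholder syntax as suggested above; the regular expression, the function name, and the message name `Adminpage` are hypothetical and exist only for this example.

```python
import re
from typing import Mapping

# Assumed placeholder syntax: "$MediaWiki:" followed by a message name.
LINK_VARIABLE = re.compile(r"\$MediaWiki:([A-Za-z0-9_-]+)")


def expand_link_targets(message: str, user_lang: str, wiki_pages: Mapping[str, str]) -> str:
    """Replace each $MediaWiki:NAME with a link target that actually exists."""

    def lookup(match: "re.Match[str]") -> str:
        name = match.group(1)
        # Prefer the sub page for the user's language if someone created it.
        localised = wiki_pages.get(f"MediaWiki:{name}/{user_lang}")
        if localised:
            return localised
        # Otherwise fall back to the default-language definition;
        # leave the placeholder untouched if nothing is defined at all.
        return wiki_pages.get(f"MediaWiki:{name}", match.group(0))

    return LINK_VARIABLE.sub(lookup, message)


pages = {
    "MediaWiki:Adminpage": "Project:Administrators",
    "MediaWiki:Adminpage/fr": "Project:Administrateur",
}
text = "Contact [[$MediaWiki:Adminpage]]."
print(expand_link_targets(text, "fr", pages))  # uses the French sub page
print(expand_link_targets(text, "de", pages))  # falls back to the default page
```

The message string itself never names a wiki page, so translating it cannot introduce a link to a page that does not exist.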
Templates/embedded text parts:
* Templates need to be called by their exact name. So if you use common templates in multilingual wikis, you either have them only in the wiki default language or you create a template that repeats the text, let us say, 40 times, once per language (now you know why multilingual wikis look ugly). Template i18n doesn't work like i18n of message strings with language-code sub pages, because templates can contain variables, and people would localise the variable names in the translated templates as well, giving you a hell of inconsistencies. Currently people use a very fragile JavaScript hack to "hide" unwanted text parts. A better and very well working (TM) solution is an extension using an XML-like tag called <Multilang>, which embeds localised strings in the same template/page. See http://www.mediawiki.org/wiki/Extension:Multilang for code details and http://bugzilla.wikimedia.org/show_bug.cgi?id=8287 for the related Bugzilla entry. This solution reduces the ugliness of multilingual wikis a lot and will greatly increase perceived true multilinguality.
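The idea of keeping all translations inside one template, with a single shared set of variable names, can be illustrated as below. This is not the Multilang extension's code or syntax: the dictionary of translations, the `{license}` parameter, and the function name are assumptions chosen for the example.

```python
from typing import Mapping


def render_multilingual(
    translations: Mapping[str, str],
    user_lang: str,
    default_lang: str = "en",
    **params: str,
) -> str:
    """Show only the text block matching the reader's language (hypothetical helper).

    All blocks share the same parameter names, so translators never rename variables.
    """
    candidates = [user_lang.lower(), user_lang.split("-")[0].lower(), default_lang]
    for code in candidates:
        if code in translations:
            return translations[code].format(**params)
    return ""


license_notice = {
    "en": "Licensed under {license}.",
    "de": "Lizenziert unter {license}.",
}
print(render_multilingual(license_notice, "de", license="GFDL"))
print(render_multilingual(license_notice, "fr", license="GFDL"))  # no French block: default is used
```

Because the selection happens when the page is rendered, the reader sees one language instead of 40 stacked copies, and no client-side hiding is needed.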
Default language:
* Currently anonymous users can only use the wiki default language. For multilingual wikis this is a great shortcoming for the perceived multilinguality, as people still say "I don't like it, it is English by default" even if everything exists in their own language as well. This leads to very ugly "?uselang=" URL hacks that cause interface flickering (and render server-side caching useless anyway); see for example http://commons.wikimedia.org/wiki/Accueil (the "Interface en français" link). And the language switches back as soon as you click on a link. For some time there has been a patch that would give anonymous users their language of choice: http://bugzilla.wikimedia.org/show_bug.cgi?id=3665 (note that caching currently gets affected more and more by these nasty "uselang" tricks; though I agree with the comments in the bug that a drop-down selection would be better than relying on the browser language, although it would require a cookie too).
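Both options mentioned above, an explicit choice stored in a cookie and the browser language, can be combined in one lookup. The following is a minimal sketch under stated assumptions: it is not the patch from bug 3665, the function names are made up, and the Accept-Language parsing is simplified.

```python
from typing import Iterable, List, Optional


def parse_accept_language(header: str) -> List[str]:
    """Return language tags from an Accept-Language header, best first (simplified)."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag or tag == "*":
            continue
        quality = 1.0
        for param in parts[1:]:
            key, _, value = param.strip().partition("=")
            if key == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_interface_language(
    supported: Iterable[str],
    cookie_lang: Optional[str],
    accept_language: Optional[str],
    default_lang: str = "en",
) -> str:
    """Pick the interface language for an anonymous visitor (hypothetical helper)."""
    supported_set = {code.lower() for code in supported}
    candidates: List[str] = []
    if cookie_lang:
        # An explicit choice (e.g. from a drop-down) wins over the browser setting.
        candidates.append(cookie_lang.lower())
    candidates.extend(parse_accept_language(accept_language or ""))
    for tag in candidates:
        if tag in supported_set:
            return tag
        base = tag.split("-")[0]  # "fr-ch" falls back to "fr"
        if base in supported_set:
            return base
    return default_lang


print(choose_interface_language(["en", "de", "fr"], None, "fr-CH, fr;q=0.9, en;q=0.8"))
```

Any such scheme means the rendered page varies per visitor, so the caching concern raised above still has to be handled separately.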
Summary: All these problems are currently big problems for multilingual wikis, but they can be solved in MediaWiki without any revolutionary magic code mix-up, just step by step with fairly small code changes. Fixing these issues would be a great leap forward for multilingual wikis.
You'd help some struggling Wikimedia wikis a lot.
Cheers, Arnomane
First of all, in general, there will always be a risk of breaking unusual stuff with our development model, because the testing phase depends on developers noticing stuff in their local copies. This not only is kind of shallow, it also totally ignores some options that might be very important to some wikis. Particularly relevant to this is that I assume every developer's local wiki is in their preferred language, and since they're the only user, it seems likely that they wouldn't have a different language from the default. Maybe someone runs a wiki with the content language different from their user language, but I at least don't (right now!).
A few things I'd like to comment on.
On 1/30/07, Daniel Arnold arnomane@gmx.de wrote:
The Monobook skin hardcodes a lower-case first letter for certain UI messages (CSS: .portlet h5, .portlet h6). This is a really bad idea for languages that rely heavily on upper case to distinguish words and meanings. German is probably the most common example of such a language. As long as you run a monolingual (German) wiki, a hackish solution works; see MediaWiki:Monobook.css/de in your wiki of choice for the CSS code. Since CSS files can't be (and shouldn't need to be) localised using the sub page style, this solution doesn't work in multilingual wikis: now people of all other languages complain that their interface strings are upper case, since it seems to be "good style" to start every interface string affected by the ".portlet h5, .portlet h6" rule with upper case and later force it back to lower case. So the solution is: drop the forced lower case in Monobook, and if you don't like UI strings starting with upper case in your language of choice, make these strings lower case right from the start in vanilla MediaWiki. There is no other good solution.
Perhaps. The problem is that your solution affects other skins as well. It would be possible to add classes for the content and interface languages to body, but that won't necessarily work if some of the relevant messages are inheriting from some other language. I'm not sure how to best deal with this.
Interface string problems:
- Interface strings were already covered a bit above. There is a second problem with link targets pointing into nowhere. Vanilla MediaWiki strings contain hardcoded localised link targets. For example, have a look at MediaWiki:Blockedtext (if you want to find all (?) affected messages, grep for {{ns:project}}). Again, in monolingual wikis this is no problem: these linked pages are supposed to exist. But now consider a multilingual wiki... People using $non-default-language get pointed into nowhere, and as MediaWiki supports quite a lot of languages, this means either creating a lot of pages in advance or, if you don't want that, touching a huge number of interface strings. So in vanilla MediaWiki please do not hardcode any wiki page in message strings. . . .
Note that English-language special page names will work on wikis in any language. However, that's probably bad to rely upon from an interface perspective, because some poor Chinese user or whatnot will see a bunch of Latin gibberish as the link target, I guess. Perhaps it would be ideal if all special-page names were universal, assuming that causes no conflicts. Then everyone could see the special page names in their own language in the interface.
- Furthermore, there is a big inherent communication problem with interface-related changes in multilingual wikis. If you change, let us say, a legal message string in the default language, people using another language won't notice it. Currently you'd need to overwrite every language-code sub page by hand in order to make people aware of the change (more than 80 edits for one single message in order to cover every supported language). This specific problem is also covered by http://bugzilla.wikimedia.org/show_bug.cgi?id=8188 and is a severe showstopper for the success of projects like Wikimedia Commons. Several solution ideas and their side effects have been discussed in this bug (my proposed solution with the changed message string resolution order would now work much better, as default messages are now deleted from the MediaWiki namespace).
I've already commented there, of course.
All these problems are currently big problems for multilingual wikis, but they can be solved in MediaWiki without any revolutionary magic code mix-up, just step by step with fairly small code changes. Fixing these issues would be a great leap forward for multilingual wikis.
You'd help some struggling Wikimedia wikis a lot.
I agree, most of these wouldn't be terribly difficult to do and would be a big help. Maybe I'll keep them (at least the easier ones :) ) in mind for some rainy day in the future.
Simetrical wrote:
First of all, in general, there will always be a risk of breaking unusual stuff with our development model, because the testing phase depends on developers noticing stuff in their local copies. This not only is kind of shallow, it also totally ignores some options that might be very important to some wikis.
Note that we can't really test things that *we don't know exist*; when people rely on unintended interactions and side-effects they may break. ;)
-- brion vibber (brion @ pobox.com / brion @ wikimedia.org)
On Tuesday 30 January 2007 20:08:41 Brion Vibber wrote:
[MediaWiki:Monobook.js/langcode]
Note that we can't really test things that *we don't know exist*; when people rely on unintended interactions and side-effects they may break. ;)
Well, you can imagine that from a user perspective this wasn't an unwanted side effect, just a logical generalisation of the existing i18n.
Anyway, this specific issue is solved. It was just an example from the user perspective to highlight a more general problem, which I also pointed out with my other example (update.php): there are things that currently don't get considered enough during development. I don't want to blame anyone for that. I myself forget to consider things until people say "hey did you consider this as well...?".
Anyways we may close this thread and I will (and partly already did) post the single issues one after the other in individual threads.
Arnomane
On Tuesday 30 January 2007 19:59:07 Simetrical wrote:
First of all, in general, there will always be a risk of breaking unusual stuff with our development model, because the testing phase depends on developers noticing stuff in their local copies. This not only is kind of shallow, it also totally ignores some options that might be very important to some wikis. Particularly relevant to this is that I assume every developer's local wiki is in their preferred language, and since they're the only user, it seems likely that they wouldn't have a different language from the default. Maybe someone runs a wiki with the content language different from their user language, but I at least don't (right now!).
I admit that this is probably never perfectly solvable, given that Wikimedia runs over 700 wikis. However, with respect to language there are monolingual and multilingual wikis. Currently the multilingual wiki use case could use some more attention, and I think this is easily possible even without being deeply involved in, let us say, Wikimedia Commons issues, just by playing around in your own little test wiki.
[...] solution is: Drop the forced lower case in Monobook and if you don't like your UI strings starting with upper case in your language of choice make these strings lower case right from the start in vanilla MediaWiki. There is no other good solution.
Perhaps. The problem is that your solution affects other skins as well. It would be possible to add classes for the content and interface languages to body, but that won't necessarily work if some of the relevant messages are inheriting from some other language. I'm not sure how to best deal with this.
I'd say it is bad in any skin to rely on global lower case/upper case behaviour as this is not language dependent. And of course Monobook is the default for the large majority, and it is a bit strange to force the majority into non-standard language display because of that. Several times people ranted at me about why "Commons is so English cause it even forces German to lowercase" and said that I as an admin (you know, admins "can do everything") should fix this first or they would go on ignoring Commons. Yes, it is downright silly, but true.
So if someone thinks that some (older) non-default skins should rely on upper case, please insert a force-upper-case CSS definition into these skins. At least it would hurt far fewer people.
Note that English-language special page names will work on wikis in any language. However, that's probably bad to rely upon from an interface perspective, because some poor Chinese user or whatnot will see a bunch of Latin gibberish as the link target, I guess. Perhaps it would be ideal if all special-page names were universal, assuming that causes no conflicts. Then everyone could see the special page names in their own language in the interface.
I am not talking about special pages. These are fine with me (and I know that namespace names are reachable everywhere by their English name as well). It is about linking pages like "Project:Administrateur" in MediaWiki:Blockedtext/fr from vanilla MediaWiki. This was truly done with only monolingual wikis in mind, never considering simultaneous use of different languages.
Arnomane
Simetrical wrote:
Interface string problems:
- Interface strings were already covered a bit above. There is a second problem with link targets pointing into nowhere. Vanilla MediaWiki strings contain hardcoded localised link targets. For example, have a look at MediaWiki:Blockedtext (if you want to find all (?) affected messages, grep for {{ns:project}}). Again, in monolingual wikis this is no problem: these linked pages are supposed to exist. But now consider a multilingual wiki... People using $non-default-language get pointed into nowhere, and as MediaWiki supports quite a lot of languages, this means either creating a lot of pages in advance or, if you don't want that, touching a huge number of interface strings. So in vanilla MediaWiki please do not hardcode any wiki page in message strings. . . .
Note that English-language special page names will work on wikis in any language. However, that's probably bad to rely upon from an interface perspective, because some poor Chinese user or whatnot will see a bunch of Latin gibberish as the link target, I guess. Perhaps it would be ideal if all special-page names were universal, assuming that causes no conflicts. Then everyone could see the special page names in their own language in the interface.
There is some magic in it, so the link [[Special:Allpages]] would become %ED%8A%B9%EC%88%98%EA%B8%B0%EB%8A%A5:Allpages (without redirects!), making that poor user happier :D (well, I admit :( it's not Chinese but ko...) ;)
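The percent-encoded prefix in that link is simply the UTF-8 encoding of the localised namespace name. A short sketch using only the Python standard library shows how to decode it and check the round trip; the variable names are arbitrary.

```python
from urllib.parse import quote, unquote

encoded = "%ED%8A%B9%EC%88%98%EA%B8%B0%EB%8A%A5:Allpages"

# unquote() decodes the percent-escapes as UTF-8 by default.
decoded = unquote(encoded)
namespace, _, page = decoded.partition(":")

# The namespace consists of four Hangul syllables:
# U+D2B9 U+C218 U+AE30 U+B2A5.
print([f"U+{ord(ch):04X}" for ch in namespace])
print(page)

# Encoding the namespace again reproduces the escaped form from the link.
print(quote(namespace))
```

The page part stays in English while only the namespace is shown in the reader's language, which matches the link quoted above.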
Daniel Arnold wrote: [snip]
In the past MediaWiki was changed several times without checking whether the change affects language presentation in multilingual wikis:
- One sudden change, for example, was that "MediaWiki:Monobook.js" stopped being localisable via MediaWiki:Monobook.js/$ISO-Language-code.
That was a bug! Don't rely on unintended bugs -- and if they do break, let us know. We can't read minds and might not know that you're using the bug. ;)
The interface points to several internal pages
<very long issue of some kind>
Can you summarize this? It's a page long and has no paragraph breaks.
<many other issues>
Can you treat these in separate mails perhaps so they can be discussed individually? A big megathread is not easy to read or contribute to.
-- brion vibber (brion @ pobox.com / brion @ wikimedia.org)
wikitech-l@lists.wikimedia.org