Hey wikitech people,
I have a press enquiry asking about how we run multilingual Wikipedias as an example of website localisation that has been done well. MediaWiki was designed from the start to be inherently multilingual, or so I have gathered, and I was wondering if anyone on this list has any thoughts I should include.
I'm referring to the software being international, rather than having been internationalised. Some not-too-technical commentary would be great to paraphrase into my reply. I am covering the social aspects of translation too, but our technical innovations are important.
Thanks, [[m:User:Sean Whitton]]
On 14/07/07, Sean Whitton sean@silentflame.com wrote:
I have a press enquiry asking about how we run multilingual Wikipedias as an example of website localisation that has been done well. MediaWiki was designed from the start to be inherently multilingual, or so I have gathered, and I was wondering if anyone on this list has any thoughts I should include.
Your impression is incorrect; MediaWiki at present contains no support for content in multiple languages in the same wiki. This is being worked on in the Multilingual MediaWiki project, although it is at present unclear how much of this can/will be incorporated into the core software.
We operate different web sites using separate databases and some clever configuration selection, which I believe Tim Starling has documented elsewhere; check Meta for "wiki farm" and similar search phrases. However, these do all point to different sets of content altogether.
There are some limited features in the software which allow it to recognise special link forms, e.g. [[en:Foo]], where "en" is a valid interwiki prefix that is also recognised as a language link; this causes the "in other languages" panel on the left side of a rendered page to be updated. However, this is just a link, and the individual MediaWiki installations aren't "aware" of each other as such.
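To illustrate the language-link behaviour described above, here is a minimal sketch in Python. It is not MediaWiki's actual parser code; the prefix set, the regular expression and the function name are assumptions made up for this example. It only shows the idea that a link whose prefix is a known language code is pulled out of the page body and collected for a sidebar, while every other link is left alone.

```python
import re

# Hypothetical set of interwiki prefixes that are also language codes.
LANGUAGE_PREFIXES = {"en", "de", "fr", "pms"}

# Matches [[target]] and [[target|label]] (simplified; no nesting).
LINK_RE = re.compile(r"\[\[([^\[\]|]+?)(?:\|([^\[\]]*))?\]\]")


def split_language_links(wikitext):
    """Return (body, language_links).

    Links such as [[en:Foo]] are removed from the body and collected as
    (prefix, title) pairs; all other links stay in the text unchanged.
    """
    language_links = []

    def replace(match):
        prefix, sep, title = match.group(1).partition(":")
        prefix = prefix.strip().lower()
        if sep and title and prefix in LANGUAGE_PREFIXES:
            language_links.append((prefix, title.strip()))
            return ""  # not rendered inline; listed in the sidebar instead
        return match.group(0)

    body = LINK_RE.sub(replace, wikitext)
    return body, language_links


body, links = split_language_links("Text [[Bar]] more.[[en:Foo]][[de:Foo (Begriff)]]")
print(body)
print(links)
```

Note that the result is only a list of link targets: nothing in the sketch checks whether the page on the other wiki exists, which matches the point that the installations are not aware of each other.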
Rob Church
Ack, sorry, I think I probably used the wrong terms there and you ended up with the wrong impression. I know that wikis are designed to be separate in separate languages (I guess meta handles an exception to this reasonably well, as does Foundation wiki). What I meant to say is that MediaWiki can easily have its language changed - we have MediaWiki: namespace pages, which are really cool, to allow volunteer admins to keep things up to date, and in general the project is international by nature, rather than being hacked to work in other languages, and thus is more flexible. Or so I have been led to believe.
Any thoughts on this?
Sorry I got my terminology wrong, and thanks for your reply.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/wikitech-l
Sean Whitton wrote:
Ack, sorry, I think I probably used the wrong terms there and you ended up with the wrong impression. I know that wikis are designed to be separate in separate languages (I guess meta handles an exception to this reasonably well, as does Foundation wiki). What I meant to say is that MediaWiki can easily have its language changed - we have MediaWiki: namespace pages, which are really cool, to allow volunteer admins to keep things up to date, and in general the project is international by nature, rather than being hacked to work in other languages, and thus is more flexible. Or so I have been led to believe.
Any thoughts on this?
Sorry I got my terminology wrong, and thanks for your reply.
Well, it's not very different from other localised software: language strings are kept in a specific file, and translating that file translates the program (as opposed to having all the strings hardcoded).
MediaWiki has a couple of peculiarities. First, some messages are marked as being part of the content: when the wiki is viewed in another language, such a message still uses the wiki's own language (we don't want 'Upload only files with a free license, following policies [[X]], [[Y]] and [[Z]] of our wiki' to get translated to 'Press here to upload any file'). Second, there is the (configurable) ability to translate messages as if they were articles, using the MediaWiki: namespace (and its subpages!).
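The two peculiarities can be sketched roughly as follows, in Python rather than MediaWiki's own PHP. Everything here is hypothetical: the message keys, the sample texts, the dictionaries and the get_message function are invented for illustration, and the "MediaWiki:Key/lang" subpage naming is an assumption about how per-language overrides are addressed. The sketch shows a content message ignoring the reader's language, and an on-wiki override taking precedence over the shipped message file.

```python
# Hypothetical shipped message files: language code -> key -> text.
MESSAGES = {
    "en": {"upload": "Upload file", "uploadtext": "Upload only free files."},
    "es": {"upload": "Subir archivo", "uploadtext": "Sube solo archivos libres."},
}

# Keys treated as part of the wiki's content (assumed for this example).
CONTENT_MESSAGES = {"uploadtext"}

# Overrides edited on the wiki itself, keyed by page title.
LOCAL_OVERRIDES = {
    "MediaWiki:Uploadtext": "Upload only files allowed by policies [[X]], [[Y]] and [[Z]].",
}


def get_message(key, user_lang, content_lang="en"):
    """Look up a message for a reader using user_lang on a content_lang wiki."""
    # Content messages always follow the wiki's language, not the reader's.
    lang = content_lang if key in CONTENT_MESSAGES else user_lang

    # An on-wiki page wins over the shipped file; other languages use a subpage.
    page = "MediaWiki:" + key[:1].upper() + key[1:]
    if lang != content_lang:
        page += "/" + lang
    if page in LOCAL_OVERRIDES:
        return LOCAL_OVERRIDES[page]

    # Fall back to English, then to a visible placeholder.
    text = MESSAGES.get(lang, {}).get(key)
    if text is None:
        text = MESSAGES["en"].get(key, "<" + key + ">")
    return text


print(get_message("upload", "es"))      # interface message follows the reader
print(get_message("uploadtext", "es"))  # content message follows the wiki
```

A Spanish-speaking reader gets the interface label in Spanish, but the upload policy text stays in the wiki's language and in the locally edited wording, so the local policy is never silently replaced by a generic translation.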
Hoi, have a look at the development version of Multilingual MediaWiki. It properly supports RTL for Arabic as well.
Thanks, GerardM
http://mw.visc.us/index.php?title=Main_Page
Hi,
some of the things that I've learned while working on MediaWiki I18N:

* I18N is not just translating messages. There's more:
  * Different languages use different numerals.
  * In some languages, you have to take care of grammatical cases, e.g. genitive, which is hard to notice if you don't speak those languages.
  * Some languages don't use italics. Don't hardcode <i> to emphasize something.
  * Some languages have prefixes to words. In English, there's the 's' suffix to make a noun plural; other languages use prefixes for this, or they combine the article and the noun. Links have to work differently on those wikis.
  * Words are not always separated by whitespace. Some languages have one letter per word. Links should not extend to following characters like they do in English (e.g. [[house]]s).
* Unicode support is a well of never ending joy:
  * Writing direction switches don't work as you'd expect, esp. when users on RTL wikis use LTR usernames. E.g. when you have a number, a username and a timestamp and don't take care, the timestamp might be right or left of the username, depending on the username (Remember: "Arabic" digits (123..) are LTR).
  * Some letters look the same, but have different codepoints: A (Latin capital letter A) and Α (Greek capital letter alpha). This allows the creation of user names and pages which resemble already existing pages/users. Nice for vandals.
  * There are different ways to encode the same character. Letters with diacritics (like the German umlauts) have their own codepoint, but can also be composed from a base letter and a combining diacritical mark. Lots of fun if you upload images from a Mac and try to insert them in an article from a Windows/Linux computer. Don't forget to normalize all user input.
* PHP's support for Unicode string manipulation is poor. Many of the above mentioned problems forced us to write custom extensions for PHP.
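The two codepoint problems in the list (several encodings of one character, and different letters that look alike) can be demonstrated with Python's standard unicodedata module. This is only an illustration of the problem, not MediaWiki's code; the helper names are made up, and choosing NFC as the normal form is an assumption of this sketch.

```python
import unicodedata


def normalize_input(text):
    """Normalize user input to NFC so equivalent encodings compare equal."""
    return unicodedata.normalize("NFC", text)


def describe(text):
    """Return the Unicode character name of every code point in text."""
    return [unicodedata.name(ch) for ch in text]


# The same visible letter, encoded in two different ways.
precomposed = "\u00fc"   # LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "u\u0308"   # 'u' followed by COMBINING DIAERESIS

print(precomposed == decomposed)                    # raw strings differ
print(normalize_input(decomposed) == precomposed)   # equal after NFC

# Two different letters that look the same in most fonts.
latin_a = "A"            # LATIN CAPITAL LETTER A
greek_alpha = "\u0391"   # GREEK CAPITAL LETTER ALPHA

print(describe(latin_a), describe(greek_alpha))
print(normalize_input(greek_alpha) == latin_a)      # normalization does not merge them
```

Normalization takes care of the "uploaded from a Mac" case, because both spellings of the umlaut end up as the same code point sequence. It does nothing for look-alike letters, which stay distinct and need a separate check (for example, rejecting user names that mix scripts).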
MediaWiki started with internationalization in mind, but its authors spoke only Western languages (English, German, French, Spanish, Esperanto), and many of the problems mentioned above required code changes. There are still open issues. E.g., in Piedmontese, there's a conflict between certain language constructs requiring multiple ticks (') and the wiki syntax for bold/italics.
So yes, MediaWiki is quite good when it comes to internationalization, but it's still far from perfect.
Regards,
jens
On 15/07/07, Jens Frank jf@mormo.org wrote:
So yes, MediaWiki is quite good when it comes to internationalization, but it's still far from perfect.
OTOH, I suspect even having awareness of these problems as being serious problems probably puts us near the forefront of internationalisation of software, at least starting from Latin. Does anyone else even do Piedmontese, for instance?
- d.
Hoi, Piedmontese lacks two characters in Unicode. Neapolitan has a problem with the double codes.
Both are vibrant projects.
PS I do not always react to things that are plain wrong..
Thanks, GerardM