So, a designer here at the foundation just found one of their designs doesn't work so well for German, which has significant "text swell" versus English.
That just reminded me of a technique I've seen used elsewhere, pseudolocalization. Basically the system has a pseudo-locale, something like "xx-pseudo", where
"Some text"
is automatically transformed to something like
"____ §ømë †ëx† ____"
The idea is that you can test for three things even before the translators get to it:
- which parts of the system aren't internationalized yet (they are noticeable since they don't get this pseudolocalization)
- text swell (150%-200% is a good ratio)
- any transformations that don't preserve unicode
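For illustration, the transform itself can be pretty dumb; something along these lines (just a sketch, the function name and the character map are made up):

<?php
// Just a sketch -- the function name and the character map are made up.
function pseudolocalize( $text ) {
	// Swap ASCII letters for look-alike accented characters, so untranslated
	// strings stand out while the result stays more or less readable.
	$swapped = strtr( $text, array(
		'a' => 'å', 'e' => 'ë', 'i' => 'ï', 'o' => 'ø', 'u' => 'ü',
		's' => '§', 't' => '†',
		'A' => 'Å', 'E' => 'Ë', 'I' => 'Ï', 'O' => 'Ø', 'U' => 'Ü',
	) );
	// Pad both ends to push the string towards 150%-200% of its original
	// length, to simulate text swell.
	$pad = str_repeat( '_', (int)ceil( strlen( $text ) / 2 ) );
	return $pad . ' ' . $swapped . ' ' . $pad;
}

echo pseudolocalize( 'Some text' ); // something like "_____ §ømë †ëx† _____"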
Does MediaWiki or interwiki do anything like this? Would you like it if we did?
On 9/22/10 10:51 AM, Neil Kandalgaonkar wrote:
Does MediaWiki or interwiki do anything like this?
Sorry, I meant to say translatewiki there.
On Wed, Sep 22, 2010 at 17:54, Neil Kandalgaonkar neilk@wikimedia.org wrote:
On 9/22/10 10:51 AM, Neil Kandalgaonkar wrote:
Does MediaWiki or interwiki do anything like this?
Sorry, I meant to say translatewiki there.
That sort of thing should be implemented in MediaWiki or other client applications, not Translatewiki. In MediaWiki's case, this could be some $wg variable that the wfMsg*() functions use.
On 22 September 2010 20:51, Neil Kandalgaonkar neilk@wikimedia.org wrote:
So, a designer here at the foundation just found one of their designs doesn't work so well for German, which has significant "text swell" versus English.
The idea is that you can test for three things even before the translators get to it:
- which parts of the system aren't internationalized yet (they are noticeable since they don't get this pseudolocalization)
Such code likely wouldn't survive code review.
- text swell (150%-200% is a good ratio)
Web layouts are usually very liquid, so we shouldn't have that much of a problem with long texts. Of course there are cases, like wide tables, where we run into problems.
- any transformations that don't preserve unicode
Such as? For what it's worth, double HTML-escaping or not escaping at all is a far more common problem.
Does MediaWiki or interwiki do anything like this? Would you like it if we did?
It is probably impossible to do this, for the same reason we cannot do 'click this interface message to translate it on sight'. Too much configuration stuff and too many other things abuse messages, and they break MediaWiki if the input is unexpected.
- Niklas
On 9/22/10 12:54 PM, Niklas Laxström wrote:
On 22 September 2010 20:51, Neil Kandalgaonkar neilk@wikimedia.org wrote:
The idea is that you can test for three things even before the translators get to it:
- which parts of the system aren't internationalized yet (they are
noticeable since they don't get this pseudolocalization)
Such code wouldn't likely survive in the code review.
But the idea is that the *developer* should catch it themselves, before review.
- text swell (150%-200% is a good ratio)
Web layout are usually very liquid, and thus we shouldn't have that much of a problem
And yet, we had such a problem today with the release of Article Feedback.
The layout worked fine when it said "2 ratings" but not when it was "2 Einschätzungen". The layout is still readable, but it doesn't flow the way they expected.
- any transformations that don't preserve unicode
Such as?
Hard to say. Currently PHP doesn't have a lot of gotchas like that, but I know they exist in other languages. Possibly they could still arise if someone interacts with extensions that involve non-PHP components.
For what it's worth, double html-escaping or not escaping at all is far more common problem.
Agreed, but I'm addressing a different issue.
(Also, I don't know how to solve that one in code. Do we even have a consistent policy on where escaping is supposed to happen? It seems to be all over the place.)
Does MediaWiki or interwiki do anything like this? Would you like it if we did?
It is probably impossible to do this, for the same reason we cannot do 'click this interface message to translate it on sight'. There is too much configuration stuff and other things abusing messages that break MediaWiki if the input is unexpected.
Maybe I'm not being clear about what I would like to do. Something like this:
The idea is to allow a developer to quickly see how their page might look in another language -- without learning that language, waiting for a translation, or otherwise involving anyone else.
A typical example is a monolingual English developer. They can flip their language prefs to xx-pseudo to see whether their layout is fully localized and handles text swell.
How we do this: in MediaWiki, we reserve the language xx-pseudo to mean pseudolocalization. If it is selected, instead of looking up the string in the appropriate message file, we look up the string in English and then apply a fast transform to swap certain characters and add padding. The only complicated part is making sure we don't affect embedded markup like {{PLURAL}}. It could be done in real time, without any stored message file.
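To make that more concrete, something along these lines (pseudocode, really: lookupEnglishMessage() is a stand-in for whatever the real lookup would be, and the markup handling is deliberately simplistic):

<?php
// Sketch only -- lookupEnglishMessage() is a stand-in for the real message
// lookup; the {{...}} handling below is deliberately simplistic.
function pseudoMessage( $key ) {
	$english = lookupEnglishMessage( $key );

	// Keep {{PLURAL:...}}, {{GENDER:...}} and $1-style parameters out of the
	// transform so the message still parses.
	$parts = preg_split(
		'/(\{\{.*?\}\}|\$\d+)/u',
		$english,
		-1,
		PREG_SPLIT_DELIM_CAPTURE
	);
	foreach ( $parts as $i => $part ) {
		// Even indexes are plain text; odd indexes are the markup we captured.
		if ( $i % 2 === 0 ) {
			$parts[$i] = strtr( $part, array(
				'a' => 'å', 'e' => 'ë', 'i' => 'ï', 'o' => 'ø', 'u' => 'ü',
				's' => '§', 't' => '†',
			) );
		}
	}

	// Pad to simulate the text swell we expect from real translations.
	$pad = str_repeat( '_', (int)ceil( strlen( $english ) / 2 ) );
	return $pad . ' ' . implode( '', $parts ) . ' ' . $pad;
}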
We could even have other pseudos to mimic Chinese or Arabic.
This does privilege the English-speaking developer, but they are typically the biggest problem. ;)
Hoi,
There are scripts that need more space to be legible. There are scripts that are written top-down; development is under way to support the SignWriting script in MediaWiki, which allows for the writing of sign languages.
Thanks,
GerardM
On 23 September 2010 00:59, Neil Kandalgaonkar neilk@wikimedia.org wrote:
Maybe I'm not being clear about what I would like to do. Something like this:
The idea is to allow a developer to quickly see how their page might look in another language -- without learning that language, waiting for a translation, or otherwise involving anyone else.
How we do this: in MediaWiki, we reserve the language xx-pseudo to mean pseudolocalization. If it is selected, instead of looking up the string in the appropriate message file, we look up the string in English and then apply a fast transform to swap certain characters and add padding. The only complicated part is making sure we don't affect embedded markup like {{PLURAL}}. It could be done in real time, without any stored message file.
You are going to have problems with messages like sidebar, mainpage and others that expect a certain kind of input.
-Niklas
On 9/24/10 12:49 AM, Niklas Laxström wrote:
You are going to have problems with messages like sidebar, mainpage and others excepting certain kind of input.
Oh, I see what you're saying now. You mean how MediaWiki abuses the message database for things which are more like configuration, e.g. MediaWiki:Sidebar.
Hm. I'm not sure how to fix that at all. Other than somehow cataloging all the message strings that are used like that.
Come to think of it, what do we do to ensure that these messages are not translated? Or does that not matter since such messages originate in a different place?
On 24 September 2010 20:00, Neil Kandalgaonkar neilk@wikimedia.org wrote:
On 9/24/10 12:49 AM, Niklas Laxström wrote:
You are going to have problems with messages like sidebar, mainpage and others excepting certain kind of input.
Oh, I see what you're saying now. You mean how MediaWiki abuses the message database for things which are more like configuration, e.g. MediaWiki:Sidebar.
Hm. I'm not sure how to fix that at all. Other than somehow cataloging all the message strings that are used like that.
Come to think of it, what do we do to ensure that these messages are not translated? Or does that not matter since such messages originate in a different place?
Some of them are translated (maybe just partially), some may optionally be translated, and some are not available for translation in translatewiki.net at all. See maintenance/language/messageTypes.inc for the optional and ignored messages respectively. Unfortunately, the criterion for those tags is exactly the same as what we would need for this use case.
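If someone did implement xx-pseudo anyway, the transform could at least consult those same lists and pass such messages through untouched. Roughly (the variable names are from memory, so check the file):

<?php
// Rough idea only. $wgIgnoredMessages and $wgOptionalMessages are the arrays
// I believe maintenance/language/messageTypes.inc defines (names from memory).
function shouldPseudolocalize( $key ) {
	global $wgIgnoredMessages, $wgOptionalMessages;
	return !in_array( $key, $wgIgnoredMessages )
		&& !in_array( $key, $wgOptionalMessages );
}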
-Niklas
On 22-09-2010 (Wed), at 10:51 -0700, Neil Kandalgaonkar wrote:
So, a designer here at the foundation just found one of their designs doesn't work so well for German, which has significant "text swell" versus English.
That just reminded me of a technique I've seen used elsewhere, pseudolocalization. Basically the system has a pseudo-locale, something like "xx-pseudo", where
"Some text"
is automatically transformed to something like
"____ §ømë †ëx† ____"
The idea is that you can test for three things even before the translators get to it:
- which parts of the system aren't internationalized yet (they are
noticeable since they don't get this pseudolocalization)
- text swell (150%-200% is a good ratio)
- any transformations that don't preserve unicode
Does MediaWiki or interwiki do anything like this? Would you like it if we did?
Yes, actually, I think this would be a very useful feature. We have had issues with tab sizes, buttons in the layout and other such things where there is overlap or other surprises, so being able to catch that ahead of time would be good, at a minimum.
Ariel
Neil Kandalgaonkar wrote:
So, a designer here at the foundation just found one of their designs doesn't work so well for German, which has significant "text swell" versus English.
That just reminded me of a technique I've seen used elsewhere, pseudolocalization. Basically the system has a pseudo-locale, something like "xx-pseudo", where
"Some text"
is automatically transformed to something like
"____ §ømë †ëx† ____"
The idea is that you can test for three things even before the translators get to it:
- which parts of the system aren't internationalized yet (they are
noticeable since they don't get this pseudolocalization)
- text swell (150%-200% is a good ratio)
- any transformations that don't preserve unicode
Does MediaWiki or interwiki do anything like this? Would you like it if we did?
If you just want to check that you don't have literals there, we could set up a language with no translations, which would always show the <message-name>. That would be useful for message lookup by translators.