Unicode's bidirectional algorithm often fails where there are RTL characters, LTR characters and neutrals such as punctuation in the same paragraph. Often this can be fixed by liberal sprinkling of either the RLM character (in base RTL text) or the LRM character (in base LTR text).
Putting these characters directly into the article text makes such changes difficult to review and edit, since they are invisible in the edit box in major browsers. A better solution is to use HTML's &lrm; and &rlm; character entities.
By happy coincidence, &lrm; has roughly the same effect in the edit box as it does in display, because the Latin characters "lrm" are of strong left-to-right type, just like the control character they represent. The same is not so for &rlm;, meaning that in cases where &rlm; is used, the text remains broken on edit while being fixed on display. Here's an example:
http://he.wikipedia.org/wiki/ACID
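The point about strong directionality can be checked against the Unicode character database. The following is only a minimal sketch, assuming Python's standard unicodedata module (nothing in this thread uses it); the helper name bidi_classes is made up for illustration.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, what &lrm; decodes to
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, what &rlm; decodes to


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# The control characters themselves: strong L and strong R.
print(bidi_classes(LRM), bidi_classes(RLM))

# The entity as typed in the edit box: neutral '&' and ';'
# around three strong left-to-right Latin letters.
print(bidi_classes("&rlm;"))
```

In the edit box, the letters of "rlm" are class L while the character they stand for is class R, which is why the source text of an entity meant to force right-to-left behaves as left-to-right.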
What I propose is that someone should come up with a translation of "rlm" into Hebrew, Arabic or both, and that we should implement this artificial character entity in the MediaWiki parser.
-- Tim Starling
Possible Hebrew translation (in fact, transliteration, since other options make no sense) is "רלמ".
On 4/28/07, Tim Starling tstarling@wikimedia.org wrote:
> Putting these characters directly into the article text makes such changes difficult to review and edit, since they are invisible in the edit box in major browsers. A better solution is to use HTML's &lrm; and &rlm; character entities.
> By happy coincidence, &lrm; has roughly the same effect in the edit box as it does in display, because the latin characters "lrm" are of strong left-to-right type, just like the control character they represent. The same is not so for &rlm;, meaning that in cases where &rlm; is used, the text remains broken on edit while being fixed on display. Here's an example:
An interesting solution. It's a shame we have to expose this technical gibberish to editors, but until WYSIWYG I guess it's our only option. Would it be best to alias all the RTL terms for &rtl; so they work in any wiki, or subst to the content language, or just not let foreign ones work?
On 4/28/07, Rotem Liss rotemliss_net@fastmail.fm wrote:
> Possible Hebrew translation (in fact, transliteration, since other options make no sense) is "רלמ".
"rlm" stands for "right-to-left mark", so I guess it could be translated סימן ימין לשמאל or something, and abbreviated סילש. Not that that would be particularly more enlightening in any case (probably more confusing than a transliteration if anything).
Simetrical wrote:
> An interesting solution. It's a shame we have to expose this technical gibberish to editors, but until WYSIWYG I guess it's our only option.
Even without full WYSIWYG, the RTL wikis would benefit from some amount of scripted editing assistance, such as inserting LRE/PDF codes around HTML tags in the edit window, while somehow preserving cursor movement and leaving display unaffected. But it's certainly more complicated to implement than translation.
The RTL wikis would in fact benefit from pervasive translation or transliteration of all of HTML and CSS, as well as the remaining untranslatable elements of wikitext such as <nowiki>.
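As a rough illustration of the LRE/PDF idea only: the sketch below wraps each HTML-like tag in LEFT-TO-RIGHT EMBEDDING and POP DIRECTIONAL FORMATTING when text is loaded for editing, and strips those pairs again before saving so that display is unaffected. The function names and the simple tag pattern are assumptions made for this example, it is not MediaWiki code, and it does nothing about the cursor-movement problem.

```python
import re

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Deliberately naive: anything between '<' and '>' counts as a tag.
TAG_RE = re.compile(r"<[^<>]+>")
WRAPPED_TAG_RE = re.compile(LRE + r"(<[^<>]+>)" + PDF)


def wrap_tags_ltr(text):
    """Surround each tag with LRE ... PDF for display in the edit window."""
    return TAG_RE.sub(lambda m: LRE + m.group(0) + PDF, text)


def unwrap_tags_ltr(text):
    """Remove the LRE ... PDF pairs added by wrap_tags_ltr before saving."""
    return WRAPPED_TAG_RE.sub(r"\1", text)
```

Stripping only the pairs that directly enclose a tag leaves any directional characters an editor typed elsewhere untouched.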
> Would it be best to alias all the RTL terms for &rtl; so they work in any wiki, or subst to the content language, or just not let foreign ones work?
Assuming you mean &rlm;, yes, I think the aliases (or at least the more common ones, should we choose to do this for smaller languages) should work on all wikis. Use of &rlm; is quite rare in left-to-right text, but whatever use there is for it will be better served by using the translation. We could add it to edittools. And of course it will be useful for the multilingual wikis like Commons.
> On 4/28/07, Rotem Liss rotemliss_net@fastmail.fm wrote:
>> Possible Hebrew translation (in fact, transliteration, since other options make no sense) is "רלמ".
> "rlm" stands for "right-to-left mark", so I guess it could be translated סימן ימין לשמאל or something, and abbreviated סילש. Not that that would be particularly more enlightening in any case (probably more confusing than a transliteration if anything).
Understood.
-- Tim Starling
The two suggested aliases for &rlm;, &רלמ; and &رلم;, have been enabled experimentally on Wikimedia wikis. Tell me what you think.
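For readers wondering what such an alias amounts to, here is a minimal sketch, not the actual MediaWiki parser code: a lookup table maps the standard name and the two transliterated names to U+200F, and any other entity is left alone. The table, the regular expression and the function name are assumptions made for illustration.

```python
import re

RLM = "\u200f"  # RIGHT-TO-LEFT MARK

# Hypothetical alias table: entity name -> character.
# Escapes are used so the source itself is not reordered by bidi display.
RLM_ALIASES = {
    "rlm": RLM,
    "\u05e8\u05dc\u05de": RLM,  # Hebrew transliteration: resh, lamed, mem
    "\u0631\u0644\u0645": RLM,  # Arabic transliteration: reh, lam, meem
}

# In Python 3, \w matches Unicode letters, including Hebrew and Arabic.
ENTITY_RE = re.compile(r"&(\w+);")


def expand_rlm_aliases(text):
    """Replace known RLM entity aliases with U+200F; leave the rest as-is."""
    def repl(match):
        return RLM_ALIASES.get(match.group(1), match.group(0))
    return ENTITY_RE.sub(repl, text)
```

Because the alias names are made of strong right-to-left letters, they behave in the edit box roughly the way the mark behaves on display, which is the effect &lrm; already has by coincidence.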
-- Tim Starling
Thanks Tim!
But unfortunately I cannot tell you what I think as I don't know how to use it or what I expect from it ;) I want to understand.. so that I can write about it and other users could use it too...
&alnokta
On 5/4/07, Mohamed Magdy mohamed.m.k@gmail.com wrote:
> Thanks Tim!
> But unfortunately I cannot tell you what I think as I don't know how to use it or what I expect from it ;) I want to understand.. so that I can write about it and other users could use it too...
See [[Right-to-left mark]]
http://en.wikipedia.org/wiki/Right-to-left_mark
-- John
Tim Starling wrote:
> By happy coincidence, &lrm; has roughly the same effect in the edit box as it does in display, because the latin characters "lrm" are of strong left-to-right type, just like the control character they represent. The same is not so for &rlm;, meaning that in cases where &rlm; is used, the text remains broken on edit while being fixed on display. Here's an example:
I'm not seeing a difference :(
> What I propose is that someone should come up with a translation of "rlm" into Hebrew, Arabic or both, and that we should implement this artificial character entity in the MediaWiki parser.
You will need two or even more translations :) one for each language (ar, fa, he, yi, ur, ug, ku).
For Arabic, a transliteration would be "رلم", a translation would be "علامة يمين إلى شمال", and an abbreviation would be "عيش". Choose what you prefer :)
> -- Tim Starling
Thanks Tim for your efforts..I appreciate it :)
wikitech-l@lists.wikimedia.org