Can somebody please look into this? This was adequately described by Timwi on this list, and I include the final message from the discussion. I certainly tried to correct the problem by re-editing and correcting the offending messages, but alas! This behaviour appears only in the new monobook interface; it works correctly if the user switches to other styles. Please help.
Thanx ank
Brion Vibber wrote:
Timwi wrote:
On second inspection, it definitely does *not* look fine! http://meta.wikipedia.org/upload/3/38/Greek_problem_screenshot.png This seems to be a problem with the software. The messages in the MediaWiki namespace of [[el:]] are correct.
Someone seriously needs to fix that, quickly.
Use the letter "i" instead of the number "1" in "&pi;" and "&phi;".
Brion, if you read my message, you will see that I said "The messages in the MediaWiki namespace of [[el:]] are correct." They are using neither "&pi;" nor "&p1;", but the correct UTF-8 character for Greek Pi.
Please fix this. (When Wikipedia is back up ...)
Thanks, Timwi
Andreas Kasenides wrote:
Can somebody please look into this? This was adequately described by Timwi on this list, and I include the final message from the discussion. I certainly tried to correct the problem by re-editing and correcting the offending messages, but alas! This behaviour appears only in the new monobook interface; it works correctly if the user switches to other styles.
Okay, I managed to track this one down... turns out it's actually a PHP bug, which is triggered by the use of the htmlentities() function in the PHPTal template engine. PHP 4.3.7, released about a week ago, lists this as a bug fix for that release, but I've worked around it temporarily by switching to htmlspecialchars() which doesn't mess with the main part of the text.
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
Andreas Kasenides wrote:
Can somebody please look into this? This was adequately described by Timwi on this list, and I include the final message from the discussion. I certainly tried to correct the problem by re-editing and correcting the offending messages, but alas! This behaviour appears only in the new monobook interface; it works correctly if the user switches to other styles.
Okay, I managed to track this one down... turns out it's actually a PHP bug, which is triggered by the use of the htmlentities() function in the PHPTal template engine. PHP 4.3.7, released about a week ago, lists this as a bug fix for that release, but I've worked around it temporarily by switching to htmlspecialchars() which doesn't mess with the main part of the text.
Could you describe to us who are technically interested but not very familiar with the code ;-) what exactly was wrong?
What I don't understand about it is why the text needs to go through any processing at all that would generate any HTML entities whatsoever. Why can't the UTF-8 text in the MediaWiki namespace be output directly?
Thanks, Timwi
On Thu, 2004-06-10 at 15:07 +0100, Timwi wrote:
What I don't understand about it is why the text needs to go through any processing at all that would generate any HTML entities whatsoever. Why can't the UTF-8 text in the MediaWiki namespace be output directly?
Things like & need to be escaped for xhtml validation. This is done automatically by phptal for attributes and content if you don't specify the 'structure' prefix and by tidy (if enabled) or php regexes for the content.
Gabriel Wicke
Gabriel Wicke wrote:
On Thu, 2004-06-10 at 15:07 +0100, Timwi wrote:
What I don't understand about it is why the text needs to go through any processing at all that would generate any HTML entities whatsoever. Why can't the UTF-8 text in the MediaWiki namespace be output directly?
Things like & need to be escaped for xhtml validation. This is done automatically by phptal for attributes and content if you don't specify the 'structure' prefix and by tidy (if enabled) or php regexes for the content.
Oh, OK, I forgot about that ... but doesn't that only affect &, < and >, and at most possibly also " and '? I don't see why Greek letters would need to be entity-ised.
Timwi
Timwi wrote:
Oh, OK, I forgot about that ... but doesn't that only affect &, < and >, and at most possibly also " and '? I don't see why Greek letters would need to be entity-ised.
PHP has two separate functions for escaping things to html:
* htmlspecialchars() just does &, <, >, and quotes
* htmlentities() does everything it possibly can
htmlentities() might be useful if you're not sure what the charset encoding of the final output will be. However, since we *do* know, we don't need that level of conversion. That's why I switched the instance of it in the PHPTal template code to use htmlspecialchars(), which doesn't touch the Greek letters and so doesn't trigger the bug in htmlentities().
This isn't code we wrote ourselves, so don't ask us why they used that function. ;)
-- brion vibber (brion @ pobox.com)
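A minimal sketch of the difference between the two functions, for anyone who wants to try it. This is an illustration only, not the PHPTal code that was changed: the variable names and the sample string are made up, and the charset argument is passed explicitly on the assumption that the page is served as UTF-8. The Greek letters are written as byte escapes so the result does not depend on how the file itself is saved.

```php
<?php
// Hypothetical sample text: "\xCF\x80" and "\xCF\x86" are the UTF-8
// bytes for Greek pi and phi, followed by some markup-special characters.
$message = "\xCF\x80 & \xCF\x86 <b>";

// Escapes only the characters that are special in markup (&, <, >, quotes).
$minimal = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');

// Additionally replaces every character that has a named HTML entity.
$full = htmlentities($message, ENT_QUOTES, 'UTF-8');

echo $minimal, "\n";
echo $full, "\n";
```

On a PHP build without the htmlentities() bug, the first line printed keeps the Greek letters as raw UTF-8 and escapes only &, < and >, while the second also turns pi and phi into &pi; and &phi;.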
Brion Vibber wrote:
Timwi wrote:
Oh, OK, I forgot about that ... but doesn't that only affect &, < and >, and at most possibly also " and '? I don't see why Greek letters would need to be entity-ised.
htmlentities() might be useful if you're not sure what the charset encoding of the final output will be. However, since we *do* know, we don't need that level of conversion. [... etc. ...] This isn't code we wrote ourselves, so don't ask us why they used that function. ;)
Thank you. Very clear explanation. Dankon, Brion. :-)
Timwi
Thank you thank you for a quick fix. It looks Greek now.
Andreas
--
Andreas Kasenides e-mail: Andreas.Kasenides_at_cs.ucy.ac.cy (replace the _at_ above with @)
Wikitech-l mailing list Wikitech-l@Wikipedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l