On Wednesday 16 June 2004 10:16, Nikola Smolenski wrote:
I am thinking
about an even simpler solution. Have a server-side script
convert articles and their histories to UTF-8. Have a postprocessor
(written in C) tell whether a page is in UTF-8 and change the appropriate
meta tag if it is. It's vastly improbable that a page which is not in UTF-8
would be misdetected as UTF-8; this could be checked against a database
dump, and I don't believe that any such page would be found. When all pages
are converted, the site could be switched to UTF-8 and
the postprocessor turned off.
This could even be done without a postprocessor: PHP has an
mb_detect_encoding() function which does exactly that.
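For what it's worth, a minimal sketch of the detection step (assuming the
mbstring extension is loaded; the strict flag is my addition):

```php
<?php
// mb_detect_encoding() tries each candidate encoding in order and returns
// the first one the byte string is valid in. The strict flag (third
// argument) makes it reject invalid byte sequences outright.
$latin1 = "caf\xE9";      // "café" in ISO-8859-1; a lone 0xE9 is invalid UTF-8
$utf8   = "caf\xC3\xA9";  // "café" in UTF-8

echo mb_detect_encoding($latin1, "UTF-8,ISO-8859-1", true), "\n"; // ISO-8859-1
echo mb_detect_encoding($utf8,   "UTF-8,ISO-8859-1", true), "\n"; // UTF-8
```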
Quick, dirty, and it seems to work. I know that buffering is slow, but this
would only be a temporary solution. When the hairs on your head settle down,
let's talk about it :)
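To make the idea concrete outside of OutputPage, here is a self-contained
sketch of the same buffer-and-patch pattern; render_page() is a hypothetical
stand-in for $sk->outputPage(), and I pass the strict flag to
mb_detect_encoding(), which the patch below omits:

```php
<?php
// Capture everything the renderer echoes, flip the charset declaration
// if the body turns out to be valid UTF-8, then emit the result.
// Requires the mbstring extension.
function render_page() {
    echo '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">';
    echo "caf\xC3\xA9";  // body bytes that are actually UTF-8
}

ob_start();
render_page();
$output = ob_get_clean();  // ob_get_contents() + ob_end_clean() in one call

if (mb_detect_encoding($output, "UTF-8,ISO-8859-1", true) == "UTF-8") {
    $output = preg_replace("/charset=iso-8859-1/", "charset=utf-8", $output, 1);
}
echo $output;
```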
Index: OutputPage.php
===================================================================
RCS file: /cvsroot/wikipedia/phase3/includes/OutputPage.php,v
retrieving revision 1.143
diff -u -3 -p -r1.143 OutputPage.php
--- OutputPage.php 20 May 2004 12:46:31 -0000 1.143
+++ OutputPage.php 16 Jun 2004 12:31:10 -0000
@@ -333,7 +333,13 @@ class OutputPage {
setcookie( $name, $val, $exp, "/" );
}
+ ob_start();
$sk->outputPage( $this );
+ $output=ob_get_contents();
+ ob_end_clean();
+
+	if(mb_detect_encoding($output,"UTF-8,ISO-8859-1")=="UTF-8")
+		$output=preg_replace("/charset=iso-8859-1/","charset=utf-8",$output,1);
+ echo $output;
# flush();
}