Hi Brion, what is strange is that only the titles are affected, not the content of the pages. Is that normal?
I wrote a little script that lists the erroneous titles, i.e.:

ID | utf8_decode | what is in the DB
1036 | Στης πίκ?ώя �Ď юގՏ?όνησα | Στης πίκÏ?ας τα ξεÏ?όνησα
1039 | Συννεφιασμένη_Κυ?ώَюڎ | ΣυννεφιασμÎνη_ΚυÏ?ιακÎ(r)
3597 | Από_β?ώюԏ?ς_ξεκίνησα | Î'πό_βÏ?αδÏ?Ï‚_ξεκίνησα
...

It happens on the Greek "rho", which used to be stored in UTF-8 as Ï followed by a "?" in a lozenge (�). Now it is stored as Ï followed by a plain "?", which stays a "?" and throws off the decoding of what follows. In 3597 there are two damaged characters, so the decoding is thrown off twice: the beginning (before the first "?") and the end (after the second "?") are OK, and the part between the two "?" is bad. The utf8_decode of what is in the DB is bad between the two "?" --> ώюԏ

Searching for "ξεκίνησα" on my wiki returns the result "Από β�?αδ�?ς ξεκίνησα", but the article is not visible.
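To make the byte-level picture concrete, here is a minimal Python sketch. It rests on an assumption that is not confirmed for this host: that somewhere in the conversion chain, the five bytes Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) were replaced by "?". The helper names `fragile_chars` and `simulate_damage` are hypothetical and exist only for this illustration.

```python
# Bytes that Windows-1252 leaves undefined. Assumption for this sketch:
# the lossy conversion replaced exactly these bytes with "?".
CP1252_UNDEFINED = {0x81, 0x8D, 0x8F, 0x90, 0x9D}


def fragile_chars(first: int, last: int) -> list[str]:
    """Characters in [first, last] whose UTF-8 form contains one of those bytes."""
    return [
        chr(cp)
        for cp in range(first, last + 1)
        if CP1252_UNDEFINED & set(chr(cp).encode("utf-8"))
    ]


def simulate_damage(title: str) -> bytes:
    """Encode as UTF-8, then replace each undefined cp1252 byte with b'?'."""
    return bytes(
        0x3F if b in CP1252_UNDEFINED else b
        for b in title.encode("utf-8")
    )


if __name__ == "__main__":
    # Greek small letters alpha (U+03B1) to omega with tonos (U+03CE).
    print(fragile_chars(0x03B1, 0x03CE))
    # A word with two fragile letters, and one with none.
    print(simulate_damage("βραδύς"))
    print(simulate_damage("ξεκίνησα") == "ξεκίνησα".encode("utf-8"))
```

Under that assumption, rho (bytes CF 81) is not the only lowercase Greek letter at risk: upsilon with tonos (bytes CF 8D) is hit in the same way, which would be consistent with a title such as 3597 showing two "?" while "ξεκίνησα" stays intact.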
The only fix I have found is to update the database and set the title to TMP, then, in the wiki, rename the page TMP to the right name and delete TMP. Do you see an easier way to solve it?
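As a possible aid to that manual fix, the following Python sketch lists candidate repairs for one damaged title. It is not a tested recovery procedure. It assumes that each lost byte was one of the five bytes Windows-1252 leaves undefined, and it only handles two-byte UTF-8 sequences. The function name `repair_candidates` is hypothetical, and writing a chosen title back to the database is deliberately left out.

```python
from itertools import product

# Assumption: each lost byte was one of the bytes cp1252 leaves undefined.
CP1252_UNDEFINED = (0x81, 0x8D, 0x8F, 0x90, 0x9D)


def repair_candidates(raw: bytes) -> list[str]:
    """Return every valid UTF-8 string obtained by restoring a '?'
    that directly follows a two-byte UTF-8 lead byte (0xC2..0xDF)."""
    # A real "?" can never follow a lead byte in valid UTF-8,
    # so each such position marks a damaged character.
    sites = [
        i for i in range(1, len(raw))
        if raw[i] == 0x3F and 0xC2 <= raw[i - 1] <= 0xDF
    ]
    results = []
    # Try every combination of replacement bytes over all damaged positions.
    for combo in product(CP1252_UNDEFINED, repeat=len(sites)):
        patched = bytearray(raw)
        for pos, byte in zip(sites, combo):
            patched[pos] = byte
        try:
            results.append(patched.decode("utf-8"))
        except UnicodeDecodeError:
            # Skip combinations that still do not form valid UTF-8.
            pass
    return results


if __name__ == "__main__":
    damaged = b"\xce\xb2\xcf?\xce\xb1\xce\xb4\xcf?\xcf\x82"
    for candidate in repair_candidates(damaged):
        print(candidate)
```

The number of candidates grows as 5 to the power of the number of damaged characters, so a person who reads the language still has to pick the right spelling. The sketch only narrows down the choices.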
2007/5/4, Brion Vibber brion@wikimedia.org:
Sylvain Machefert wrote:
Today, a lot of article titles with Latin diacritics, Greek or Cyrillic letters have problems. See for example the list of articles on the page http://tousauxbalkans.jexiste.fr/Bulgarie
My host often upgrades the Linux kernel, PHP or MySQL. One of these upgrades is the only cause I can see for my problem. Here are my current versions:
http://tousauxbalkans.jexiste.fr/Special:Version
- MediaWiki: 1.9.3
- PHP: 5.2.0-8+etch3 (cgi-fcgi)
- MySQL: 4.1.11-Debian_4sarge7-log
You or your host may have corrupted your data by dumping it with mysqldump without the proper options, thus causing data loss through the lossy charset conversion (latin1->utf8->latin1), which damages the UTF-8 data stored in the fields.
-- brion vibber (brion @ wikimedia.org)