Hi Brion,
What is strange is that only the titles are affected, not the content of
the pages. Is that normal?
I wrote a small script which lists the erroneous titles, e.g.:
ID | utf8_decode | what is in the DB
1036 | Στης πίκ?ώя �Ď юގՏ?όνησα | Στης πίκÏ?ας τα ξεÏ?όνησα
1039 | Συννεφιασμένη_Κυ?ώَюڎ | ΣυννεφιασμÎνη_ΚυÏ?ιακÎ(r)
3597 | Από_β?ώюԏ?ς_ξεκίνησα | Î'πό_βÏ?αδÏ?Ï‚_ξεκίνησα
...
It happens on the Greek letter "ro" (rho), which is encoded in UTF-8 as Ï
followed by a "?" in a lozenge (�).
Now it is stored as Ï followed by a plain "?", which decodes back to "?" and
throws the decoding out of step.
In title 3597 there are two "ro", so the decoding goes out of step twice: the
beginning (before the first "?") and the end (after the second "?") are OK,
but the part between the two "?" is wrong.
The utf8_decode of what is in the DB is wrong between the "?" --> ώюԏ
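
To illustrate the effect described above, here is a minimal Python sketch of a lossy utf8 -> latin1 -> utf8 round trip. It assumes the conversion step used a Windows-1252-style table (Python's "cp1252" codec), in which bytes 0x81 and 0x8D have no character; whether the host's tools behaved exactly like this is an assumption. The helper name lossy_round_trip is made up for the example.

```python
def lossy_round_trip(text: str) -> bytes:
    """Encode as UTF-8, misread the bytes as cp1252, then write them back as cp1252.

    Bytes with no cp1252 character become U+FFFD on the way in
    and a plain "?" (0x3F) on the way out, so the original byte is lost.
    """
    misread = text.encode("utf-8").decode("cp1252", errors="replace")
    return misread.encode("cp1252", errors="replace")


for letter in "ρύα":
    raw = letter.encode("utf-8")
    print(letter, raw.hex(), lossy_round_trip(letter))
# ρ cf81 b'\xcf?'
# ύ cf8d b'\xcf?'
# α ceb1 b'\xce\xb1'
```

Under that assumption, "ρ" (bytes CF 81) comes back as CF 3F, which is no longer valid UTF-8, while "α" survives unchanged. The same sketch suggests that "ύ" (bytes CF 8D) is hit in the same way.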
Searching for "ξεκίνησα" on my wiki returns the result: Από β�?αδ�?ς ξεκίνησα,
but the article cannot be displayed.
The only workaround I have found is to update the database and set the title to TMP,
then, in the wiki, rename the page TMP to the right name and delete TMP.
Do you see an easier way to solve it?
2007/5/4, Brion Vibber <brion(a)wikimedia.org>:
Sylvain Machefert wrote:
Today, a lot of article titles with Latin
diacritics, Greek or Cyrillic letters have problems.
See for example the list of articles on the page
http://tousauxbalkans.jexiste.fr/Bulgarie
My host often upgrades the Linux kernel, PHP or MySQL. I can only see one of these
upgrades as the cause of my problem.
Here are my current versions:
http://tousauxbalkans.jexiste.fr/Special:Version
* MediaWiki: 1.9.3
* PHP: 5.2.0-8+etch3 (cgi-fcgi)
* MySQL: 4.1.11-Debian_4sarge7-log
You or your host may have corrupted your data by dumping it with
mysqldump without the proper options, causing data loss through the
lossy charset conversion (latin1->utf8->latin1), which damages the UTF-8
data stored in the fields.
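
As a rough way to tell which stored titles are still recoverable after such a conversion, a sketch along these lines might help. It assumes the damaged text is UTF-8 that was misread through a cp1252-style table; try_repair is a hypothetical helper, not part of MediaWiki or MySQL.

```python
from typing import Optional


def try_repair(stored: str) -> Optional[str]:
    """Undo a UTF-8-read-as-cp1252 misreading, or return None if bytes were lost.

    A title whose bytes all survived converts back cleanly; one where a byte
    was replaced by "?" no longer forms valid UTF-8 and cannot be rebuilt.
    """
    try:
        return stored.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


# Hypothetical sample: a word whose UTF-8 bytes all have cp1252 characters.
intact = "ξεκίνησα".encode("utf-8").decode("cp1252")
print(try_repair(intact))   # ξεκίνησα
print(try_repair("Ï?"))     # None
```

Titles for which this returns None would have to be restored from an undamaged dump or retyped by hand.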
-- brion vibber (brion @ wikimedia.org)
--
Sylvain Machefert - in Romania from 25 April to 2 May -
http://iubito.free.fr
http://tousauxbalkans.jexiste.fr