[Mediawiki-l] Sudden problem with some greek and cyrillic letters

Sylvain Machefert iubito at gmail.com
Fri May 4 15:03:48 UTC 2007


Hi Brion,
What is strange is that only the titles are affected, not the content of
the pages. Is that normal?

I wrote a small script that lists the erroneous titles, e.g.:
ID      | utf8_decode                             | what is in the DB
1036 | Στης πίκ?ώя �Ď юގՏ?όνησα | Στης πίκÏ?ας τα
ξεÏ?όνησα
1039 | Συννεφιασμένη_Κυ?ώَюڎ         |
ΣυννεφιασμÎνη_ΚυÏ?ιακÎ(r)
3597 | Από_β?ώюԏ?ς_ξεκίνησα        | Î'πό_βÏ?αδÏ?Ï‚_ξεκίνησα
...
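
A listing like the one above can be produced by checking each stored title for valid UTF-8. The following is only a minimal sketch in Python, not the script used here: it assumes the titles have already been fetched as raw bytes, and the `rows` sample and the `find_bad_titles` helper are hypothetical names.

```python
def find_bad_titles(rows):
    """Return (page_id, lossy_text) for every title that is not valid UTF-8."""
    bad = []
    for page_id, raw_title in rows:
        try:
            raw_title.decode("utf-8")
        except UnicodeDecodeError:
            # Keep a readable version, with U+FFFD where decoding failed.
            bad.append((page_id, raw_title.decode("utf-8", errors="replace")))
    return bad


# Hypothetical sample: one intact title, and one whose 0x81 byte
# (second byte of the Greek rho) was replaced by "?".
rows = [
    (1, "ξεκίνησα".encode("utf-8")),
    (2, "πίκρας".encode("utf-8").replace(b"\x81", b"?")),
]

for page_id, text in find_bad_titles(rows):
    print(page_id, text)
```

Only the second row is reported, because a lead byte followed by a plain "?" is not a valid UTF-8 sequence.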
It happens on the Greek letter rho (ρ), which is encoded in UTF-8 as Ï followed
by a byte displayed as a "?" in a lozenge (�).
Now it is stored as Ï followed by a literal "?", which decodes back to "?" and
shifts the decoding of what follows.
In 3597 there are two rhos, so the shift happens twice: the beginning (before
the first "?") and the end (after the second "?") are OK, but the part between
the two "?" is bad.
The utf8_decode of what is in the DB is bad between the two "?" --> ώюԏ
Searching for "ξεκίνησα" on my wiki returns the result Από β�?αδ�?ς ξεκίνησα,
but the article cannot be displayed.
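
The byte-level effect described above can be reproduced in a few lines of Python. This is a sketch under one assumption: that the lossy step simply replaced the second byte of the rho (0x81) with a literal "?". The variable names are illustrative only.

```python
rho = "ρ"
good = rho.encode("utf-8")             # two bytes: 0xCF 0x81
damaged = good.replace(b"\x81", b"?")  # assumed lossy step: 0x81 becomes "?"

# The intact bytes, in hex.
print(good.hex())

# Read as latin1, the damaged bytes look like the "what is in the DB" column.
print(damaged.decode("latin-1"))

# Read as UTF-8, the orphaned lead byte becomes U+FFFD and the "?" survives,
# which matches the "�?" seen in the search result.
print(damaged.decode("utf-8", errors="replace"))
```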

The only fix I have found is to update the database and set the title to TMP,
then, in the wiki, rename the page TMP to the right name and delete TMP.
Do you see an easier way to solve it?
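
One possible direction, sketched here in Python, is to generate candidate repairs for a damaged title and let a human pick the right one. It rests on an assumption: that the bytes turned into "?" are the five values with no cp1252 mapping (0x81, 0x8D, 0x8F, 0x90, 0x9D). The `repair_candidates` helper is hypothetical and only handles two-byte UTF-8 sequences, which cover Greek, Cyrillic and Latin diacritics.

```python
# Bytes with no cp1252 mapping; assumed to be the ones that were lost.
LOST_BYTES = (0x81, 0x8D, 0x8F, 0x90, 0x9D)


def repair_candidates(damaged):
    """Try every assumed lost byte wherever a 2-byte lead byte precedes '?'."""
    results = [b""]
    i = 0
    while i < len(damaged):
        byte = damaged[i]
        nxt = damaged[i + 1] if i + 1 < len(damaged) else None
        if 0xC2 <= byte <= 0xDF and nxt == 0x3F:
            # Lead byte followed by "?": branch on each possible lost byte.
            results = [r + bytes([byte, lost]) for r in results for lost in LOST_BYTES]
            i += 2
        else:
            results = [r + bytes([byte]) for r in results]
            i += 1
    candidates = []
    for raw in results:
        try:
            candidates.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            pass  # skip combinations that are still not valid UTF-8
    return candidates


damaged = "πίκρας".encode("utf-8").replace(b"\x81", b"?")
for candidate in repair_candidates(damaged):
    print(candidate)
```

The number of candidates grows as 5 to the power of the number of damaged characters, so this is only practical for titles with a few damaged letters, and the correct spelling still has to be chosen by eye.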

2007/5/4, Brion Vibber <brion at wikimedia.org>:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Sylvain Machefert wrote:
> > Today, a lot of article titles with Latin
> > diacritics, Greek or Cyrillic letters have problems.
> > See for example the list of articles on the page
> > http://tousauxbalkans.jexiste.fr/Bulgarie
> >
> > My host often upgrades the Linux kernel, PHP or MySQL. One of these
> > upgrades is the only cause I can see for my problem.
> > Here are my current versions:
> > http://tousauxbalkans.jexiste.fr/Special:Version
> > * MediaWiki: 1.9.3
> > * PHP: 5.2.0-8+etch3 (cgi-fcgi)
> > * MySQL: 4.1.11-Debian_4sarge7-log
>
> You or your host may have corrupted your data by dumping it with
> mysqldump without the proper options, thus causing data loss through the
> lossy charset conversion (latin1->utf8->latin1), which damages the UTF-8
> data stored in the fields.
>
> - -- brion vibber (brion @ wikimedia.org)
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.2.2 (Darwin)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFGOzUowRnhpk1wk44RAl3QAJ9vUrA/mF47GLkEF8XCZQsmZpDf2wCeL0em
> 2tMqAIpr9V2hO7uk+L35epw=
> =R4/i
> -----END PGP SIGNATURE-----



-- 
Sylvain Machefert - in Romania from 25 April to 2 May -
http://iubito.free.fr
http://tousauxbalkans.jexiste.fr

