I already have PHP error reporting turned on (we have no security worries).
What I reported was that preg_replace_callback() returns an empty string (or possibly
null; I haven't had time to check the details yet) when passed an input string containing
possibly invalid UTF-8 characters. This happens when it is asked to do a replacement in
UTF-8 mode, even when it doesn't find any matches. A consequence of this in MW is that the
parser converts the stored version of an article to an empty string.
I don't think memory is an issue: the text is only 2K, and we handle much larger
articles with no problem. Besides, as I reported earlier, simply turning off UTF-8 matching
in MagicWord stops it happening.
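
For illustration, here is a minimal sketch of the behaviour described above. It assumes PHP 5.2 or later (for preg_last_error()). The sample text, the pattern and the upper_match() callback are made up for the example; they are not taken from MediaWiki or from the wiki's database.

```php
<?php
// Hypothetical callback; it is never invoked here because the pattern cannot match.
function upper_match($m) {
    return strtoupper($m[0]);
}

// Hypothetical article text containing one stray byte.
// 0xFF can never appear in well-formed UTF-8.
$text = "Some article text \xFF and some more text";

// Replacement in UTF-8 mode (/u). PCRE validates the whole subject before
// matching, so the call fails even though there is nothing to replace.
$utf = preg_replace_callback('/NO_SUCH_MAGIC_WORD/u', 'upper_match', $text);
$err = preg_last_error();

// The same replacement without /u treats the subject as plain bytes.
$raw = preg_replace_callback('/NO_SUCH_MAGIC_WORD/', 'upper_match', $text);

var_dump($utf === null);                 // the failed call returns NULL, not the input
var_dump($err === PREG_BAD_UTF8_ERROR);  // the error code names the cause
var_dump($raw === $text);                // byte mode returns the text unchanged
```

A caller that uses the return value without checking for NULL will end up treating the result as an empty string, which would match the blanking symptom.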
Unfortunately, I haven't had time to get a dump of exactly what's in the database,
but I guess that there is an invalid UTF-8 sequence there.
My question is, where is the bug? The Unicode standard requires the detection and
suppression of invalid sequences, but is it legitimate for preg to discard the whole text?
Also, is this an issue with how the upgrade from 1.6 to 1.9 worked, or is it still
possible to enter invalid sequences into an article? It seems we need either better
input filtering, or a database cleanup as part of the upgrade (or both).
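
The filtering and cleanup ideas could be sketched roughly as follows. This is only an assumption about one possible approach, not how MediaWiki does it. The helper names is_valid_utf8() and scrub_utf8() are hypothetical, and the cleanup step assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper: true if $s is well-formed UTF-8.
// An empty pattern in /u mode matches any valid subject (returns 1)
// and fails (returns false) on an invalid one.
function is_valid_utf8($s) {
    return preg_match('//u', $s) === 1;
}

// Hypothetical helper: converting UTF-8 to UTF-8 makes mbstring replace
// each invalid sequence with its substitute character, so the result is
// always well-formed. Requires the mbstring extension.
function scrub_utf8($s) {
    return mb_convert_encoding($s, 'UTF-8', 'UTF-8');
}

// Made-up sample: valid text with one stray 0xFF byte in the middle.
$stored = "abc\xFFdef";

// Only rewrite text that actually fails validation.
$clean = is_valid_utf8($stored) ? $stored : scrub_utf8($stored);

var_dump(is_valid_utf8($stored));  // false
var_dump(is_valid_utf8($clean));   // true
```

The same check could be applied to new input before it is saved, or run once over existing rows as part of an upgrade.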
The problem is highly annoying, and I'm worried about other side-effects of my
workaround. Feedback on that from people who understand MW internals would be most
welcome.
I guess I should do more investigation, but I thought I would bounce it off the list first,
in case there are known issues here. It'll have to wait in any case, as I'm going
to be scrubbing out the bilge of my boat all weekend. Help with that would be most
welcome too... ;-)
Ian
Ian Smith
Motorola | Good Technology Group
ismith(a)motorola.com
408-352-7467
4250 Burton Drive, Santa Clara, CA 95054
www.motorola.com/good
-----Original Message-----
From: Brion Vibber [mailto:brion@pobox.com]
Sent: Saturday, March 17, 2007 06:38 AM Pacific Standard Time
To: MediaWiki announcements and site admin list
Subject: Re: [Mediawiki-l] Possible PHP bug causes page blanking in 1.9.3
Ian Smith wrote:
> Think you may have replied to the wrong email here... ;-)

No, but if you're getting blank pages, you probably want to look at the
error messages instead of throwing them in /dev/null. :)
-- brion vibber (brion @ pobox.com / brion @ wikimedia.org)