[Mediawiki-l] Possible PHP bug causes page blanking in 1.9.3

Ian Smith ismith at good.com
Sat Mar 17 16:45:45 UTC 2007


I've already got PHP errors on (we have no security worries).

What I reported was that preg_replace_callback() returns an empty string (or null? I haven't had time to dig into the details yet) when passed an input string that may contain invalid UTF-8 sequences. This happens when it is asked to do a replacement in UTF-8 mode (the /u pattern modifier), even when it doesn't find any matches. A consequence of this in MW is that the parser converts the stored version of an article to an empty string.
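For anyone who wants to reproduce this outside MW, here is a minimal sketch of the behaviour and of one way to guard against it. It is not MediaWiki code: safeReplaceCallback(), the sample strings and the pattern are all made up for illustration. It relies on preg_replace_callback() returning NULL when PCRE reports an error, and on preg_last_error() giving PREG_BAD_UTF8_ERROR when a /u subject is malformed.

```php
<?php
// Hypothetical helper (not part of MediaWiki): wraps preg_replace_callback()
// so that a PCRE failure cannot silently turn the text into an empty value.
function safeReplaceCallback($pattern, $callback, $subject)
{
    $result = preg_replace_callback($pattern, $callback, $subject);
    if ($result === null) {
        // NULL means PCRE reported an error. With the /u modifier and a
        // malformed subject, the error code is PREG_BAD_UTF8_ERROR.
        if (preg_last_error() === PREG_BAD_UTF8_ERROR) {
            error_log('preg_replace_callback(): subject is not valid UTF-8');
        }
        return $subject; // keep the original text instead of blanking it
    }
    return $result;
}

// Illustrative callback: replace every match with a fixed marker.
$toc = function ($m) {
    return '<toc/>';
};

$good = "caf\xC3\xA9 __TOC__"; // valid UTF-8 (e-acute encoded as C3 A9)
$bad  = "caf\xE9 __TOC__";     // lone E9 byte: Latin-1, invalid as UTF-8

var_dump(safeReplaceCallback('/__TOC__/u', $toc, $good));
var_dump(safeReplaceCallback('/__TOC__/u', $toc, $bad));
```

Returning the original text on failure is only one possible policy; a caller might prefer to strip or re-encode the bad bytes and retry.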

I don't think memory is an issue: the text is only 2K, and we handle much larger articles without any problem. Besides, as I reported earlier, simply turning off UTF-8 matching in MagicWord stops it from happening.

Unfortunately, I haven't had time to get a dump of exactly what's in the database, but my guess is that there is an invalid UTF-8 sequence in there.
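A rough sketch of how such a dump could be narrowed down once the raw article text is in hand. findInvalidUtf8Lines() is a made-up name, and fetching the text from the database is left out. It uses the fact that an empty pattern with /u only matches a subject that PCRE accepts as valid UTF-8.

```php
<?php
// Hypothetical diagnostic (not MediaWiki code): report which lines of
// $text are not valid UTF-8, as 1-based line number => hex dump.
function findInvalidUtf8Lines($text)
{
    $bad = array();
    foreach (explode("\n", $text) as $i => $line) {
        // preg_match() returns 1 when the (empty) pattern matches; any
        // other return value here means PCRE rejected the subject.
        if (preg_match('//u', $line) !== 1) {
            $bad[$i + 1] = bin2hex($line);
        }
    }
    return $bad;
}

// Made-up sample: the second line contains a lone Latin-1 byte (E9).
print_r(findInvalidUtf8Lines("ok\ncaf\xE9\ncaf\xC3\xA9"));
```

Splitting on newlines is safe because the byte 0x0A never occurs inside a multi-byte UTF-8 sequence.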

My question is: where is the bug? The Unicode standard requires the detection and suppression of invalid sequences, but is it legit for preg to dump the whole text?

Also, is this an issue with how the upgrade from 1.6 to 1.9 worked, or is it still possible to enter invalid sequences into an article? It seems like we need either better input filtering or a database cleanup as part of the upgrade (or both).

The problem is highly annoying, and I'm worried about other side-effects of my workaround.  Feedback on that from people who understand MW internals would be most welcome.

I guess I should do more investigation, but I thought I would bounce it off the list first, in case there are known issues here. It'll have to wait in any case, as I'm going to be scrubbing out the bilge of my boat all weekend. Help with that would be most welcome too... ;-)

Ian

Ian Smith
Motorola | Good Technology Group
ismith at motorola.com
408-352-7467
4250 Burton Drive, Santa Clara, CA 95054
www.motorola.com/good


 -----Original Message-----
From: 	Brion Vibber [mailto:brion at pobox.com]
Sent:	Saturday, March 17, 2007 06:38 AM Pacific Standard Time
To:	MediaWiki announcements and site admin list
Subject:	Re: [Mediawiki-l] Possible PHP bug causes page blanking in 1.9.3

Ian Smith wrote:
> Think you may have replied to the wrong email here... ;-)

No, but if you're getting blank pages, you probably want to look at the
error messages instead of throwing them in /dev/null. :)

-- brion vibber (brion @ pobox.com / brion @ wikimedia.org)

_______________________________________________
MediaWiki-l mailing list
MediaWiki-l at lists.wikimedia.org
http://lists.wikimedia.org/mailman/listinfo/mediawiki-l
