It seems that one slow extension can bring MediaWiki to a halt. For example, if you define a <wait> tag that simply sleeps for 20 seconds, and you hit a page that contains it, no other MediaWiki pages can be served during those 20 seconds.
Other PHP pages on the same Apache server, however, work just fine during those 20 seconds, so I'd guess this is not an Apache or PHP configuration issue. Only MediaWiki pages are affected.
Although the <wait> tag is artificial, the situation is realistic. We have a parser tag that hits an external database, and when the connection is slow (for even ONE wiki page), no other wiki pages can be served.
This seems dangerous. What's happening, and what's the workaround? This is in 1.13.0. (And maybe it's my imagination, but the problem seemed less in 1.12.0.)
Here's my toy <wait> code:
<?php
# Wait for N seconds
$wgExtensionFunctions[] = 'wfWaitSetup';

function wfWaitSetup() {
    global $wgParser;
    $wgParser->setHook('wait', 'wfWait');
}

function wfWait($input) {
    global $wgParser;
    $wgParser->disableCache();
    sleep($input);
    return "Slept for $input seconds";
}
Thanks for any advice, DanB
I did a quick test with this on my local wiki; it looks like it may be session-related.
If I preview a page with <wait>20</wait>, then go load something up in another tab in the same browser, it sits there waiting on both tabs. (Confirmed with Firefox 3 and Safari 3 on Mac OS X.)
If on the other hand I go load things up in another browser, there's no delay there.
If I disable cookies (thus removing session affinity), then a second tab in the same browser has no slowdown.
And indeed, it appears that PHP session files are by default locked to prevent multiple simultaneous accesses.
-- brion
Brion Vibber writes:
I did a quick test with this on my local wiki; it looks like it may be session-related.
If I preview a page with <wait>20</wait>, then go load something up in another tab in the same browser, it sits there waiting on both tabs. (Confirmed with Firefox 3 and Safari 3 on Mac OS X.)
If on the other hand I go load things up in another browser, there's no delay there.
Interesting. I tried this on our site, and using multiple browsers/sessions made no difference. Using Firefox 2 from host1 (hitting the <wait> page) and lynx from host2 (hitting any other page), both browsers waited the full 20 seconds.
And indeed, it appears that PHP session files are by default locked to prevent multiple simultaneous accesses.
Is this a PHP issue or a MediaWiki issue, and is there a workaround?
Should this be filed as a "priority 1" bug? Many of our users are getting timeouts due to this problem, ever since we upgraded to 1.13 from 1.12.
Thanks, DanB
FWIW: http://ca3.php.net/manual/en/function.session-write-close.php "Session data is usually stored after your script terminated without the need to call *session_write_close()*, but as session data is *locked* to prevent concurrent writes *only one script may operate on a session* at any time. When using framesets together with sessions you will experience the frames loading *one by one due to this locking*. You can reduce the time needed to load all the frames by ending the session as soon as all changes to session variables are done."
~Daniel Friesen(Dantman, Nadir-Seen-Fire) of: -The Nadir-Point Group (http://nadir-point.com) --It's Wiki-Tools subgroup (http://wiki-tools.com) --The ElectronicMe project (http://electronic-me.org) --Games-G.P.S. (http://ggps.org) -And Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) --Animepedia (http://anime.wikia.com) --Narutopedia (http://naruto.wikia.com)
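For illustration only, here is a minimal sketch of the pattern that manual page describes: finish all session writes, call session_write_close(), and only then do the slow work. The function slowExternalLookup() and the 'last_visit' key are hypothetical names invented for this example, not anything from MediaWiki. Note that this only addresses blocking between requests that share one session; it would not explain or fix blocking between different users.

```php
<?php
// Sketch only: release PHP's per-session lock before doing slow work.
// slowExternalLookup() is a hypothetical stand-in for a slow external resource.

function slowExternalLookup($seconds) {
    sleep($seconds);   // simulate a slow external database or service
    return "Waited $seconds seconds";
}

session_start();                    // takes the lock on this session's data
$_SESSION['last_visit'] = time();   // finish all session writes first
session_write_close();              // save the data and release the lock

// Other requests carrying the same session cookie can now proceed
// while this request is still busy.
echo slowExternalLookup(1), "\n";
```

Whether it is safe to close the session early from inside a MediaWiki extension is an open question here, since later code in the same request may still expect to write session data.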
I've confirmed that one slow MediaWiki page (due to a slow extension) blocks the entire webserver from serving any other MediaWiki pages, not just those in a single session. This is on CentOS 5 Linux with PHP 5.1.6 and MediaWiki 1.13.0.
1. Hit a wiki article that uses my <wait> tag (previously described) that sleeps for 20 seconds.
2. Ask another user on another PC to hit any wiki page during those 20 seconds.

The wiki is unresponsive during those 20 seconds. As soon as the <wait> ends, other pages can be served.
Any idea why this would be?
Long-lived extensions are realistic, e.g., those that hit an external resource that is slow to respond.
DanB
Perhaps we should begin to look at this at the webserver level. Perhaps the webserver is blocking and can only handle one request at a time. Try testing this on a variety of webservers: Apache, lighttpd, nginx... I think Apache uses worker processes, and I know that nginx does, so it may be a good idea to test this with different numbers of worker processes.
~Daniel Friesen
I believe it's not a webserver problem. While MediaWiki is refusing to serve pages, I can hit other PHP pages on the same webserver with no delays. (See original email at http://lists.wikimedia.org/pipermail/mediawiki-l/2008-August/028302.html.)
DanB
-----Original Message----- From: Daniel Friesen Sent: Tuesday, September 02, 2008 1:17 PM To: mediawiki-l@lists.wikimedia.org Subject: Re: [Mediawiki-l] Extensions and threading - how to bring your wiki to a halt
Perhaps we should begin to look at this at a webserver level. Perhaps the webserver is blocking and can only handle one request at a time. Try testing this on a variety of webservers. Apache, lighttpd, nginx... I think Apache used worker processes, and I know that nginx does. So it may be a good idea to debug this on different numbers of worker processes.
~Daniel Friesen
Here's a summary of the problem and my findings. It's still unsolved. Any help appreciated!
Symptom: ALL requests to MW 1.13 time out while ANY ONE user is hitting a long-running page (e.g., one with a long-running parser tag).
Theory: PHP sessions are locked
- Brion reproduced this behavior only per-session, by hitting a long-running wiki page in one browser tab and a second page in a second tab. He did not experience it for two simultaneous, separate sessions.
- I have the problem even when two separate *users* hit two MediaWiki pages from two different PCs. A long-running page prevents wiki pages from being served to the second user. So I suspect my issue is different from Brion's.
- I wrote two trivial PHP scripts that simply call session_start(), one of which sleeps for 30 seconds, and got the same results Brion did: the scripts block each other only within the same session. The block-everyone behavior is limited to MediaWiki pages.
Therefore, the problem I'm experiencing (everyone blocked) is different from the "PHP session" behavior Brion reproduced.
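For readers who want to repeat the session test, the following is a minimal sketch of what such a script might look like. It is not the original test code: the file name session_test.php, the "sleep" query parameter, and the requestedSleep() helper are all assumptions made for this example, and the two scripts are folded into one file for brevity. Request it with ?sleep=30 in one tab, then with no parameter in a second tab or a second browser.

```php
<?php
// session_test.php (hypothetical name). Sketch of a session-lock test.

function requestedSleep(array $query) {
    // Clamp to 0..60 seconds so a stray parameter cannot hang the script for long.
    $n = isset($query['sleep']) ? (int)$query['sleep'] : 0;
    return max(0, min(60, $n));
}

$start = microtime(true);
session_start();                  // waits here if another request holds this session's lock
$waited = microtime(true) - $start;

$seconds = requestedSleep($_GET);
sleep($seconds);                  // keep the session lock held while "working"

printf("Waited %.1f s for the session lock, then slept %d s\n", $waited, $seconds);
```

With PHP's default file-based session handler, the second request should wait only when it carries the same session cookie as the first, which is the per-session behavior described above.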
Ruled out:
- It does not seem to be an Apache problem, because other (non-MediaWiki) PHP pages on the same webserver are not affected. They always get served immediately, even when a long-running MediaWiki page is running.
- It's not due to a third-party extension because I've reproduced this problem on a virgin MW 1.13 install.
- It's not platform-specific: it occurs for MW servers on both Windows 2003 Server and CentOS Linux, with different Apache and PHP configurations.
- It's not eAccelerator, which I disabled and the problem still happened.
Stack dump when the problem occurs (MW 1.13):
Fatal error: Maximum execution time of 30 seconds exceeded in includes\db\Database.php on line 579
Stack trace:
 1. {main}() index.php:0
 2. MediaWiki->checkInitialQueries() index.php:60
 3. Title::newMainPage() includes\Wiki.php:105
 4. wfMsgForContent() includes\Title.php:293
 5. wfMsgReal() includes\GlobalFunctions.php:380
 6. wfMsgGetKey() includes\GlobalFunctions.php:432
 7. StubObject->get() includes\GlobalFunctions.php:467
 8. StubObject->__call() includes\StubObject.php:0
 9. StubObject->_call() includes\StubObject.php:76
10. call_user_func_array() includes\StubObject.php:58
11. MessageCache->get() includes\StubObject.php:0
12. MessageCache->getMsgFromNamespace() includes\MessageCache.php:543
13. MessageCache->load() includes\MessageCache.php:606
14. MessageCache->saveToCaches() includes\MessageCache.php:250
15. BagOStuff->add() includes\MessageCache.php:419
16. SqlBagOStuff->set() includes\BagOStuff.php:100
17. SqlBagOStuff->delete() includes\BagOStuff.php:274
18. SqlBagOStuff->_query() includes\BagOStuff.php:288
19. MediaWikiBagOStuff->_doquery() includes\BagOStuff.php:319
20. Database->query() includes\BagOStuff.php:433
21. Database->doQuery() includes\db\Database.php:540
Any help appreciated, or advice on what to check next.
DanB
I just reproduced this behavior on a completely unrelated MediaWiki site running on a different platform (Ubuntu).
Brion, when you reproduced the problem and found it was session-based, is there any chance that caching might have thrown off your results? Here's what I just noticed:
1. Open Firefox and hit a long-running page (containing <wait>300</wait> or somesuch)
2. While Firefox is spinning, open IE and hit a different page on the same wiki, "Foo". Possibly it renders quickly, as you found.
3. But... now force-refresh IE (ctrl-F5) on that same wiki page ("Foo"), or add action=purge. I find the article does NOT render until Firefox stops spinning, meaning the <wait> tag is blocking the other session too.
4. Likewise, hit some non-article like Special:SpecialPages while Firefox is spinning. Again, the page won't render until the article in Firefox does.
Could you check this on your end? Thanks, DanB
Since I've reproduced this behavior on 4 wiki servers now, and it's not due to session locking as far as I can tell, I've filed it as:
https://bugzilla.wikimedia.org/show_bug.cgi?id=15460
DanB
The problem was a 1.13 bug and it's been fixed now, backported from trunk to 1.13. Any further discussion will take place in the ticket, https://bugzilla.wikimedia.org/show_bug.cgi?id=15460.
DanB
mediawiki-l@lists.wikimedia.org