(I have just accepted root on rationalwiki.org and am looking around in slight horror. I will be sending a few messages like this.)
Apache comes with KeepAlive on by default. I am unconvinced this is actually a good idea. I just switched it off and it appears to have no ill effects, and the server has 400MB more free memory. (Ubuntu 10.04 Linode with 4GB RAM. Six wikis, Lucene search being really fat.)
The only mention I can see on mediawiki.org is in http://www.mediawiki.org/wiki/Manual:Newcomers_guide_to_installing_on_Window... , where it's one of the defaults they say nothing about.
Apache connections are really pretty damn cheap these days. Is KeepAlive actually a good or bad thing for MediaWiki?
- d.
Besides the usual "it depends", most general advice on KeepAlive seems to be that, if it is kept on, KeepAliveTimeout should be small (1 or 2 seconds). Years ago, when I was running a MediaWiki on a single server, I had to turn KeepAlive off completely because I couldn't keep enough Apache connections open to satisfy the incoming requests (I would run out of memory from all the Apache sessions in keep-alive state).
I would guess that if you are short on memory, reducing KeepAliveTimeout to 1 or 2 seconds or turning KeepAlive off completely would be a safe bet, as the memory held by "alive" sessions would likely be better spent elsewhere. On the other hand, if you have a dedicated Apache server, typically with plenty of free sessions and memory, then leaving it on may have a small benefit.
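To put rough numbers on that trade-off, here is a minimal Python sketch using Little's law (average workers busy = connection rate * time each connection holds a worker). The function name and the traffic figures are made up for illustration, and the model assumes the worst case, where every client holds its worker for the full KeepAliveTimeout without sending a second request.

```python
def busy_workers(conns_per_sec: float, service_sec: float, keepalive_timeout_sec: float) -> float:
    """Average number of Apache workers tied up, by Little's law.

    Assumption: each connection occupies a worker for the request itself
    plus the whole keep-alive timeout (0 means KeepAlive is off).
    """
    return conns_per_sec * (service_sec + keepalive_timeout_sec)


if __name__ == "__main__":
    # Hypothetical load: 20 new connections per second, 0.2 s per request.
    # 15 s is assumed here as a stock-config timeout; 2 s is the "small" value.
    for timeout in (15.0, 2.0, 0.0):
        print(timeout, round(busy_workers(20.0, 0.2, timeout)))
```

With those made-up figures the worker count falls from roughly 300 at a 15 second timeout to a few dozen at 2 seconds, and to a handful with KeepAlive off.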
On 12 October 2012 18:22, David Gerard dgerard@gmail.com wrote:
> Apache connections are really pretty damn cheap these days. Is KeepAlive actually a good or bad thing for MediaWiki?
MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
On 13/10/12 00:22, David Gerard wrote:
> Apache connections are really pretty damn cheap these days. Is KeepAlive actually a good or bad thing for MediaWiki?
Opening connections is expensive when compared to keep-alive. Without it you need to open a new TCP connection (at least one extra round trip for the handshake) for each resource (images, CSS, scripts). You will see the most noticeable difference with a clean cache.
The Apache docs say:
"In some cases this has been shown to result in an almost 50% speedup in latency times for HTML documents with many images."
-- http://httpd.apache.org/docs/2.4/mod/core.html#KeepAlive
I'd keep it on, but with a small KeepAliveTimeout.
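A back-of-the-envelope version of that argument, as a Python sketch. The model is an assumption for illustration, not something taken from the Apache docs: one round trip for the TCP handshake, one per request/response, resources fetched one after another, and no server time or parallel connections. `fetch_time_ms` and the figures are hypothetical.

```python
def fetch_time_ms(resources: int, rtt_ms: float, keep_alive: bool) -> float:
    """Rough time to fetch `resources` objects one after another over HTTP.

    Assumed model: a TCP handshake costs one round trip and each
    request/response costs one more. With keep-alive the handshake is
    paid once; without it, once per resource.
    """
    handshakes = min(1, resources) if keep_alive else resources
    return (handshakes + resources) * rtt_ms


if __name__ == "__main__":
    # Hypothetical page: 20 resources, 100 ms round-trip time.
    for keep_alive in (False, True):
        print(keep_alive, fetch_time_ms(20, 100.0, keep_alive))
```

Under those assumptions, 20 resources at 100 ms round-trip time take 4000 ms without keep-alive and 2100 ms with it, which is in the same region as the "almost 50%" figure quoted above. Real browsers open several connections in parallel, so the actual gap is likely to be smaller.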
On 14 October 2012 16:34, Platonides Platonides@gmail.com wrote:
> On 13/10/12 00:22, David Gerard wrote:
>> Apache connections are really pretty damn cheap these days. Is KeepAlive actually a good or bad thing for MediaWiki?
> The apache docs say:
> "In some cases this has been shown to result in an almost 50% speedup in latency times for HTML documents with many images."
> -- http://httpd.apache.org/docs/2.4/mod/core.html#KeepAlive
> I'd keep it on, but with a small KeepAliveTimeout
Mmm. Say YMMV. In our case the constraint is memory (we're fine for CPU and bandwidth), and our next Reddit-dotting not knocking us over again. Here's a graph, guess when I switched off KeepAlive: http://i49.tinypic.com/2jczzht.jpg
- d.
On 15/10/12 06:58, David Gerard wrote:
> On 14 October 2012 16:34, Platonides Platonides@gmail.com wrote:
>> On 13/10/12 00:22, David Gerard wrote:
>>> Apache connections are really pretty damn cheap these days. Is KeepAlive actually a good or bad thing for MediaWiki?
>> The apache docs say:
>> "In some cases this has been shown to result in an almost 50% speedup in latency times for HTML documents with many images."
>> -- http://httpd.apache.org/docs/2.4/mod/core.html#KeepAlive
>> I'd keep it on, but with a small KeepAliveTimeout
> Mmm. Say YMMV. In our case the constraint is memory (we're fine for CPU and bandwidth), and our next Reddit-dotting not knocking us over again. Here's a graph, guess when I switched off KeepAlive: http://i49.tinypic.com/2jczzht.jpg
That's processes, not memory. I would think that the effect on memory would not be so large. Disabling keep-alive should be counted as a fairly desperate measure given the large impact on end user latencies.
I haven't ever had to run a high-traffic wiki apart from Wikipedia, but if one of my little VPS wikis got slashdotted, I think the first thing I would do is install Squid. Squid needs a tiny amount of memory per connection, maybe 1KB. You can let Squid keep the client connections open for a few minutes, but disable keepalive on Apache. That way you minimise the number of memory-guzzling Apache processes with minimal user impact. This is one of the reasons we use Squid at Wikimedia.
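As a rough illustration of why that split helps, a Python sketch comparing the memory needed just to hold idle client connections open. The 1 KB per Squid connection is the estimate given above; the 50 MB per Apache process is a purely hypothetical figure (measure your own), and the function name is made up.

```python
SQUID_BYTES_PER_CONNECTION = 1024                # "maybe 1KB", per the message above
APACHE_BYTES_PER_PROCESS = 50 * 1024 * 1024      # hypothetical; depends on PHP and extensions


def idle_connection_memory_mb(idle_connections: int, bytes_per_connection: int) -> float:
    """Memory in MB spent keeping `idle_connections` client connections open."""
    return idle_connections * bytes_per_connection / (1024 * 1024)


if __name__ == "__main__":
    idle = 200  # hypothetical number of clients sitting in keep-alive
    print("Apache:", idle_connection_memory_mb(idle, APACHE_BYTES_PER_PROCESS), "MB")
    print("Squid: ", idle_connection_memory_mb(idle, SQUID_BYTES_PER_CONNECTION), "MB")
```

The exact figures matter less than the ratio: with one process per open connection, Apache's cost is several orders of magnitude higher per idle client.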
I think it should be possible to run both Apache and Squid on a VPS with a few hundred MB of RAM. Just have Squid on port 80 and Apache on some private firewalled port. If you have any RAM left over, you can allocate it to Squid for caching, and unless your site is enormous, you can give it whatever disk cache it needs to serve the whole site without ever contacting the backend.
The MW file cache might be fast, but Squid is always going to be faster.
-- Tim Starling
On 15 October 2012 06:04, Tim Starling tstarling@wikimedia.org wrote:
> That's processes, not memory. I would think that the effect on memory would not be so large.
Memory taken appears to be processes * PHP memory_limit. Unless I'm ludicrously wrong about that.
> Disabling keep-alive should be counted as a fairly desperate measure given the large impact on end user latencies.
The RW box makes me think of a Bugs Bunny cartoon steam engine, flexing from side to side, belching clouds of black smoke and making ominous noises. I'm frantically running around with gaffer tape.
Platonides' suggestion to switch KeepAlive on with a short timeout sounds worth trying, and I might do that this evening.
> I haven't ever had to run a high-traffic wiki apart from Wikipedia, but if one of my little VPS wikis got slashdotted, I think the first thing I would do is install Squid.
Would Varnish achieve much the same? I'm slightly familiar with Varnish (we use it at work and it makes our much-less-stressed LAMP box quite happy with no KeepAlive).
> The MW file cache might be fast, but Squid is always going to be faster.
We're quite fond of our "This article has been viewed x times" at the bottom of the pages, though I'm quite aware that's an expensive affectation that we may well have outgrown. *sigh*
- d.
On 15/10/12 20:14, David Gerard wrote:
> On 15 October 2012 06:04, Tim Starling tstarling@wikimedia.org wrote:
>> That's processes, not memory. I would think that the effect on memory would not be so large.
> Memory taken appears to be processes * PHP memory_limit. Unless I'm ludicrously wrong about that.
Memory allocated from the PHP request pool (which is limited to memory_limit) is returned to the system with munmap() after each request finishes. PHP allocates only as much memory as the request requires. So your formula gives the maximum amount of memory the processes could possibly use; it will never actually be reached.
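Put differently, processes * memory_limit is a ceiling rather than an estimate. A small Python sketch of the difference, with made-up per-request figures; it only models the PHP request pool and ignores everything else an Apache process holds. The function name is hypothetical.

```python
from typing import Sequence, Tuple


def php_pool_usage_mb(request_peaks_mb: Sequence[float], memory_limit_mb: float) -> Tuple[float, float]:
    """Return (memory actually allocated, theoretical ceiling) in MB.

    request_peaks_mb holds one hypothetical figure per Apache process:
    what PHP allocated for the request that process is currently serving.
    """
    for peak in request_peaks_mb:
        if peak > memory_limit_mb:
            raise ValueError("a request cannot allocate more than memory_limit")
    in_use = sum(request_peaks_mb)
    ceiling = len(request_peaks_mb) * memory_limit_mb
    return in_use, ceiling


if __name__ == "__main__":
    # Four processes, memory_limit = 128M, made-up per-request allocations.
    print(php_pool_usage_mb([12.0, 30.0, 8.0, 45.0], memory_limit_mb=128.0))
```

The gap between the two numbers is the memory the formula counts but the request pool never actually takes.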
We have noticed that memory allocated by the DOM extension persists after the request terminates, but the amount of it is not limited by PHP's memory_limit. Maybe that's what you're seeing. We reduced MaxRequestsPerChild to 4 to mitigate the effect, and I wrote this:
https://gerrit.wikimedia.org/r/#/c/23923/
> Would Varnish achieve much the same? I'm slightly familiar with Varnish (we use it at work and it makes our much-less-stressed LAMP box quite happy with no KeepAlive).
Yes, I would expect so.
> We're quite fond of our "This article has been viewed x times" at the bottom of the pages, though I'm quite aware that's an expensive affectation that we may well have outgrown. *sigh*
You could collect page view statistics with UDP and then incorporate them into the page with ESI or JavaScript.
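A minimal sketch of the UDP half of that idea in Python (the ESI or JavaScript inclusion is not shown). Everything here is an assumption for illustration: the one-title-per-datagram wire format, the function names, and the page titles. In a real setup the sender would be called from the page-serving path and the collector would write its totals somewhere a script can read them.

```python
import socket
from collections import Counter
from typing import Tuple


def send_hit(collector_addr: Tuple[str, int], page_title: str) -> None:
    """Fire-and-forget one page view as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(page_title.encode("utf-8"), collector_addr)


def collect_hits(sock: socket.socket, max_datagrams: int, timeout_sec: float = 1.0) -> Counter:
    """Read up to `max_datagrams` datagrams and count views per page title."""
    counts: Counter = Counter()
    sock.settimeout(timeout_sec)
    for _ in range(max_datagrams):
        try:
            data, _sender = sock.recvfrom(4096)
        except socket.timeout:
            break  # UDP is lossy; stop waiting rather than block forever
        counts[data.decode("utf-8")] += 1
    return counts


if __name__ == "__main__":
    collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    collector.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    addr = collector.getsockname()
    for title in ("Main_Page", "Main_Page", "Some_Article"):
        send_hit(addr, title)
    print(collect_hits(collector, max_datagrams=3))
    collector.close()
```

Because the sender never waits for a reply, counting a view adds no database write to the request, and a lost datagram only costs one count.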
-- Tim Starling