To give some idea of what makes a difference, here are some of the things discovered over the last two weeks:
1. Storing the PHP files on NFS doubled the page load time, from around 180ms to around 360ms. So, that's no longer being done, and moving the files off NFS halved the page load time without any complicated programming.
2. Squid using the disk reached cache hit rates of 78% and still rising, compared to 60% without the disk, but...
3. Squid using synchronous I/O blocked on disk and would sometimes result in timeouts. So, disk was turned off for the last week.
4. The Apaches started slowing down at peak load times earlier this week, so more Squid investigation started, using asynchronous disk I/O this time. Given the past disk caching experience, this has the potential to cut the load the Apaches see by around 25-50% (they see 40% of the load without disk; disk cut that to 22%, a drop of about 45%).
5. There's a parameter in Squid which tells it when to ignore the disk, based on the number of file descriptors in use. If the limit is exceeded, Squid ignores the disk and either passes the request directly to the Apaches or skips saving the page to the disk. Tuning of this parameter is ongoing. When set correctly it should let Squid deliver all it can from combined disk and RAM, but only up to the point where it would start to block waiting for the disk.
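To make the numbers in items 2-5 concrete, here is a rough Python sketch of the arithmetic, not anything taken from the actual Squid setup. It assumes the share of requests reaching the Apaches is simply one minus the cache hit rate, and it models the disk limit as a plain threshold check. The function names and the `fd_limit` parameter are hypothetical and are not Squid's own names.

```python
def apache_share(hit_rate: float) -> float:
    """Fraction of requests that miss the cache and reach the Apaches."""
    return 1.0 - hit_rate


def load_reduction(hit_before: float, hit_after: float) -> float:
    """Relative drop in Apache requests when the hit rate improves."""
    before = apache_share(hit_before)
    after = apache_share(hit_after)
    return (before - after) / before


def should_bypass_disk(open_disk_fds: int, fd_limit: int) -> bool:
    """Hypothetical model of the limit: skip the disk cache once more
    than fd_limit disk file descriptors are in use."""
    return open_disk_fds > fd_limit


if __name__ == "__main__":
    # Hit rates quoted in the message: 60% without disk, 78% with disk.
    print(f"Apache share without disk: {apache_share(0.60):.0%}")
    print(f"Apache share with disk:    {apache_share(0.78):.0%}")
    print(f"Reduction in Apache load:  {load_reduction(0.60, 0.78):.0%}")
```

With the quoted hit rates this gives a reduction of about 45%, which sits inside the 25-50% range mentioned in item 4. The threshold function only illustrates the trade-off in item 5: set the limit too low and Squid skips the disk when it could have served from it, set it too high and Squid blocks waiting for the disk.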
So, there's no need to get too enthusiastic about tuning the code. The new server setup still isn't fully tuned... and it probably won't be before we get a nice fast database server, a second Squid and some more Apaches to spread the load around.
Bravo! Excellent analysis!
Keep in mind -- we have money in the bank, but there's probably about a two-week lead time on getting new servers if we include time for us to puzzle over the exact needs before making another order.
We don't want to prematurely throw hardware (money!) at any problems that are really software problems, but at the same time, if we feel a need for more stuff, we can get it.
Jason is working on the migration of Bomis to here, and as that goes forward, at some point whatever the Wikimedia Foundation owns in the San Diego colo will get transferred here. The older, slower machines can probably be used for little stuff. Or, if it's the judgment of the technical staff (that's you guys) that we're better off not using them, we can sell them, either to Bomis or on eBay. (With Bomis I want to be VERY careful not to raise any conflict of interest issues... the money should always flow *from* me *to* Wikipedia, not the other way around, or I'm sure some jerkoff will say something.)
--Jimbo
(Quoted message from user_Jamesday, reproduced in full at the top of this message.)
Wikitech-l mailing list Wikitech-l@Wikipedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l