On 14/03/11 12:37, Daniel Friesen wrote:
> Interesting. Which part specifically do you think actually caused the extreme load? Having to re-parse a large number of pages as people view them?
Yes, regular page views caused the load, after the cache was invalidated.
> Did the issue show up from invalidation pre-queue, or did the issue crop up after the jobs were run?
After the jobs were run. The edits were made over a week ago.
> Was this just isolated to the secure servers, i.e. it didn't really affect the whole cluster but was simply an issue because secure doesn't have as large a deployment as non-secure?
It affected the whole cluster. I thought I already said that. Secure is just a single server which acts as a proxy. It decrypts the requests and forwards them back to the main Apache cluster. There was no problem with secure; it was reporting errors appropriately.
For now I have added a sleep() to the code to limit invalidations to 100 pages per second per job runner process.
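To illustrate the idea of throttling with a sleep, here is a minimal sketch in Python. It is not the actual change made to the job runner; the function name `invalidate_throttled`, the `invalidate` callback, and the injectable `sleep` and `clock` parameters are all hypothetical, chosen only to show how a per-process limit of roughly 100 invalidations per second could be enforced.

```python
import time


def invalidate_throttled(page_ids, invalidate, max_per_second=100,
                         sleep=time.sleep, clock=time.monotonic):
    """Call invalidate(page_id) for each page, pausing so that no more than
    max_per_second calls are made in any one counting window.

    All names here are illustrative assumptions, not real job runner code.
    """
    count = 0
    window_start = clock()
    for page_id in page_ids:
        invalidate(page_id)
        count += 1
        if count >= max_per_second:
            # The quota for this window is used up: sleep out the remainder
            # of the second before starting a new window.
            elapsed = clock() - window_start
            if elapsed < 1.0:
                sleep(1.0 - elapsed)
            count = 0
            window_start = clock()
```

Because the limit is applied inside each process, the cluster-wide invalidation rate under this scheme would scale with the number of job runner processes running at once.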
-- Tim Starling