Watching Wikipedia climb to a very high level of load today
(loadav 10.39, 12.58, 13.45), it occurs to me that a "load shedding"
function would be useful, where requests may be bounced with an error
502: "Service Temporarily Overloaded". This would have the effect of
dropping load until the server returns to normal load levels, preventing
congestion collapse.
A human-readable text should be added in the user's own language, saying
something like:
"Wikipedia is experiencing very high load at the moment. We are taking
measures to control the load of the system. Please try your request
again in a few minutes, when load should be lower."
Well-behaved spiders should ignore any pages sent with 502 errors.
To prevent a sudden turn-on of error 502 for all users, with the
possibility of load oscillation, we could make the 502 errors
progressively more probable as the load increases beyond a certain
point. This could also be used to ensure that logged-on users and
"important" transactions such as page edits maintain a higher QoS during
these periods, until load reaches the point at which even they have to
be bounced.
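
As a rough sketch of the progressive scheme described above (not a tested
implementation), the bounce probability could ramp linearly between two
load thresholds, with a higher pair of thresholds for logged-on users and
edits. The threshold values, the class names "normal" and "priority", and
the function names here are all made up for illustration; real numbers
would have to come from measuring the server.

```python
import os
import random

# Hypothetical thresholds per request class:
# (load at which shedding starts, load at which every request is bounced).
THRESHOLDS = {
    "normal": (8.0, 16.0),
    "priority": (14.0, 20.0),  # e.g. logged-on users and page edits
}


def shed_probability(load, start, full):
    """Linear ramp: 0.0 at or below `start`, 1.0 at or above `full`."""
    if load <= start:
        return 0.0
    if load >= full:
        return 1.0
    return (load - start) / (full - start)


def should_shed(load, request_class="normal", rng=random.random):
    """Decide at random whether to bounce this request with the overload error."""
    start, full = THRESHOLDS[request_class]
    # rng() is in [0, 1), so probability 0.0 never sheds and 1.0 always does.
    return rng() < shed_probability(load, start, full)


def current_load():
    """1-minute load average as reported by the OS (Unix only)."""
    return os.getloadavg()[0]
```

With these example thresholds, a load of 12.0 would bounce about half of
ordinary requests and no priority requests. Because the probability rises
gradually rather than switching on at a single point, the load should be
less likely to oscillate.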
-- Neil