On 14/03/11 11:48, William Allen Simpson wrote:
Secure basically fell over for awhile, generated nothing but proxy
errors. I'm not sure that's what really happened. It may have been a
complete inability to actually send or receive data, resulting in a
timeout of some sort.

Take a look at the Ganglia graphs. Free memory gone. Big spike in
processes. Big drop in network activity!

It was because of the CPU overload on the entire apache cluster which
occurred at that time. Secure and every other frontend proxy would
have reported errors. Domas and I traced it back to job queue cache
invalidations from an edit to [[Template:Reflist]] on the English
Wikipedia.

Note that the free memory isn't gone. RRDtool has the very
unscientific practice of starting the vertical scale at something
other than zero. It rose because processes use memory, and as you
noted, the number of processes increased. This is because they were
queueing, waiting for the overloaded backend cluster to serve them.
-- Tim Starling