On 11-03-13 06:05 PM, Tim Starling wrote:
> On 14/03/11 11:48, William Allen Simpson wrote:
>> Secure basically fell over for a while, generating nothing but proxy errors. I'm not sure that's what really happened; it may have been a complete inability to send or receive data, resulting in a timeout of some sort.
>> Take a look at the Ganglia graphs: free memory gone, a big spike in processes, a big drop in network activity!
> It was because of the CPU overload on the entire Apache cluster which occurred at that time. Secure and every other frontend proxy would have reported errors. Domas and I traced it back to job queue cache invalidations from an edit to [[Template:Reflist]] on the English Wikipedia.
> Note that the free memory isn't gone. RRDtool has the very unscientific practice of starting the vertical scale at something other than zero. Memory usage rose because processes use memory, and, as you noted, the number of processes increased: they were queueing, waiting for the overloaded backend cluster to serve them.
> -- Tim Starling
Interesting. Which part specifically do you think actually caused the extreme load? Having to re-parse a large number of pages as people view them? Did the issue show up from the invalidations before they were queued, or only after the jobs were run? And was this isolated to the secure servers, i.e., it didn't really affect the whole cluster but was only a problem because secure doesn't have as large a deployment as non-secure?
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
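The fan-out Tim describes can be sketched with a minimal model: an edit to a heavily transcluded template enqueues invalidation jobs covering every page that transcludes it, and each invalidated page must then be re-parsed on its next view. The names here (`JobQueue`, `invalidate_backlinks`, the batch size) are illustrative assumptions, not MediaWiki's actual API:

```python
from collections import deque

# Hypothetical minimal model of template-edit cache invalidation fan-out.
# Not MediaWiki code: class and function names are made up for illustration.

class JobQueue:
    """A trivial FIFO job queue."""
    def __init__(self):
        self.jobs = deque()

    def push(self, job):
        self.jobs.append(job)

def invalidate_backlinks(queue, template, backlinks, batch_size=100):
    """Enqueue one invalidation job per batch of pages transcluding `template`.

    Returns the total number of pages invalidated."""
    pages = backlinks[template]
    for i in range(0, len(pages), batch_size):
        queue.push(("htmlCacheUpdate", pages[i:i + batch_size]))
    return len(pages)

# A template transcluded on 10,000 pages (Reflist is on far more on enwiki).
backlinks = {"Template:Reflist": [f"Article_{n}" for n in range(10_000)]}
queue = JobQueue()
n = invalidate_backlinks(queue, "Template:Reflist", backlinks)
# One edit fans out to 100 jobs covering 10,000 pages; every one of those
# pages is re-parsed on its next view, which is the parse-load spike at issue.
```

The point of the sketch is the multiplier: a single edit is cheap, but the invalidation work scales with the number of transcluding pages, so one edit to a template used on most articles can swamp the parse capacity of the whole cluster.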