2010/11/15 Daniel Friesen <lists(a)nadir-seen-fire.com>:
There was a thought about the job queue that
popped into my mind today.
From what I understand, for a wiki farm to use runJobs.php instead of
the in-request queue (which is less desirable on high-traffic sites),
the farm has to run runJobs.php periodically for each and every wiki on
the farm.
So, for example: if a wiki farm is hosting 10,000 wikis, and the host
really wants to ensure that the queue is run at least hourly to keep
the data on each wiki reasonably up to date, the farm essentially needs
to call runJobs.php 10,000 times an hour (i.e. once for each individual
wiki), regardless of whether a wiki has jobs or not. Either that, or
poll each database beforehand, which is itself 10,000 database calls an
hour on top of the runJobs executions, and still isn't all that
desirable.
Have you considered the fact that the WMF cluster is in this exact situation? ;)
However, we don't call runJobs.php for all wikis periodically.
Instead, we call nextJobDB.php which generates a list of wikis that
have pending jobs (by connecting to all of their DBs), caches it in
memcached (caching was broken until a few minutes ago, oops) and
outputs a random DB name. We then run runJobs.php on that random DB
name. This whole thing is in maintenance/jobs-loop.sh
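For anyone curious how that scheme hangs together, here is a minimal Python sketch of the idea (not the actual nextJobDB.php or jobs-loop.sh code): poll every wiki's DB for pending jobs only when a cached list has expired, then hand back one random busy wiki to run jobs on. The function names, the in-process cache standing in for memcached, and the `job_counts` dict standing in for per-wiki job tables are all illustrative assumptions.

```python
import random
import time

# Hypothetical stand-in for memcached: the expensive "poll every DB" result
# is cached for CACHE_TTL seconds so the per-wiki queries are amortized.
CACHE_TTL = 300  # seconds
_cache = {"dbs": None, "expires": 0.0}

def wikis_with_jobs(all_dbs, job_counts):
    """Return the list of wikis with pending jobs, refreshing the cache
    only when it has expired. In the real setup this step would be one
    query against each wiki's job table, not a dict lookup."""
    now = time.time()
    if _cache["dbs"] is None or now >= _cache["expires"]:
        _cache["dbs"] = [db for db in all_dbs if job_counts.get(db, 0) > 0]
        _cache["expires"] = now + CACHE_TTL
    return _cache["dbs"]

def next_job_db(all_dbs, job_counts):
    """Analogous to what nextJobDB.php outputs: one random DB name that
    has pending jobs, or None if no wiki has any."""
    busy = wikis_with_jobs(all_dbs, job_counts)
    return random.choice(busy) if busy else None
```

A driver loop (the jobs-loop.sh role) would then repeatedly call next_job_db() and run the job runner against whichever DB name comes back, sleeping briefly when it gets None.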
Roan Kattouw (Catrope)
Ok, then...
How many databases are in the cluster being served by nextJobDB?
How long does it take to connect to all the databases and figure out
which ones have pending jobs?
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [