[Mediawiki-l] possible revision comparison optimization

Tim Lau thelastguardian at hotmail.com
Mon Mar 2 17:18:46 UTC 2009


Hello,

Thanks for the response. I did some more testing, and it seems that diff3 is not the cause: the test server experiences the same slowdown whether or not $wgDiff3 is set to false. Is there an internal PHP routine that compares the two revisions when a hist/cur/undo link is clicked?

Thanks

Tim

> ------------------------------
> 
> Message: 6
> Date: Sun, 01 Mar 2009 12:58:52 +1100
> From: Tim Starling <tstarling at wikimedia.org>
> Subject: Re: [Mediawiki-l] possible revision comparison optimization with diff3?
> To: mediawiki-l at lists.wikimedia.org
> Message-ID: <gocq4u$tst$1 at ger.gmane.org>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> tlg wrote:
> > Hello, I run a semi-busy wiki, and I have been experiencing
> > difficulties with its CPU load lately, with load jumping as high as 140
> > at noon (not 1.4, not 14, but ~140). Obviously this brought the site to a
> > crawl. After investigation I have found the cause: multiple diff3
> > comparisons were being run at the same time.
> > 
> > Explaining the cause requires a little background. The wiki
> > I run deals with editing large text files. It is common to see
> > hundreds of KB of pure text on any given wiki page. Normally my servers
> > are able to handle the edit requests for these pages.
> > 
> > However, it seems that search bots and crawlers (from both search engines
> > and individual users) have been hitting my wiki pretty hard lately. Each of
> > these bots tries to copy all the pages, and this includes the revision
> > history of each of these 100 KB wiki text pages. Since each page could have
> > potentially hundreds of edits, for every single large text file, hundreds
> > of revision history diffs (from lighttpd/apache -> php5 -> diff3?) are
> > spawned.
> 
> diff3 is invoked in two cases: on page save when there is an edit
> conflict, and when someone clicks "undo". Neither is particularly
> vital to the operation of the wiki, so the first thing you should do
> is turn them both off, using
> 
> $wgDiff3 = false;
> 
> in LocalSettings.php. Then see if that fixes your load problems. If it
> does, then you were right about diff3 being the problem. Next you
> should look at your logs to find out where the edits or undo requests
> are coming from.
> 
> If the problem is undo requests from search engine crawlers, you could
> fix the problem by disabling anonymous edits. This will prevent the
> bots from accessing the undo link.
> 
> Please tell us what you find, because it's likely that you're not the
> only one having this problem.
> 
> -- Tim Starling
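
For reference, here is a minimal sketch of how the two suggestions in the quoted reply might look in LocalSettings.php. The $wgDiff3 line is taken directly from the reply. The $wgGroupPermissions line is an assumption about how "disabling anonymous edits" would be configured; the thread does not spell it out.

```php
// LocalSettings.php (fragment)

// From the quoted reply: turn off the diff3 call that is used when an
// edit conflict occurs on save and when someone clicks "undo".
$wgDiff3 = false;

// Assumption, not stated in the thread: deny the 'edit' right to the '*'
// group (all visitors, including those not logged in), so that anonymous
// clients such as crawlers cannot edit.
$wgGroupPermissions['*']['edit'] = false;
```

Note that the slowdown reported at the top of this message persisted with $wgDiff3 set to false, so these settings may not address the load problem on their own.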


