Re: [MediaWiki-l] Troubleshooting stuck Apache requests

18 Nov 2015


Another note if that doesn't happen to work for you: We discovered the
source of our issue by enabling profiling/debugging on the wiki (on a
non-public server/install so you can control the page loads and profiling
outputs). You can see pretty quickly what areas/code are taking the longest
and begin to dig down. Eventually I added custom profile sections to
further narrow down the issue to a single "open" call.
On 18 November 2015 at 15:57, Justin Lloyd jlloyd.wiki@gmail.com wrote:
Intriguing! I'll definitely investigate this and report back. Thanks! :)
Justin
On Wed, Nov 18, 2015 at 12:55 PM, Dave Humphrey dave@uesp.net wrote:
That actually sounds very close to an issue we had after upgrading to
1.22 earlier this year. Pages with a lot of images/thumbnails took a long
time to render (100s of images took over a minute). We eventually tracked
it down to having the default $wgTmpDirectory pointing to the
upload/images directory, which was on an NFS share. Each file creation (or
access?) on an NFS share takes a fixed 50ms, so you multiply that by
multiple accesses and you get the delay.
We fixed it by simply changing $wgTmpDirectory to point to a path on the
local fixed drive. Since your setup sounds similar to ours it may be worth
trying it out. If this is indeed your issue you can force a "slow" page
load by purging a page with a lot of images on it. Test it before and
after the change.
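
One way to check the per-file cost described above before touching the wiki is to time temp-file creation in each candidate directory. The following is only a rough sketch, not anything taken from MediaWiki itself: the function names are made up for illustration, and the "4 accesses per thumbnail" figure in the arithmetic is an assumption chosen to show how a fixed 50ms per operation adds up to about a minute.

```python
import os
import tempfile
import time


def mean_create_latency_ms(directory, samples=50):
    """Create, write, and delete `samples` temp files in `directory` and
    return the mean wall-clock time per file in milliseconds."""
    start = time.perf_counter()
    for _ in range(samples):
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            os.write(fd, b"x")
        finally:
            os.close(fd)
            os.unlink(path)
    return (time.perf_counter() - start) * 1000.0 / samples


def estimated_delay_s(file_ops, per_op_ms):
    """Back-of-envelope total delay if every file operation costs a fixed time."""
    return file_ops * per_op_ms / 1000.0


if __name__ == "__main__":
    # Only the system temp dir is measured here; add your NFS-backed
    # images directory to the list to compare the two.
    for label, directory in [("local", tempfile.gettempdir())]:
        print(label, round(mean_create_latency_ms(directory), 3), "ms per file")
    # Assumed numbers: 300 thumbnails, 4 file operations each, 50 ms per operation.
    print(estimated_delay_s(300 * 4, 50), "seconds")
```

Running it once against the NFS-backed directory and once against a local path should show whether the two differ by the order of magnitude described above.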
On 18 November 2015 at 15:42, Justin Lloyd jlloyd.wiki@gmail.com wrote:
My speculation is that it's image-heavy pages, not one specific PHP page.
This is for the Guild Wars 2 wikis, specifically the English wiki at
wiki.guildwars2.com. The Game Updates page used to be problematic, causing
a massive backlog when a game update or hotfix was released and people
hammered that page to see the list of changes. Our main editors changed
how the page works, primarily by breaking it up into subpages, the most
recent of which DPL integrates into the main page, but also by changing
the templates that were used for displaying trait and skill icons.
Further analysis of the Apache logs, after adding the %D field to the log
format, showed a lot of pages sometimes taking minutes to complete, which
ultimately results in 502s. The ones that appear to take the longest are
those with a lot of these thumbnail images, which is why I think it's
still a template issue, but it would be really nice to be able to back up
that hypothesis with actual data from process diagnostics, stack traces,
etc.
(I really miss DTrace on Solaris. I know it exists for Linux but I'm wary
of trying it, especially on production systems. Anyone here have
experience with it?)
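
For the %D analysis mentioned above, a small script can rank requests by how long they took. This is a sketch under stated assumptions, not the log format actually used on these servers: it assumes the standard combined format with %D (request time in microseconds) appended as the last field, and the two sample lines are invented purely to show the shape of the input. Lines that do not match are skipped.

```python
import re

# Assumed LogFormat: combined, with %D appended as the final field.
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "[^"]*" (?P<micros>\d+)$'
)

# Invented sample lines, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [18/Nov/2015:12:00:00 -0800] "GET /wiki/Game_updates HTTP/1.1" '
    '502 312 "-" "Mozilla/5.0" 93500000',
    '10.0.0.2 - - [18/Nov/2015:12:00:01 -0800] "GET /wiki/Main_Page HTTP/1.1" '
    '200 20480 "-" "Mozilla/5.0" 180000',
]


def slowest_requests(lines, top=10):
    """Return up to `top` (seconds, status, path) tuples, slowest first."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not in the assumed format
        seconds = int(match.group("micros")) / 1_000_000
        rows.append((seconds, int(match.group("status")), match.group("path")))
    rows.sort(reverse=True)
    return rows[:top]


if __name__ == "__main__":
    for seconds, status, path in slowest_requests(SAMPLE):
        print(f"{seconds:8.1f}s  {status}  {path}")
```

Grouping the slow entries by path would show whether the multi-minute requests really do cluster on the thumbnail-heavy pages.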
On Wed, Nov 18, 2015 at 12:25 PM, Dave Humphrey dave@uesp.net wrote:
My usual strategy is to check server-status and if I need more detail go
with debugging tools (gdb etc., see
http://serverfault.com/questions/487530/find-out-what-high-cpu-usage-apache-...
).
It seems you have done this, however, and I'm wondering why you haven't at
least been able to narrow down the issue? You should at least be able to
know which PHP file is locking up/crashing, or the rough area/cause.
Once you know roughly where it is you can add temporary PHP logging
commands in the code to help narrow down the issue further. If you also
know roughly where/how the lockups are you can try testing/replicating the
behavior to get a bit more control over it.
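
As a starting point for the server-status check mentioned above, mod_status also serves a machine-readable variant at server-status?auto, which is easier to log over time than the HTML page. The sketch below only parses text in that "Key: value" shape; the sample string is invented, and how the text is fetched (URL, access restrictions) depends on your Apache configuration and is left out.

```python
from collections import Counter

# Invented sample in the shape of mod_status "?auto" output.
SAMPLE_STATUS = "BusyWorkers: 3\nIdleWorkers: 2\nScoreboard: WW_K_...\n"


def parse_server_status(text):
    """Parse 'Key: value' lines from server-status?auto output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def scoreboard_counts(fields):
    """Count scoreboard slots by state character (e.g. 'W' = sending reply,
    '_' = waiting for connection, 'K' = keepalive, '.' = open slot)."""
    return Counter(fields.get("Scoreboard", ""))


if __name__ == "__main__":
    status = parse_server_status(SAMPLE_STATUS)
    print("busy:", status["BusyWorkers"], "idle:", status["IdleWorkers"])
    print(dict(scoreboard_counts(status)))
```

Sampling this every few seconds and watching the count of "W" slots climb would at least give a timestamp for when requests start piling up.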
On 18 November 2015 at 14:59, Justin Lloyd jlloyd.wiki@gmail.com wrote:
Hey everyone,
Yesterday I posted this to /r/mediawiki (https://redd.it/3t2apu) and
cross-posted to /r/apache as well, but unfortunately I've still not
received any feedback other than the one request here for clarification
and a couple of suggestions on reddit that I'd already covered in the
post.
It's possible no one has any suggestions for me regarding this issue (it
is a somewhat complex application stack that could be requiring
configuration and/or tuning in multiple places, for example), but given
how severe of a problem this is for my production sites, I wanted to bump
it once in hopes of possibly getting at least some pointers of things to
consider that I may not have already, especially with respect to
diagnostics I could perform on the live web servers beyond just
server-status and the collectd apache plugin (which is basically the same
thing), for example.
On Thu, Nov 12, 2015 at 8:02 AM, Justin Lloyd <jlloyd.wiki@gmail.com> wrote:
Marcin,
It's the biggest and most heavily trafficked of our wikis because it's the
English-language version of the wiki. We also have German, French, and
Spanish, but the English-speaking community is by far the largest and most
active. There are some tiny configuration differences between the wikis
(e.g. the value of $wgJobRunRate, the specific extensions loaded) but
nothing very significant, I don't believe.
I should also add that all four of these wikis (we have a 5th, for 7
total, not 6 as I'd originally said) also use Semantic MediaWiki
extensively. I believe the other three wikis would run into the same
problem if they had the same amount of traffic as the English one.
However, since they all are vhosts within the same Apache instances, the
English one's problems affect all of them.
Justin
On Thu, Nov 12, 2015 at 1:42 AM, Marcin Cieslak <saper@saper.info> wrote:
> On 2015-11-12, Justin Lloyd jlloyd.wiki@gmail.com wrote:
> > * Six wikis are configured as Vhosts in Apache, load balanced by a
> > separate set of front-end servers, where two of the wikis are for
> > private internal use and the other four are public, though the
> > traffic to one of the public wikis dwarfs the rest and it's the wiki
> > giving me problems.
>
> (...)
>
> > I'm mainly looking right now for how to troubleshoot the stuck
> > processes, but any advice regarding this architecture is also
> > welcome, as I feel it could use some improvement but I'm not sure how
> > just yet.
>
> The question that immediately comes to my mind before I start digging
> any further - how is the wiki making problems special? Is it just
> getting most of the traffic (it is the "most interesting" one) or is
> its configuration slightly different?
>
> Marcin Cieślak
> https://www.mediawiki.org/wiki/User:Saper
>
>
> _______________________________________________
> MediaWiki-l mailing list
> To unsubscribe, go to:
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
>

-- 
Dave Humphrey -- dave@uesp.net
Founder/Server Admin of the Unofficial Elder Scrolls Pages -- www.uesp.net
www.viud.net - Building the world's toughest USB drive
