Don't enable debug/profiling on a server that gets public hits, at least in
general. Profiling slows down page loads, and if you're already borderline
it will make a bad situation worse. You'll also generate so much profiling
output that it will be hard to go through it and get anything meaningful
out of it.

Put it on an unused server, or copy the wiki installation to a new folder
and do it from there. Then you can control the page loads being profiled.
On 19 November 2015 at 12:52, Justin Lloyd <jlloyd.wiki(a)gmail.com> wrote:
So I confirmed that $wgTmpDirectory is defaulting to /tmp on my systems, so
that's not the problem. As for profiling, I enabled it on one of the four
web servers and that appears to actually trigger the problem, or a very
similar one. The Apache processes quickly climbed to their MaxClients limit
of 100 and just stayed there, forcing me to restart Apache after first
commenting out the profiling settings in LocalSettings.php, where
$wgProfileLimit was set to 2.
On Wed, Nov 18, 2015 at 1:00 PM, Dave Humphrey <dave(a)uesp.net> wrote:
Another note if that doesn't happen to work for you: We discovered the
source of our issue by enabling profiling/debugging on the wiki (on a
non-public server/install so you can control the page loads and profiling
outputs). You can see pretty quickly what areas/code are taking the longest
and begin to dig down. Eventually I added custom profile sections to
further narrow down the issue to a single "open" call.
On 18 November 2015 at 15:57, Justin Lloyd <jlloyd.wiki(a)gmail.com> wrote:
> Intriguing! I'll definitely investigate this and report back. Thanks! :)
>
> Justin
On Wed, Nov 18, 2015 at 12:55 PM, Dave Humphrey <dave(a)uesp.net> wrote:
> > That actually sounds very close to an issue we had after upgrading to 1.22
> > earlier this year. Pages with a lot of images/thumbnails took a long time
> > to render (100s of images took over a minute). We eventually tracked it
> > down to having the default $wgTmpDirectory pointing to the upload/images
> > directory, which was on an NFS share. Each file creation (or access?) on
> > an NFS share takes a fixed 50ms, so you multiply that by multiple accesses
> > and you get the delay.
> >
> > We fixed it by simply changing $wgTmpDirectory to point to a path on the
> > local fixed drive. Since your setup sounds similar to ours it may be worth
> > trying it out. If this is indeed your issue you can force a "slow" page
> > load by purging a page with a lot of images on it. Test it before and
> > after the change.
> >
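
One rough way to check whether a temp directory has this kind of per-file
overhead is to time a batch of file creations in it. The following is only a
sketch in Python, not part of MediaWiki: the function name time_temp_files is
made up for illustration, and it measures create-and-delete only, which may
not match exactly what MediaWiki does with $wgTmpDirectory.

```python
import os
import tempfile
import time


def time_temp_files(directory, count=100):
    """Create and delete `count` temp files in `directory`.

    Returns the mean wall-clock time per file, in seconds.
    """
    start = time.perf_counter()
    for _ in range(count):
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        os.remove(path)
    return (time.perf_counter() - start) / count


if __name__ == "__main__":
    # tempfile.gettempdir() is only a stand-in; pass the directory that
    # $wgTmpDirectory resolves to, then a directory on a local disk.
    directory = tempfile.gettempdir()
    per_file = time_temp_files(directory)
    print(f"{directory}: {per_file * 1000:.2f} ms per file")
```

Running it once against the NFS-backed directory and once against a local
path should make a large fixed per-file cost stand out, if there is one.
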
> > On 18 November 2015 at 15:42, Justin Lloyd <jlloyd.wiki(a)gmail.com> wrote:
> >
> > > My speculation is that it's image-heavy pages, not one specific PHP
> > > page. This is for the Guild Wars 2 wikis, specifically the English wiki
> > > at wiki.guildwars2.com. The Game Updates page used to be problematic,
> > > causing a massive backlog when a game update or hotfix was released and
> > > people hammered that page to see the list of changes. Our main editors
> > > changed how the page works, primarily breaking it up into subpages, the
> > > most recent of which DPL integrates into the main page, but also
> > > changing the templates that were used for displaying trait and skill
> > > icons.
> > >
> > > Further analysis of the Apache logs, after adding the %D field to the log
> > > format, showed a lot of pages sometimes taking minutes to complete, which
> > > ultimately results in 502s. The ones that appear to take the longest are
> > > those with a lot of these thumbnail images, which is why I think it's
> > > still a template issue, but it would be really nice to be able to back up
> > > that hypothesis with actual data from process diagnostics, stack traces,
> > > etc.
> > >
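
For reference, %D in an Apache LogFormat records the time taken to serve the
request in microseconds, so the slow pages can be pulled out of the access
log with a small script. This is only a sketch in Python: it assumes Common
Log Format with %D appended as the last field, which may not match the actual
LogFormat in use here, and the function name slowest_paths is made up for
illustration.

```python
import re
from collections import defaultdict

# Assumed log format (an assumption, adjust the regex to the real one):
#   LogFormat "%h %l %u %t \"%r\" %>s %b %D"
LINE_RE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ (?P<usec>\d+)$'
)


def slowest_paths(lines, top=10):
    """Return (path, worst_seconds, hits) tuples, slowest path first."""
    worst = defaultdict(float)
    hits = defaultdict(int)
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed format
        seconds = int(match.group("usec")) / 1_000_000  # %D is microseconds
        path = match.group("path")
        hits[path] += 1
        worst[path] = max(worst[path], seconds)
    ranked = sorted(worst.items(), key=lambda item: item[1], reverse=True)
    return [(path, seconds, hits[path]) for path, seconds in ranked[:top]]
```

It takes any iterable of log lines, for example an open file handle on the
access log, and ranks request paths by their worst observed response time.
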
> > > (I really miss DTrace on Solaris. I know it exists for Linux but I'm wary
> > > of trying it, especially on production systems. Anyone here have
> > > experience with it?)
> >
> >
> > On Wed, Nov 18, 2015 at 12:25 PM, Dave Humphrey <dave(a)uesp.net> wrote:
> >
> > > My usual strategy is to check server-status and, if I need more detail,
> > > go with debugging tools (gdb etc., see
> > > http://serverfault.com/questions/487530/find-out-what-high-cpu-usage-apache…
> > > ). It seems you have done this, however, and I'm wondering why you
> > > haven't at least been able to narrow down the issue? You should at least
> > > be able to know which PHP file is locking up/crashing, or the rough
> > > area/cause?
> > >
> > > Once you know roughly where it is you can add temporary PHP logging
> > > commands in the code to help narrow down the issue further. If you also
> > > know roughly where/how the lockups are you can try testing/replicating
> > > the behavior to get a bit more control on it.
> > > >
> > > > On 18 November 2015 at 14:59, Justin Lloyd <jlloyd.wiki(a)gmail.com> wrote:
> > > >
> > > > > Hey everyone,
> > > > >
> > > > > Yesterday I posted this to /r/mediawiki (https://redd.it/3t2apu) and
> > > > > cross-posted to /r/apache as well, but unfortunately I've still not
> > > > > received any feedback other than the one request here for
> > > > > clarification and a couple of suggestions on reddit that I'd already
> > > > > covered in the post.
> > > > >
> > > > > It's possible no one has any suggestions for me regarding this issue
> > > > > (it is a somewhat complex application stack that could be requiring
> > > > > configuration and/or tuning in multiple places, for example), but
> > > > > given how severe of a problem this is for my production sites, I
> > > > > wanted to bump it once in hopes of possibly getting at least some
> > > > > pointers of things to consider that I may not have already,
> > > > > especially with respect to diagnostics I could perform on the live
> > > > > web servers beyond just server-status and the collectd apache plugin
> > > > > (which is basically the same thing), for example.
> > >
> > >
> > >
> > > > On Thu, Nov 12, 2015 at 8:02 AM, Justin Lloyd <jlloyd.wiki(a)gmail.com> wrote:
> > > >
> > > > > Marcin,
> > > > >
> > > > > It's the biggest and most heavily trafficked of our wikis because it's
> > > > > the English-language version of the wiki. We also have German,
> > > > > French, and Spanish, but the English-speaking community is by far the
> > > > > largest and most active. There are some tiny configuration
> > > > > differences between the wikis (e.g. the value of $wgJobRunRate, the
> > > > > specific extensions loaded) but nothing very significant, I don't
> > > > > believe.
> > > > >
> > > > > I should also add that all four of these wikis (we have a 5th, for 7
> > > > > total, not 6 as I'd originally said) also use Semantic MediaWiki
> > > > > extensively. I believe the other three wikis would run into the same
> > > > > problem if they had the same amount of traffic as the English one.
> > > > > However, since they all are vhosts within the same Apache instances,
> > > > > the English one's problems affect all of them.
> > > > >
> > > > > Justin
> > > >
> > > >
> > > > > On Thu, Nov 12, 2015 at 1:42 AM, Marcin Cieslak <saper(a)saper.info> wrote:
> > > > >
> > > > >> On 2015-11-12, Justin Lloyd <jlloyd.wiki(a)gmail.com> wrote:
> > > > >> > * Six wikis are configured as Vhosts in Apache, load balanced by a
> > > > >> > separate set of front-end servers, where two of the wikis are for
> > > > >> > private internal use and the other four are public, though the
> > > > >> > traffic to one of the public wikis dwarfs the rest and it's the
> > > > >> > wiki giving me problems.
> > > > >>
> > > > >> (...)
> > > > >>
> > > > >> > I'm mainly looking right now for how to troubleshoot the stuck
> > > > >> > processes, but any advice regarding this architecture is also
> > > > >> > welcome, as I feel it could use some improvement but I'm not sure
> > > > >> > how just yet.
> > > > >>
> > > > >> The question that immediately comes to my mind before I start digging
> > > > >> any further - how is the wiki making problems special? Is it just
> > > > >> getting most of the traffic (it is the "most interesting" one) or is
> > > > >> its configuration slightly different?
> > > > >>
> > > > >> Marcin Cieślak
> > > > >> https://www.mediawiki.org/wiki/User:Saper
> > > >>
> > > >>
> > > >> _______________________________________________
> > > >> MediaWiki-l mailing list
> > > > >> To unsubscribe, go to:
> > > > >> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
> > > >>
> > > >
> > > >
> > > _______________________________________________
> > > MediaWiki-l mailing list
> > > To unsubscribe, go to:
> > >
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
> > >
> >
> >
> >
> > --
> > Dave Humphrey -- dave(a)uesp.net
> > Founder/Server Admin of the Unofficial Elder Scrolls Pages -- www.uesp.net
> > www.viud.net - Building the world's toughest USB drive