[Wikimedia-l] Fwd: [Wikitech-l] #switch limits

Steven Walling steven.walling at gmail.com
Fri Sep 21 04:14:59 UTC 2012


Template authors on any and every wiki, this one's for you. ;)

---------- Forwarded message ----------
From: Tim Starling <tstarling at wikimedia.org>
Date: Thu, Sep 20, 2012 at 9:07 PM
Subject: [Wikitech-l] #switch limits
To: wikitech-l at lists.wikimedia.org


Over the last week, we have noticed very heavy Apache memory usage on
the main Wikimedia cluster. In some cases, high memory usage resulted
in heavy swapping and site-wide performance issues.

After some analysis, we've identified the main cause of this high
memory usage to be geographical data ("données") templates on the
French Wikipedia, and to a lesser extent, the same data templates
copied to other wikis for use on articles about places in Europe.

Here is an example of a problematic template:

<https://fr.wikipedia.org/w/index.php?title=Mod%C3%A8le:Donn%C3%A9es_PyrF1-2009&action=edit>

That template alone uses 47MB for 37,000 #switch cases, and one article
used about 15 similarly sized templates.
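
For a sense of scale, the figures above can be turned into a rough
per-case cost. This is a back-of-the-envelope sketch only: it assumes
1MB = 1,000,000 bytes and that each of the 15 templates costs about the
same 47MB, neither of which the message states exactly. The variable
names are made up for illustration.

```python
# Figures quoted in the message.
TEMPLATE_MEMORY_MB = 47        # memory used by the example template
SWITCH_CASES = 37_000          # number of #switch cases in that template
TEMPLATES_PER_ARTICLE = 15     # "about 15 similarly sized templates"

# Assumption: 1MB is taken as 1,000,000 bytes for this estimate.
bytes_per_case = TEMPLATE_MEMORY_MB * 1_000_000 / SWITCH_CASES

# Assumption: every template on the article costs about the same 47MB.
article_memory_mb = TEMPLATE_MEMORY_MB * TEMPLATES_PER_ARTICLE

print(f"roughly {bytes_per_case:.0f} bytes per #switch case")
print(f"roughly {article_memory_mb}MB for one article")
```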

The simplest solution to this problem is for the few Wikipedians
involved to stop doing what they are doing, and to remove the template
invocations which have already been introduced. Antoine Musso has
raised the issue on the French Wikipedia's "Bistro" and some of the
worst cases have already been fixed.

To protect site stability, I've introduced a new preprocessor
complexity limit called the "preprocessor generated node count", which
is incremented by about 6 for each #switch case. When the limit is
exceeded, an exception is thrown, preventing the page from being saved
or viewed.
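
The idea of such a limit can be sketched in a few lines of Python. This
is not the MediaWiki implementation (which is written in PHP); the
class and function names are hypothetical, and the cost of exactly 6
nodes per case is an assumption standing in for "about 6".

```python
class NodeLimitExceeded(Exception):
    """Raised when the hypothetical generated node count passes its limit."""


class NodeCounter:
    """Tracks how many nodes have been generated while expanding a page."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.count = 0

    def add(self, nodes: int) -> None:
        # Increment first, then check, so the error reports the real total.
        self.count += nodes
        if self.count > self.limit:
            raise NodeLimitExceeded(
                f"{self.count} generated nodes exceeds the limit of {self.limit}"
            )


def expand_switch(counter: NodeCounter, num_cases: int, nodes_per_case: int = 6) -> None:
    """Charge the counter for every case of a #switch (assumed 6 nodes each)."""
    for _ in range(num_cases):
        counter.add(nodes_per_case)
```

A caller would catch NodeLimitExceeded and refuse to save or render the
page, which matches the behaviour described above.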

The limit is currently 4 million (~667,000 #switch cases), and it will
soon be reduced to 1.5 million (~250,000 #switch cases). That's a
compromise which allows most of the existing geographical pages to
keep working, but still permits memory usage of about 230MB.
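
The case counts in parentheses follow from dividing each limit by the
per-case cost. A quick check, assuming the cost is exactly 6 nodes per
case (the message only says "about 6"); the function name is made up
for this example:

```python
# Assumption: "about 6" nodes per #switch case is treated as exactly 6.
NODES_PER_SWITCH_CASE = 6


def max_switch_cases(node_limit: int, nodes_per_case: int = NODES_PER_SWITCH_CASE) -> int:
    """Return roughly how many #switch cases fit under a generated node limit."""
    return node_limit // nodes_per_case


current_limit = 4_000_000   # limit at the time of the message
planned_limit = 1_500_000   # planned reduced limit

print(max_switch_cases(current_limit))   # about 667,000 when rounded
print(max_switch_cases(planned_limit))
```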

At some point, we would like to patch PHP upstream to cause memory for
DOM XML trees to be allocated from the PHP request pool, instead of
with malloc(). But to deploy that, we would need to reduce the limit
to the point where the template DOM cache can easily fit in the PHP
memory limit of 128MB.

In the short term, we will be working with the template editors to
ensure that all articles can be viewed with a limit of 1.5 million.
That's not a very viable solution in the long term, so I'd also like
to introduce save-time warnings and tracking categories for pages
which use more than, say, 50% of the limit, to encourage authors to
fix articles without being directly prompted by WMF staff members.
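
The proposed save-time check could look something like the following.
This is only a sketch of the idea: the function name, the default
threshold argument and the return value are assumptions, and nothing
here reflects how the warning would actually be implemented in
MediaWiki.

```python
def should_warn(node_count: int, node_limit: int, threshold: float = 0.5) -> bool:
    """Return True when a page uses more than `threshold` of the node limit.

    The 0.5 default mirrors the "say, 50%" figure suggested in the message.
    """
    return node_count > node_limit * threshold


limit = 1_500_000

# A page well under half the limit saves silently.
print(should_warn(300_000, limit))

# A page over half the limit would get a warning and a tracking category.
print(should_warn(900_000, limit))
```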

At some point in the future, you may be able to put this kind of
geographical data in Wikidata. Template authors, please wait
patiently, and don't implement your own version of Wikidata using
wikitext templates.

-- Tim Starling




