On Fri, Apr 8, 2011 at 2:11 PM, Alex Brollo <alex.brollo(a)gmail.com> wrote:
> I'd like to know something more about template parsing/caching for
> performance issues.
> My question is: when a template is called, its wikicode, I suppose, is
> parsed and translated into "something running" - I can't imagine what
> precisely, but I don't care so much about that (so far :-) ). If a second
> call comes to the server for the same template, but with different
> parameters, is the template parsed again from scratch, or is something
> from the previous parsing used again, saving a little bit of server load?
Currently there's not really a solid intermediate parse structure in
MediaWiki (something we hope to change; I'll shortly be ramping up some
documentation for the soon-to-begin mega parser redo project).
Approximately speaking... In the current system, the page is preprocessed
into a partial preprocessor tree which identifies certain structure
boundaries (for templates and function & tag-hook extensions); templates and
some hooks get expanded in, then it's all basically flattened back to
wikitext. Then the main parser takes over, turning the whole wikitext
document into HTML output.
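The two-pass flow described above can be sketched as a toy in Python. This is purely illustrative (MediaWiki's real preprocessor is far more involved, and the template store and node shapes here are invented): preprocess the wikitext into a small tree marking template boundaries, expand the templates, flatten back to wikitext, and hand the result to the main parser.

```python
import re

TEMPLATES = {"Greet": "Hello, {{{1}}}!"}  # hypothetical template store

def preprocess(wikitext):
    """Split wikitext into literal chunks and {{template|arg}} nodes."""
    parts = re.split(r"(\{\{[^{}]*\}\})", wikitext)
    tree = []
    for part in parts:
        if part.startswith("{{") and part.endswith("}}"):
            name, *args = part[2:-2].split("|")
            tree.append(("template", name, args))
        elif part:
            tree.append(("text", part))
    return tree

def expand(tree):
    """Expand template nodes and flatten everything back to wikitext."""
    out = []
    for node in tree:
        if node[0] == "template":
            _, name, args = node
            body = TEMPLATES[name]
            for i, arg in enumerate(args, 1):
                body = body.replace("{{{%d}}}" % i, arg)
            out.append(body)
        else:
            out.append(node[1])
    return "".join(out)

flattened = expand(preprocess("Start {{Greet|world}} end"))
# flattened is plain wikitext again; the main parser would then
# turn this flattened document into HTML.
```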
I believe we do locally (in-process) cache the preprocessor structure for
pages and templates, so multiple use of the same template won't incur as
much preprocessor work. But the preprocessor parsing is usually one of the
fastest parts of the whole parse.
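A minimal sketch of that caching behaviour, assuming (as described above) that the preprocessor tree is keyed on the template's source and parameter substitution happens afterwards on each use - all function names here are hypothetical:

```python
from functools import lru_cache

CALLS = {"preprocess": 0}  # instrumentation to show the cache working

@lru_cache(maxsize=128)
def preprocess_template(source):
    """Build (and cache) the preprocessor tree for one template's source."""
    CALLS["preprocess"] += 1
    return ("tree", source)  # stand-in for the real tree structure

def expand(source, params):
    """Each expansion reuses the cached tree, substituting new params."""
    preprocess_template(source)  # cache hit after the first call
    result = source
    for i, p in enumerate(params, 1):
        result = result.replace("{{{%d}}}" % i, p)
    return result

expand("Hello, {{{1}}}!", ["Alice"])
expand("Hello, {{{1}}}!", ["Bob"])  # same template, different parameters
# CALLS["preprocess"] is 1: the tree was built once and reused.
```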
> If the reply is "yes", i.e. if the "running code" of the whole template is
> somehow saved and cached, ready to be used again with new parameters,
> perhaps it could be a good idea to build templates as "libraries of
> different templates", using the name of the template as a "library name"
> and a parameter as the name of a "specific function"; a simple #switch
> could be used to select the appropriate code for that "specific function".
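The "library template" idea above - a call shaped like {{MyLib|funcname|args}}, with a #switch picking one branch - has a direct analogue in ordinary code. A toy Python sketch (all names hypothetical):

```python
# One bundle ("MyLib") dispatching to a specific function by name,
# the way a #switch on the first template parameter would.

def _greet(name):
    return "Hello, %s!" % name

def _shout(text):
    return text.upper() + "!"

MYLIB = {"greet": _greet, "shout": _shout}  # the "#switch" branches

def call_library(func, *args):
    """Equivalent of {{MyLib|func|args...}}: pick one branch, run it."""
    return MYLIB[func](*args)

call_library("greet", "world")  # selects only the "greet" branch
```

Note that the whole dictionary (the whole library template) still has to be loaded to use any one branch, which is the cost discussed below.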
I think for the most part, it'll be preferable to only have to work with the
functions that are needed, rather than fetching a large number of unneeded
functions at once. Even if it's pre-parsed, loading unneeded stuff means
more CPU used, more memory used, more network bandwidth used.
But being able to bundle together related things as a unit that can be
distributed together would be very nice, and should be considered for future
work on new templating and gadget systems.
-- brion