On Fri, Apr 8, 2011 at 2:11 PM, Alex Brollo alex.brollo@gmail.com wrote:
I'd like to know something more about template parsing/caching for performance issues.
My question is: when a template is called, its wikicode, I suppose, is parsed and translated into "something running" - I can't imagine what precisely, but I don't care so much about that (so far :-) ). If a second call for the same template comes to the server, but with different parameters, is the template parsed again from scratch, or is something from the previous parse reused, saving a little server load?
Currently there's not really a solid intermediate parse structure in MediaWiki (something we hope to change; I'll be ramping up some documentation for the soon-to-begin mega parser redo project soon).
Approximately speaking... In the current system, the page is preprocessed into a partial preprocessor tree which identifies certain structure boundaries (for templates and function & tag-hook extensions); templates and some hooks get expanded in, then it's all basically flattened back to wikitext. Then the main parser takes over, turning the whole wikitext document into HTML output.
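To make that pipeline concrete, here's a toy sketch in Python (purely illustrative - the names, the regex-based "preprocessor", and the template store are all my invention, not MediaWiki's actual code, which nests nodes in a real tree): preprocess into nodes, expand template calls, flatten back to wikitext, then hand the flat text to a "main parser" that emits HTML.

```python
import re

# Hypothetical in-wiki template store: Template:greet takes one parameter.
TEMPLATES = {"greet": "Hello, $1!"}

def preprocess(wikitext):
    """Split wikitext into a flat list of text and template-call nodes.
    (A stand-in for MediaWiki's preprocessor tree; real nodes nest.)"""
    nodes = []
    pos = 0
    for m in re.finditer(r"\{\{([^{}|]+)((?:\|[^{}]*)*)\}\}", wikitext):
        if m.start() > pos:
            nodes.append(("text", wikitext[pos:m.start()]))
        name = m.group(1).strip()
        params = m.group(2).split("|")[1:]  # drop empty piece before first |
        nodes.append(("template", name, params))
        pos = m.end()
    if pos < len(wikitext):
        nodes.append(("text", wikitext[pos:]))
    return nodes

def expand(nodes):
    """Expand template nodes with their parameters, flatten to wikitext."""
    out = []
    for node in nodes:
        if node[0] == "text":
            out.append(node[1])
        else:
            _, name, params = node
            body = TEMPLATES.get(name, "")
            for i, p in enumerate(params, 1):
                body = body.replace("$%d" % i, p)
            out.append(body)
    return "".join(out)

def render(flat_wikitext):
    """Toy 'main parser': turn bold markup into HTML."""
    return re.sub(r"'''(.+?)'''", r"<b>\1</b>", flat_wikitext)

page = "'''Hi''' {{greet|world}}"
html = render(expand(preprocess(page)))
```

The key structural point mirrors the description above: template expansion happens in an earlier pass that flattens back to wikitext, and the HTML-producing parser never sees template syntax at all.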
I believe we do locally (in-process) cache the preprocessor structure for pages and templates, so multiple uses of the same template won't incur as much preprocessor work. But preprocessor parsing is usually one of the fastest parts of the whole parse anyway.
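That in-process cache amounts to little more than a map from template title to its parsed structure. A minimal standalone sketch (hypothetical names throughout; a counter stands in for the expensive preprocessor pass so the cache hit is visible):

```python
parse_count = 0

def slow_preprocess(wikitext):
    """Stand-in for the preprocessor pass; count how often it runs."""
    global parse_count
    parse_count += 1
    return wikitext.split()  # toy "preprocessor tree"

_pp_cache = {}

def get_tree(title, fetch):
    """Return the cached preprocessor tree for a title, parsing at most once
    per process. `fetch` maps a title to its raw wikitext."""
    if title not in _pp_cache:
        _pp_cache[title] = slow_preprocess(fetch(title))
    return _pp_cache[title]

store = {"Template:Foo": "some template body"}
t1 = get_tree("Template:Foo", store.get)
t2 = get_tree("Template:Foo", store.get)  # cache hit: no second parse
```

Note what this caches and what it doesn't: the pre-expansion structure is shared across uses, but expansion with each call's particular parameters still has to happen every time.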
If the answer is "yes", i.e. if the "running code" of the whole template is somehow saved and cached, ready to be used again with new parameters, perhaps it would be a good idea to build templates as "libraries of different templates", using the name of the template as a "library name" and a parameter as the name of a "specific function"; a simple #switch could select the appropriate code for that "specific function".
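For readers unfamiliar with the pattern, a hypothetical library template of this shape (the template name and functions are invented; #switch and #expr are the standard ParserFunctions) might look like:

```
{{#switch: {{{1|}}}
 | double = {{#expr: {{{2|0}}} * 2 }}
 | square = {{#expr: {{{2|0}}} * {{{2|0}}} }}
 | #default = ''unknown function''
}}
```

Saved as, say, Template:MathLib, it would be invoked as {{MathLib|double|21}}, with the first parameter selecting the "function" and the rest acting as its arguments.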
I think for the most part, it'll be preferable to only have to work with the functions that are needed, rather than fetching a large number of unneeded functions at once. Even if it's pre-parsed, loading unneeded stuff means more CPU used, more memory used, more network bandwidth used.
But being able to bundle together related things as a unit that can be distributed together would be very nice, and should be considered for future work on new templating and gadget systems.
-- brion