Hello all,
After the new version of LabeledSectionTransclusion (LST) was deployed on
itwikisource, performance issues popped up. itwikisource's main page makes
heavy use of LST, and the new version is clearly heavier than the old one.
In this mail, I'll try to describe the aims of the new version, how the old
version worked and how the new version works.
Aims
-------
In the old situation, it was possible to transclude sections of pages by
marking them with <section> tags. However, it was impossible
to include those tags from within a template. For example, given
page P: something before <section begin=a />something with a<section end=a />
something after
page Q: {{#lst:P|a}}
then Q was rendered as
something with a
However, it was not possible to do something like the following and have Q
render the header text:
page O: ===<section begin=header />{{{1}}}<section end=header />===
page P: {{O|Some header text}}
page Q: {{#lst:P|header}}
Changes in the #lst parser
--------------------------------------
The template case failed because the old #lst mechanism did something
along these lines:
1) get the DOM using $parser->getTemplateDom( $title ); note that this is a
non-expanded DOM, i.e. templates in it are not expanded
2) traverse this DOM, find the section tags, and call
$parser->replaceVariables(....) on the relevant sections
Section tags that only appear once a template has been expanded are
therefore never found in step (2).
In the new situation, the #lst mechanism does something like:
1) get expanded wikitext using
$parser->preprocess("{{:page_to_be_transcluded}}")
2) get the DOM by calling $parser->preprocessToDom() on the expanded
wikitext
3) traverse this DOM, find section tags, and call
$parser->replaceVariables(....) on the relevant sections (unchanged)
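To make the difference between the two flows concrete, here is a toy model in Python. It is only a sketch and not the extension's code: the PAGES dict, the single-argument template regex and the function names are all made up for illustration, and the final replaceVariables() step is left out. It only shows why looking for section markers before template expansion misses markers that a template produces.

```python
import re

# Hypothetical in-memory wiki: page title -> wikitext.
PAGES = {
    "O": "===<section begin=header />{{{1}}}<section end=header />===",
    "P": "{{O|Some header text}}",
}

# Toy template syntax: only {{Name|arg}} with exactly one positional argument.
TEMPLATE = re.compile(r"\{\{([^{}|]+)\|([^{}|]*)\}\}")


def expand_templates(text):
    """Expand {{Name|arg}} calls, substituting {{{1}}} in the template body."""
    def substitute(match):
        body = PAGES[match.group(1)]
        return expand_templates(body.replace("{{{1}}}", match.group(2)))
    return TEMPLATE.sub(substitute, text)


def extract_section(text, name):
    """Return the text between the begin/end markers for `name`, or '' if absent."""
    pattern = re.compile(
        r"<section begin={0} />(.*?)<section end={0} />".format(re.escape(name)),
        re.S,
    )
    return "".join(pattern.findall(text))


def lst_old(title, name):
    # Old flow: search the page's own, unexpanded wikitext for section markers.
    return extract_section(PAGES[title], name)


def lst_new(title, name):
    # New flow: expand templates first, then search for section markers.
    return extract_section(expand_templates(PAGES[title]), name)
```

In this model lst_old("P", "header") finds nothing, because the markers live in O and P's own text contains only the template call, while lst_new("P", "header") finds the header text. The price is that the whole page is expanded on every call.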
One obvious performance issue is that steps (1) and (2) of the new flow are
not cached: neither within one request (so if a page uses {{#lst}} on the
same page twice, that page is processed twice), nor between requests.
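As an illustration of the within-request part only, a per-request memoization could look roughly like the Python sketch below. The class and slow_expand() are hypothetical stand-ins, not MediaWiki APIs. Caching between requests is a different problem, since it needs invalidation whenever the transcluded page or one of its templates changes, and this sketch does not attempt that.

```python
class ExpansionCache:
    """Memoize an expensive per-page expansion for the lifetime of one request."""

    def __init__(self, expand):
        self._expand = expand   # callable: title -> expanded result
        self._cache = {}
        self.misses = 0         # how many times the expensive path actually ran

    def get(self, title):
        if title not in self._cache:
            self.misses += 1
            self._cache[title] = self._expand(title)
        return self._cache[title]


def slow_expand(title):
    # Hypothetical stand-in for the expensive steps (1) and (2).
    return "expanded:" + title


cache = ExpansionCache(slow_expand)
first = cache.get("P")
second = cache.get("P")  # served from the cache; slow_expand is not called again
```

With something like this, a page that transcludes several sections from the same source page would pay for the expansion once per request instead of once per {{#lst}} call.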
In general, I think it would be preferable not to do a full parse, but
just to expand the DOM of the templates. Unfortunately, I have not been
able to find a simple way to do this: PPFrame::expand expands the templates
to their final form, not to an 'expanded DOM'.
I don't know MediaWiki caching well enough to say which caches are used
(or not), or what an effective caching strategy would be.
Any ideas on how to do LST without bluntly doing a full page parse for
every transcluded page, or on caching strategies, would be very welcome.
Best,
Merlijn