I have an extension that parses the contents of a page and stores the
content of certain embedded tags in the database. I want the parsing to take
place after the pre-processing (comment removal, template expansion, etc.).
I also need the code to be compatible with MW1.6, as I am currently unable
to upgrade to PHP5 (hopefully soon...).
Here is the code I was using until recently (where $Text is the unmodified
article text):

    // Create new Parser object to deal with some transformations that are
    // required before saving.
    $Parser = new Parser();

    // Use the Parser object to strip out html comments, nowiki and pre tags
    // and whatever other bits shouldn't make it through when rendering (so
    // they don't affect saving).
    $ParserOptions = new ParserOptions();
    $StripState =& $Parser->mStripState;
    $Parser->mOptions = $ParserOptions;
    $TidyText = $Parser->strip($Text, $StripState, true);

    // Then replace any variables, parser functions etc. so that 'hidden' tags
    // (e.g. tags that are created by code, such as using the ExpandAfter
    // extension) are expanded properly for saving.
    $Parser->mFunctionHooks = $wgParser->mFunctionHooks;
    $Parser->mTitle =& $wgParser->mTitle;
    $TidyText = $Parser->replaceVariables($TidyText);

However, when I recently tested this on MW1.12, it gave the following
error:

    Fatal error: Call to a member function matchStartToEnd() on a non-object
    in Parser.php on line 2771

I fixed this by inserting the following two lines just before the second
$TidyText assignment (the call to replaceVariables()):

    $Parser->mVariables =& $wgParser->mVariables;
    $Parser->mOutput =& $wgParser->mOutput;

It is clear to me that this is the wrong approach. I shouldn't have to mess
with the internals of the parser object just to pre-process the text, as
the code will break whenever the parser object is updated!
Can someone tell me the correct forward-compatible way to pre-process
article text in this manner?
- Mark Clements (HappyDog).