I have an extension which parses the contents of a page and stores the content of certain embedded tags in the database. I want the parsing to take place after the pre-processing (comment removal, template expansion, etc.). I also need the code to be compatible with MW1.6, as I am currently unable to upgrade to PHP5 (hopefully soon...).
Here is the code I was using until recently (where $Text is the unmodified article text):
    // Create new Parser object to deal with some transformations that are
    // required before saving.
    $Parser = new Parser();

    // Use the Parser object to strip out html comments, nowiki and pre tags
    // and whatever other bits shouldn't make it through when rendering (so
    // they don't affect saving).
    $ParserOptions = new ParserOptions();
    $StripState =& $Parser->mStripState;
    $Parser->mOptions = $ParserOptions;
    $TidyText = $Parser->strip($Text, $StripState, true);

    // Then replace any variables, parser functions etc. so that 'hidden' tags
    // (e.g. tags that are created by code, such as using the ExpandAfter
    // extension) are expanded properly for saving.
    $Parser->mFunctionHooks = $wgParser->mFunctionHooks;
    $Parser->mTitle =& $wgParser->mTitle;
    $TidyText = $Parser->replaceVariables($TidyText);
However, when I recently tested this on MW1.12, it gave the following error:
Fatal error: Call to a member function matchStartToEnd() on a non-object in Parser.php on line 2771
I fixed this by inserting the following two lines just before the second $TidyText = ... assignment:
    $Parser->mVariables =& $wgParser->mVariables;
    $Parser->mOutput =& $wgParser->mOutput;
Now, it is clear to me that this is the wrong way to go about it. I shouldn't have to mess with the internals of the parser object just to pre-process the text, as the code will clearly break whenever the parser object is updated!
Can someone tell me the correct forward-compatible way to pre-process article text in this manner?
- Mark Clements (HappyDog).