What is the simplest, correct way to create a new Parser object with the same initialization as the current Parser object (e.g., $wgParser)? An actual code fragment would be great.
Application: I have written a parser function extension that, internally, needs to parse several other wiki pages to complete its task. When I do this with the Parser object supplied to my hook function, there are all kinds of unwanted side-effects. For example, the ParserOutput's category links get applied to the current article, which I don't want, but if I delete them (e.g., $parserOutput->setCategoryLinks(array())), this causes worse problems. So I'd rather use a fresh new Parser, except it doesn't have all the tag extensions & parser functions initialized, and probably a bunch of other things missing too....
Thanks, DanB
On Wed, Sep 8, 2010 at 3:07 PM, Daniel Barrett danb@vistaprint.com wrote:
What is the simplest, correct way to create a new Parser object with the same initialization as the current Parser object (e.g., $wgParser)? An actual code fragment would be great.
$myParser = clone $wgParser;
Or am I missing something here?
-Chad
On 09/09/10 08:42, Chad wrote:
On Wed, Sep 8, 2010 at 3:07 PM, Daniel Barrett danb@vistaprint.com wrote:
What is the simplest, correct way to create a new Parser object with the same initialization as the current Parser object (e.g., $wgParser)? An actual code fragment would be great.
$myParser = clone $wgParser;
Or am I missing something here?
I'm afraid so. That's basically the same as making a new parser object, and then assigning each of the member variables in turn. So the object members (preprocessor, link holders, strip state, etc.), end up being handles to the same objects, which means that when you call the two parsers, they interfere with each other.
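The shallow-copy behaviour described above can be reproduced outside MediaWiki. The following is only a minimal sketch: ToyParser and ToyStripState are hypothetical stand-ins invented for illustration, not MediaWiki classes. They mimic a parser that holds one object member and one scalar member.

```php
<?php
// Hypothetical stand-ins for illustration only; these are NOT MediaWiki classes.
class ToyStripState {
    public $items = array();
}

class ToyParser {
    public $stripState;   // object member: clone copies the handle, not the object
    public $depth = 0;    // scalar member: clone copies the value

    public function __construct() {
        $this->stripState = new ToyStripState();
    }
}

$original = new ToyParser();
$copy = clone $original;

// Scalars are independent after the clone...
$copy->depth = 5;

// ...but both parsers still point at the same ToyStripState object,
// so a change made through the clone is visible through the original.
$copy->stripState->items[] = 'marker added through the clone';

var_dump( $original->depth );                            // int(0)
var_dump( $original->stripState === $copy->stripState ); // bool(true)
var_dump( count( $original->stripState->items ) );       // int(1)
```

In this toy model the interference only goes away once the clone gets its own ToyStripState, which is the kind of cleanup a state reset has to do after a clone.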
Parser::clearState() should be enough to fix this (or calling some parser method that calls clearState()), and that's what MessageCache::transform() relies on when it clones the parser. Parser::clearState() has some special hacks in it to clean up after a clone.
Previously I attempted to support extensions which want to clone the parser and then call it without calling clearState(), but I eventually gave up on that idea on the basis that it's unmaintainable. So now the options are clearState() or find some other way to do what it is you're doing.
On 09/09/10 05:07, Daniel Barrett wrote:
What is the simplest, correct way to create a new Parser object with the same initialization as the current Parser object (e.g., $wgParser)? An actual code fragment would be great.
$myParser = clone $wgParser;
$myParser->preprocess(...);
$myParser->parse(...);
Only call the entry points listed in the Parser class doc comment.
NOT:
$myParser->replaceVariables(...);
or something like that. If you're really desperate you can call Parser::startExternalParse() followed by some non-entry-point function.
-- Tim Starling
On 09-09-2010 00:08, Tim Starling wrote:
Parser::clearState() should be enough to fix this (or calling some parser method that calls clearState()), and that's what MessageCache::transform() relies on when it clones the parser. Parser::clearState() has some special hacks in it to clean up after a clone.
This is good to know. I have a similar problem in Extension:Wikilog. It needs a separate parser to generate Atom/RSS feeds, which have several restrictions (all URLs must be absolute, TOCs must not be generated, etc.).
Currently, it has two options: either cloning $wgParser (the default) or creating a new Parser object. The problem with the second option is that there are way too many broken extensions that initialize $wgParser directly instead of using the $parser parameter passed to the ParserFirstCallInit hook.
With this information now, I may consider settling on a $wgParser clone to satisfy these needs.
Regards,
On Wed, Sep 8, 2010 at 11:08 PM, Tim Starling tstarling@wikimedia.org wrote:
Parser::clearState() should be enough to fix this (or calling some parser method that calls clearState()), and that's what MessageCache::transform() relies on when it clones the parser. Parser::clearState() has some special hacks in it to clean up after a clone.
Previously I attempted to support extensions which want to clone the parser and then call it without calling clearState(), but I eventually gave up on that idea on the basis that it's unmaintainable. So now the options are clearState() or find some other way to do what it is you're doing.
Is there a reason __clone() couldn't (shouldn't?) call clearState()?
-Chad
On 09/09/10 22:49, Chad wrote:
Is there a reason __clone() couldn't (shouldn't?) call clearState()?
I think that would count as "surprising" (as in the principle of least surprise). Coders would not expect "clone $parser" to clear the state in the new clone. It's conceivable that it could introduce bugs. Also clearState() is fairly slow, so you wouldn't want to call it more times than necessary.
Something like Parser::cloneAndClear() could be documented to return a clone of $this and simultaneously clear the state; that would be somewhat less likely to surprise.
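As a rough sketch of what such a method might look like, here is a stand-alone toy model. SketchParser and SketchLinkHolder are hypothetical classes made up for this example, and this clearState() is a simplified stand-in; the real Parser::clearState() does considerably more. The point is only the shape: the clone keeps its configuration (hooks) but gets fresh per-parse state.

```php
<?php
// Hypothetical toy model for illustration; NOT the MediaWiki Parser class.
class SketchLinkHolder {
    public $links = array();
}

class SketchParser {
    public $hooks = array();   // configuration worth keeping in the clone
    public $linkHolder;        // per-parse state that must not be shared

    public function __construct() {
        $this->linkHolder = new SketchLinkHolder();
    }

    // Simplified stand-in: replace per-parse state with a fresh object.
    public function clearState() {
        $this->linkHolder = new SketchLinkHolder();
    }

    // Explicitly named, so the state reset does not surprise the caller.
    public function cloneAndClear() {
        $copy = clone $this;
        $copy->clearState();
        return $copy;
    }
}

$main = new SketchParser();
$main->hooks['mytag'] = 'renderMyTag';
$main->linkHolder->links[] = 'Category:Example';

$fresh = $main->cloneAndClear();
$fresh->linkHolder->links[] = 'Category:Other';

var_dump( array_keys( $fresh->hooks ) );        // one key: "mytag"
var_dump( count( $main->linkHolder->links ) );  // int(1): the original is untouched
```

Keeping the reset out of __clone() means a plain "clone $parser" still behaves like any other PHP clone, and the slower state reset only runs when a caller asks for it.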
-- Tim Starling
mediawiki-l@lists.wikimedia.org