Cloning the parser?


Daniel Barrett

8 Sep 2010

9:07 p.m.

What is the simplest, correct way to create a new Parser object with the same initialization as the current Parser object (e.g., $wgParser)? An actual code fragment would be great.

Application: I have written a parser function extension that, internally, needs to parse several other wiki pages to complete its task. When I do this with the Parser object supplied to my hook function, there are all kinds of unwanted side effects. For example, the ParserOutput's category links get applied to the current article, which I don't want, but if I delete them (e.g., $parserOutput->setCategoryLinks(array())), this causes worse problems. So I'd rather use a fresh new Parser, except it doesn't have all the tag extensions and parser functions initialized, and probably a bunch of other things are missing too...

Thanks, DanB

Chad

9 Sep 2010

12:42 a.m.

New subject: [Mediawiki-l] Cloning the parser?

On Wed, Sep 8, 2010 at 3:07 PM, Daniel Barrett danb@vistaprint.com wrote:

...

What is the simplest, correct way to create a new Parser object with the same initialization as the current Parser object (e.g., $wgParser)? An actual code fragment would be great.

$myParser = clone $wgParser;

Or am I missing something here?

-Chad

Tim Starling

5:08 a.m.

On 09/09/10 08:42, Chad wrote:

...

On Wed, Sep 8, 2010 at 3:07 PM, Daniel Barrett danb@vistaprint.com wrote:

...
What is the simplest, correct way to create a new Parser object with the same initialization as the current Parser object (e.g., $wgParser)? An actual code fragment would be great.

$myParser = clone $wgParser;

Or am I missing something here?

I'm afraid so. That's basically the same as making a new parser object, and then assigning each of the member variables in turn. So the object members (preprocessor, link holders, strip state, etc.) end up being handles to the same objects, which means that when you call the two parsers, they interfere with each other.
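
The shallow-copy behaviour described here can be reproduced without MediaWiki. The following is a minimal sketch using hypothetical stand-in classes (DemoStripState and SharedStateParser are not MediaWiki classes); it only illustrates how PHP's clone copies object handles rather than the objects they point to.

```php
<?php
// Hypothetical stand-ins for illustration; NOT the real MediaWiki classes.
class DemoStripState {
    public $items = array();
}

class SharedStateParser {
    public $stripState;

    public function __construct() {
        $this->stripState = new DemoStripState();
    }
}

$original = new SharedStateParser();
$copy = clone $original; // shallow copy: the member handle is copied, not the object

// Writing through the clone...
$copy->stripState->items[] = 'added via the clone';

// ...is visible through the original, because both hold the same object.
var_dump( $copy->stripState === $original->stripState );
var_dump( count( $original->stripState->items ) );
```

In this toy, both parsers end up writing into one strip state object, which is the kind of interference described above.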

Parser::clearState() should be enough to fix this (or calling some parser method that calls clearState()), and that's what MessageCache::transform() relies on when it clones the parser. Parser::clearState() has some special hacks in it to clean up after a clone.

Previously I attempted to support extensions which want to clone the parser and then call it without calling clearState(), but I eventually gave up on that idea on the basis that it's unmaintainable. So now the options are clearState() or find some other way to do what it is you're doing.
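
As a rough illustration of why clearing state after a clone helps, here is a toy sketch. The classes DemoLinkHolder and ClearingParser are hypothetical, and this clearState() only swaps in a fresh member object; it is an assumption about the general idea, not a description of what the real Parser::clearState() does internally.

```php
<?php
// Hypothetical stand-ins for illustration; NOT the real MediaWiki classes.
class DemoLinkHolder {
    public $links = array();
}

class ClearingParser {
    public $linkHolder;

    public function __construct() {
        $this->linkHolder = new DemoLinkHolder();
    }

    // Toy analogue of clearing per-parse state: replace shared objects with fresh ones.
    public function clearState() {
        $this->linkHolder = new DemoLinkHolder();
    }

    // Toy entry point: clears state first, then does its work.
    public function parse( $text ) {
        $this->clearState();
        $this->linkHolder->links[] = $text;
        return count( $this->linkHolder->links );
    }
}

$main = new ClearingParser();
$main->linkHolder->links[] = 'Main page link';

$inner = clone $main;              // shares $main's link holder at this point
$inner->parse( 'Other page link' ); // the entry point clears state, detaching the clone

var_dump( $inner->linkHolder === $main->linkHolder );
var_dump( $main->linkHolder->links );
```

Because the toy entry point clears state before doing anything else, the clone stops sharing the link holder and the original's data is left untouched.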

On 09/09/10 05:07, Daniel Barrett wrote:

...

What is the simplest, correct way to create a new Parser object with the same initialization as the current Parser object (e.g., $wgParser)? An actual code fragment would be great.

$myParser = clone $wgParser;
$myParser->preprocess(...);
$myParser->parse(...);

Only call the entry points listed in the Parser class doc comment.

NOT:

$myParser->replaceVariables(...);

or something like that. If you're really desperate you can call Parser::startExternalParse() followed by some non-entry-point function.

-- Tim Starling

Juliano F. Ravasi

5:30 a.m.

On 09-09-2010 00:08, Tim Starling wrote:

...

Parser::clearState() should be enough to fix this (or calling some parser method that calls clearState()), and that's what MessageCache::transform() relies on when it clones the parser. Parser::clearState() has some special hacks in it to clean up after a clone.

This is good to know. I have a similar problem in Extension:Wikilog. It needs a separate parser to generate Atom/RSS feeds, which have several restrictions (all URLs must be absolute, TOCs must not be generated, etc.).

Currently, it has two options: either cloning $wgParser (the default) or creating a new Parser object. The problem with the second option is that there are way too many broken extensions that initialize $wgParser directly instead of using the $parser parameter passed to the ParserFirstCallInit hook.
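
To make the failure mode concrete, here is a small self-contained sketch. HookDemoParser, newInitializedParser() and the handler list are hypothetical stand-ins for the parser, parser construction and the ParserFirstCallInit hook list; the tag names and callback names are made up. It only shows why a handler that registers on a global instance leaves a freshly created parser without that tag.

```php
<?php
// Hypothetical stand-in for a parser that can register tag hooks.
class HookDemoParser {
    public $tags = array();

    public function setHook( $tag, $callback ) {
        $this->tags[$tag] = $callback;
    }
}

// Hypothetical helper: build a parser and run every init handler on it.
function newInitializedParser( array $handlers ) {
    $parser = new HookDemoParser();
    foreach ( $handlers as $handler ) {
        call_user_func( $handler, $parser );
    }
    return $parser;
}

$globalParser = new HookDemoParser(); // plays the role of $wgParser
$initHandlers = array();

// Broken extension: ignores the parameter and registers on the global parser.
$initHandlers[] = function ( $parser ) use ( $globalParser ) {
    $globalParser->setHook( 'broken', 'renderBroken' );
};

// Well-behaved extension: registers on the parser it was handed.
$initHandlers[] = function ( $parser ) {
    $parser->setHook( 'good', 'renderGood' );
};

$feedParser = newInitializedParser( $initHandlers );

var_dump( array_keys( $feedParser->tags ) );   // tags known to the new parser
var_dump( array_keys( $globalParser->tags ) ); // tags that landed on the global one
```

In this sketch the new parser only learns the tag from the handler that used its parameter; the other tag ends up on the global instance alone.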

With this information, I may settle on a $wgParser clone to satisfy these needs.

Regards,

-- Juliano F. Ravasi ·· http://juliano.info/ 5105 46CC B2B7 F0CD 5F47 E740 72CA 54F4 DF37 9E96

Chad

2:49 p.m.

On Wed, Sep 8, 2010 at 11:08 PM, Tim Starling tstarling@wikimedia.org wrote:

...

Is there a reason __clone() couldn't (shouldn't?) call clearState()?

-Chad

Tim Starling

10 Sep 2010

5:15 a.m.

On 09/09/10 22:49, Chad wrote:

...

Is there a reason __clone() couldn't (shouldn't?) call clearState()?

I think that would count as "surprising" (as in the principle of least surprise). Coders would not expect "clone $parser" to clear the state in the new clone. It's conceivable that it could introduce bugs. Also clearState() is fairly slow, so you wouldn't want to call it more times than necessary.

Something like Parser::cloneAndClear() could be documented to return a clone of $this and simultaneously clear the state; that would be somewhat less likely to surprise.
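
A minimal sketch of what such a method might look like, on a toy class rather than the real Parser. DemoState and CloneAndClearParser are hypothetical, and cloneAndClear() here is only an illustration of the suggestion, not an existing MediaWiki method. Plain clone keeps its usual semantics, while the explicitly named method returns a copy with cleared state.

```php
<?php
// Hypothetical stand-ins for illustration; NOT the real MediaWiki classes.
class DemoState {
    public $data = array();
}

class CloneAndClearParser {
    public $state;

    public function __construct() {
        $this->state = new DemoState();
    }

    public function clearState() {
        $this->state = new DemoState();
    }

    // Hypothetical method: the name documents that the copy comes back cleared.
    public function cloneAndClear() {
        $copy = clone $this;
        $copy->clearState();
        return $copy;
    }
}

$parser = new CloneAndClearParser();
$parser->state->data[] = 'in progress';

$plain = clone $parser;             // unsurprising: state is carried over (and shared)
$fresh = $parser->cloneAndClear();  // explicit: independent, empty state

var_dump( count( $plain->state->data ) );
var_dump( count( $fresh->state->data ) );
var_dump( count( $parser->state->data ) );
```

Keeping __clone() untouched means existing callers of clone see no change, and the cost of clearing is paid only by callers that ask for it.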

-- Tim Starling

mediawiki-l@lists.wikimedia.org