Inclusion request for the Validator extension

List overview All Threads
Download

newer

older

Parser postprocessor

Deprecating content_actions

Jeroen De Dauw

11 Jan 2011 11 Jan '11

4:35 p.m.

Hey,

Over the past year I've been working on an extension to facilitate parameter handling in MediaWiki, with a focus on parser hooks. It's titled Validator [0], which currently is a bit misleading since it enables a lot more then simple validation. As the only thing this extension does is facilitate parameter handling in other extensions, I think it makes sense to include it into core, or at least in the default MediaWiki distribution.

I created this extension out of frustration as an extension developer that to create a parser hook, you need to do the same plumbing over and over again, and have to write a whole mess of parsing and validation code that is similar for almost every parser hook. Of course this is doable, but it's error prone, causes small differences in how exactly parameters are handled in different parser hooks (not very nice for the end users), and is hard to maintain. If you have done this a few times, it becomes rather obvious that a more generic framework to handle parameters would be a big win. You want to describe a parameter, not all the details of how it should be handled.

Using Validator is somewhat similar to how API classes and their getParameters methods work, but more powerful and extendible. A parser hook can be created by deriving from the ParserHook class added by Validator, and implementing or overriding some methods to specify the name, aliases (if any), parameters (and all their meta data) and actual handling function of the parser hook. This last method gets called with a list of all parameters handled by Validator, and in most cases won't need any extra work. This ParserHook class is just a wrapper around creating parser functions and tag extensions and using the actual validation class of validator. You can directly handle parameters using the Validator class. A nice example of ParserHook usage can be found in the SubPageList extension [1]. The Maps and Semantic Maps extensions also use Validator, and contain more complex examples (with parameters dependent on others, ect) that implement the display_map and display_points parser hooks [2].

Putting this functionality in core would be a big help to everyone writing new parameter handling code, and would enable cleaning up existing implementations where this is desired. As this functionality does not replace anything in core, putting it in would not disrupt anything. I'm willing to do the little work required to merge Validator into core.

I could just do that right now of course, but I'm quite sure some people would object to that. So can these people please respond to this email in some constructive manner, so this request does not simply get ignored for no good reason?

[0] https://secure.wikimedia.org/wikipedia/mediawiki/wiki/Extension:Validator [1] https://secure.wikimedia.org/wikipedia/mediawiki/wiki/Extension:SubPageList [2] http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/Maps/includes/par...

Cheers

-- Jeroen De Dauw http://blog.bn2vs.com Don't panic. Don't be evil. --

Show replies by date

Platonides

12 Jan 12 Jan

12:23 a.m.

Well, I just had an issue with Validator, so I am not too sympatatic with your extension right now ;) After grepping for setHook, it turns out that an extension like Maps, that has zero matches, sets parser hooks indirectly via Validator extension. And not only that, but it also sets a hook for a different name. It seems to set a hook for <display_map> but actually sets it for <display map> (Why??) so that even looking for the full tag name doesn't give you any result.

It makes things more complex. On the maps case, the parameters need a deal more validation, but for most cases $parser->setHook() is clearer than registering a hook to class::staticInit() for a class which extends ParserHook and has some functions returning the hook configuration.

Jeroen De Dauw

8:51 p.m.

Hey,

...

After grepping for setHook, it turns out that an extension like Maps, that

has zero matches, sets parser hooks indirectly via Validator extension.

Indeed. Innovation cannot happen without change. I'd argue that deriving from the ParserHook class makes it easier to find parser hooks, but this has quite little to do with my inclusion request.

...

It seems to set a hook for <display_map> but actually sets it for <display

map> (Why??)

It sets {{#display_map}} and <display map>. That's the desired behaviour of this parser hook.

...

It makes things more complex.

It makes creating parser hooks easier by hiding most of the complexity that comes with handling their parameters. How does it make things more complex?! The code itself is of course as complex as it needs to be to handle various use cases. Arguing against that is the same as arguing against the Html, Http and similar classes, and is quite absurd IMO.

As for your criticism on how I choose to register ParserHook classes: due to limitations in the current version of PHP I could not fully do what I wanted there, and settled for the most programmatic solution I saw. If you have a better approach, feel very free to share it with me, or fix it yourself. This really is just an implementation detail, and again has little to do with my inclusion request.

Cheers

-- Jeroen De Dauw http://blog.bn2vs.com Don't panic. Don't be evil. --

Brion Vibber

10:43 p.m.

On Wed, Jan 12, 2011 at 3:51 PM, Jeroen De Dauw jeroendedauw@gmail.comwrote:

...

Hey,

...
After grepping for setHook, it turns out that an extension like Maps,

that has zero matches, sets parser hooks indirectly via Validator extension.

Indeed. Innovation cannot happen without change. I'd argue that deriving from the ParserHook class makes it easier to find parser hooks, but this has quite little to do with my inclusion request.

...
It seems to set a hook for <display_map> but actually sets it for

<display map> (Why??)

It sets {{#display_map}} and <display map>. That's the desired behaviour of this parser hook.

I think <display map> would be parsed as the tag hook "display" with a parameter "map"="map". Would this prevent any use of the hook registered as the tag "display map"...?

-- brion

Jeroen De Dauw

10:53 p.m.

Hey,

...

I think <display map> would be parsed as the tag hook "display" with a

parameter "map"="map". Would this prevent any use of the hook registered as the tag "display map"...?

This code has been there for a few months now, and appears to be working as expected. The tag extension gets handled properly and no "map" parameter is passed along.

Cheers

-- Jeroen De Dauw http://blog.bn2vs.com Don't panic. Don't be evil. --

Brion Vibber

11:06 p.m.

On Wed, Jan 12, 2011 at 5:53 PM, Jeroen De Dauw jeroendedauw@gmail.comwrote:

...

Hey,

...
I think <display map> would be parsed as the tag hook "display" with a

parameter "map"="map". Would this prevent any use of the hook registered as the tag "display map"...?

This code has been there for a few months now, and appears to be working as expected. The tag extension gets handled properly and no "map" parameter is passed along.

That seems highly contrary to expectations when dealing with tag hooks. What if there's a tag hook "display" registered by another plugin, does it get overridden? Overridden only if a "map" parameter is supplied?

-- brion

Jeroen De Dauw

11:13 p.m.

Hey,

...

That seems highly contrary to expectations when dealing with tag hooks.

What if there's a tag hook "display" registered by another plugin, does it get overridden? Overridden only if a "map" parameter is supplied?

That's a good point, which I did not consider when naming the hook like this. Sorting the tag extensions desc by amount of "words" in their name and then handling them in that order should prevent anything from getting overridden no? In any case, I'll have a closer look at this for the next version of the extension.

Cheers

-- Jeroen De Dauw http://blog.bn2vs.com Don't panic. Don't be evil. --

Brion Vibber

11:27 p.m.

On Wed, Jan 12, 2011 at 6:13 PM, Jeroen De Dauw jeroendedauw@gmail.comwrote:

...

...
That seems highly contrary to expectations when dealing with tag hooks.

What if there's a tag hook "display" registered by another plugin, does it get overridden? Overridden only if a "map" parameter is supplied?

That's a good point, which I did not consider when naming the hook like this. Sorting the tag extensions desc by amount of "words" in their name and then handling them in that order should prevent anything from getting overridden no? In any case, I'll have a closer look at this for the next version of the extension.

It looks like Parser::setHook() and Parser::setTransparentTagHook() don't do any validation on tag hook names. At least the search for transparent tag hooks is very naive, assuming that only valid names were passed and knowing that valid names contain only characters which carry no special meaning in regex syntax:

$elements = array_keys( $this->mTransparentTagHooks ); $text = $this->extractTagsAndParams( $elements, $text, $matches, $uniq_prefix );

... in which happens...

$taglist = implode( '|', $elements ); $start = "/<($taglist)(\s+[^>]*?|\s*?)(/?" . ">)|<(!--)/i";

This would explain why you don't get an error thrown when setting the tag hook with a space in the name (parser assumes extensions are acting correctly, and doesn't validate), and why the incorrect entry gets "parsed" as it does (the code that extracts them assumes valid input, so uses a string match that's convenient to write when those assumptions hold).

I haven't dared look at the preprocessor code in case horrible things happen there too. :)

Platonides wrote:

...

It is order dependant. If you install the extension providing 'display', then the one providing 'display map', the latter will never be called. If 'display map' gets registered first, both will work (your users might get annoyed, though).

Guess what I tried to do in r80025? :)

I'd recommend whitelisting legit characters rather than blacklisting a handful of illegit characters; tag hook name validation should probably sensibly match the rules of XML element naming, as that's what the syntax aims to look and act like.

It'd also be best to write the validation code only once and call it from multiple locations, rather than duplicating the code. This will make it much easier to maintain, as there will be less danger of the multiple copies getting out of sync.

-- brion

Platonides

11:17 p.m.

Brion Vibber wrote:

...

On Wed, Jan 12, 2011 at 5:53 PM, Jeroen De Dauw jeroendedauw@gmail.comwrote:

...
Hey,

...
I think <display map> would be parsed as the tag hook "display" with a

parameter "map"="map". Would this prevent any use of the hook registered as the tag "display map"...?

This code has been there for a few months now, and appears to be working as expected. The tag extension gets handled properly and no "map" parameter is passed along.

That seems highly contrary to expectations when dealing with tag hooks. What if there's a tag hook "display" registered by another plugin, does it get overridden? Overridden only if a "map" parameter is supplied?

-- brion

It is order dependant. If you install the extension providing 'display', then the one providing 'display map', the latter will never be called. If 'display map' gets registered first, both will work (your users might get annoyed, though).

Guess what I tried to do in r80025? :)

Daniel Friesen

13 Jan 13 Jan

2:19 a.m.

On 11-01-12 03:17 PM, Platonides wrote:

...

Brion Vibber wrote:

...
On Wed, Jan 12, 2011 at 5:53 PM, Jeroen De Dauwjeroendedauw@gmail.comwrote:

...
Hey,

...
I think<display map> would be parsed as the tag hook "display" with a

parameter "map"="map". Would this prevent any use of the hook registered as the tag "display map"...?

This code has been there for a few months now, and appears to be working as expected. The tag extension gets handled properly and no "map" parameter is passed along.

That seems highly contrary to expectations when dealing with tag hooks. What if there's a tag hook "display" registered by another plugin, does it get overridden? Overridden only if a "map" parameter is supplied?

-- brion

It is order dependant. If you install the extension providing 'display', then the one providing 'display map', the latter will never be called. If 'display map' gets registered first, both will work (your users might get annoyed, though).

Guess what I tried to do in r80025? :)

<tag name>...</tag name> Definitely sounds like something we shouldn't support. That kind of thing just screams "you're at the mercy of whatever developer in the future decides to change the parser to 'actually' parse out xml tags with more xml or html-like parsing, instead of simply doing a search and replace based on a list of tag names". Not to mention it's confusing and counterintuitive in articles themselves. What if the <display> tag actually supported a map argument? <display map foo> gets treated as a <display map> tag, but <display foo map> gets properly treated as a <display> tag? bleeeeech... And I have to feel sorry for the users that actually understand any small level of xml or html that are looking at <display map> in a random page and reading that as a tag with the tagName 'display'.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

-- ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Jeroen De Dauw

2:52 a.m.

Hey,

Although the tag name issue is a valid point, it has nothing to do with my original email. So please start a separate discussion rather then hijacking this thread.

Cheers

-- Jeroen De Dauw http://blog.bn2vs.com Don't panic. Don't be evil. --

On 13 January 2011 03:19, Daniel Friesen lists@nadir-seen-fire.com wrote:

...

<tag name>...</tag name> Definitely sounds like something we shouldn't support. That kind of thing just screams "you're at the mercy of whatever developer in the future decides to change the parser to 'actually' parse out xml tags with more xml or html-like parsing, instead of simply doing a search and replace based on a list of tag names". Not to mention it's confusing and counterintuitive in articles themselves. What if the <display> tag actually supported a map argument? <display map foo> gets treated as a <display map> tag, but <display foo map> gets properly treated as a <display> tag? bleeeeech... And I have to feel sorry for the users that actually understand any small level of xml or html that are looking at <display map> in a random page and reading that as a tag with the tagName 'display'.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Robert Leverington

6:38 a.m.

On 2011-01-13, Jeroen De Dauw wrote:

...

Hey,

Although the tag name issue is a valid point, it has nothing to do with my original email. So please start a separate discussion rather then hijacking this thread.

It is entirely to do with your originl e-mail, as it is a valid reason why your original request (inclusion of Validator in core) should not be carried out at this time, and discussion of it is important (at least if you intend to improve the extension to overcome this issue).

Robert

Jeroen De Dauw

2:18 p.m.

Hey,

...

It is entirely to do with your originl e-mail, as it is a valid reason why your original request (inclusion of Validator in core) should not be carried out at this time, and discussion of it is important (at least if you intend to improve the extension to overcome this issue).

This is just one single issue, and stands rather loose from the question if the extension should be included or not. You will always be able to find issues of some sort, so with this kind of focus on superficial bugs (this really is, it can be fixed in under a minute), and avoiding the real question of inclusion, nothing will ever happen.

First reach an agreement on if it's a good idea or not; only then take care of the small changes that might needed to be taken before including.

Cheers

-- Jeroen De Dauw http://blog.bn2vs.com Don't panic. Don't be evil. --

Ilmari Karonen

6:56 p.m.

On 01/13/2011 08:38 AM, Robert Leverington wrote:

...

On 2011-01-13, Jeroen De Dauw wrote:

...
Hey,

Although the tag name issue is a valid point, it has nothing to do with my original email. So please start a separate discussion rather then hijacking this thread.

It is entirely to do with your originl e-mail, as it is a valid reason why your original request (inclusion of Validator in core) should not be carried out at this time, and discussion of it is important (at least if you intend to improve the extension to overcome this issue).

Huh? Admittedly, I haven't reviewed the extension, but as far as I can tell from reading this discussion the two issues (merging of Validator into core, and the fact that the parser currently allows spaces in tag names) seem completely unrelated.

-- Ilmari Karonen

Daniel Friesen

30 Jan 30 Jan

12:07 a.m.

While reviewing SubPageList to see if it was good enough quality to install on a wiki of mine, I came round to Validator double checking it (since SubPageList uses Validator).

From the looks of the code, Validator, and various Validator based extensions appear to be using parser->parse() inside of hooks where they are supposed to be using ->recursiveTagParse with a proper frame. The Validator extension's api appears to provoke this bad practice because I don't see the frame in the arguments render methods are using.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

On 11-01-11 04:23 PM, Platonides wrote:

...

Well, I just had an issue with Validator, so I am not too sympatatic with your extension right now ;) After grepping for setHook, it turns out that an extension like Maps, that has zero matches, sets parser hooks indirectly via Validator extension. And not only that, but it also sets a hook for a different name. It seems to set a hook for<display_map> but actually sets it for <display map> (Why??) so that even looking for the full tag name doesn't give you any result.

It makes things more complex. On the maps case, the parameters need a deal more validation, but for most cases $parser->setHook() is clearer than registering a hook to class::staticInit() for a class which extends ParserHook and has some functions returning the hook configuration.

Jeroen De Dauw

12:42 a.m.

Thanks for pointing this out; I'll fix it soonish.

Please post such issues on the extensions discussion page or bugzilla. They have nothing to do with the Subject of this thread.

Send from my Android phone. On 30 Jan 2011 01:30, "Daniel Friesen" lists@nadir-seen-fire.com wrote:

...

While reviewing SubPageList to see if it was good enough quality to install on a wiki of mine, I came round to Validator double checking it (since SubPageList uses Validator).

From the looks of the code, Validator, and various Validator based extensions appear to be using parser->parse() inside of hooks where they are supposed to be using ->recursiveTagParse with a proper frame. The Validator extension's api appears to provoke this bad practice because I don't see the frame in the arguments render methods are using.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

On 11-01-11 04:23 PM, Platonides wrote:

...
Well, I just had an issue with Validator, so I am not too sympatatic with your extension right now ;) After grepping for setHook, it turns out that an extension like Maps, that has zero matches, sets parser hooks indirectly via Validator extension. And not only that, but it also sets a hook for a different name. It seems to set a hook for<display_map> but actually sets it for <display map> (Why??) so that even looking for the full tag name doesn't give you any result.

It makes things more complex. On the maps case, the parameters need a deal more validation, but for most cases $parser->setHook() is clearer than registering a hook to class::staticInit() for a class which extends ParserHook and has some functions returning the hook configuration.

Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Jeroen De Dauw

10:46 p.m.

Hey,

...

From the looks of the code, Validator, and various Validator based

extensions appear to be using parser->parse() inside of hooks where they are supposed to be using ->recursiveTagParse with a proper frame.

I was not aware of the correct approach here apparently. I had a look at the docs and hope I got it right now.

...

The Validator extension's api appears to provoke this bad practice because

I don't see the frame in the arguments render methods are using.

Now all info you need to handle parser functions and tag extensions separately is available in the render method. I also added a small utility function meant to abstract the difference, for people that like me do not want to have to care about if the call is coming from a tag or function.

Changes: * https://secure.wikimedia.org/wikipedia/mediawiki/wiki/Special:Code/MediaWiki... * https://secure.wikimedia.org/wikipedia/mediawiki/wiki/Special:Code/MediaWiki...

Does this address the issue correctly?

Cheers

-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. --

Daniel Friesen

31 Jan 31 Jan

1:02 a.m.

On 11-01-30 02:46 PM, Jeroen De Dauw wrote:

...

Hey,

...
From the looks of the code, Validator, and various Validator based

extensions appear to be using parser->parse() inside of hooks where they are supposed to be using ->recursiveTagParse with a proper frame.

I was not aware of the correct approach here apparently. I had a look at the docs and hope I got it right now.

...
The Validator extension's api appears to provoke this bad practice because

I don't see the frame in the arguments render methods are using.

Now all info you need to handle parser functions and tag extensions separately is available in the render method. I also added a small utility function meant to abstract the difference, for people that like me do not want to have to care about if the call is coming from a tag or function.

Changes:

https://secure.wikimedia.org/wikipedia/mediawiki/wiki/Special:Code/MediaWiki...

https://secure.wikimedia.org/wikipedia/mediawiki/wiki/Special:Code/MediaWiki...

Does this address the issue correctly?

Cheers

-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. --

Partially, but no.

$parser->parse should never be used within something called by $parser->parse (ie: Any parser function, tag hook, tag function hook, parser hook, etc...) (it's still being used in one branch).

What you use varies depending on what type of output is default for the type of tag/function you have, and what you intend to output.

From memory: A parser function by default is supposed to return WikiText with things like variables and other parserfunctions and whatnot expanded. You do this with $frame->expand( $text );. You do have the option of returning raw HTML, iirc you do this using `return array( $html, 'isHTML' => true );`. If you're returning html $parser->recursiveTagParse( $text, $frame ); gets you the html. (there were some other possible output options, I can't remember the exact results... I rarely output raw html from parser functions, it's not really their purpose)

iirc tag hooks default to outputting html, so naturally you use $parser->recursiveTagParse( $text, $frame ); to get full html back.

Muddying the water a little more, we don't have just parser functions and tags... We have setFunctionHook (parser functions), setHook (old style tag hooks), and setFunctionTagHook (new style function tag hooks). We also have setTransparentTagHook, which I haven't used much myself, though I believe it's more for creating things like <div>.

setHook and setFunctionTagHook both set <tag> style hooks. Originally we just had setHook, it had one short argument list. Later on that argument list was changed to add $frame to it. the newer setFunctionTagHook uses a different argument order, which appears to be the same set of arguments, just the first two and last two seam to be flipped. The primary difference is that setHook <tag> hooks are parsed in an old style, iirc pre-1.12 style. setFunctionTagHook are parsed by the preprocessor that parses parser functions, the contents aren't preprocessed but it's still parsed by the preprocessor. I'm not exactly sure of the difference, but setFunctionTagHook definitely exists because of preprocessor improvements and because applying the new parsing behaviors to setHook would result in different behavior for tags or code bugs... perhaps attr="{{{1}}}"'s are expanded in setFunctionTagHook's because they are parsed by the preprocessor. I'm not quite sure either what setFunctionTagHook's output is, or what the setFunctionHook style $flags are for, or support for the parser function style return of an array.

Tim probably needs to point something out to me.

The flaw I see here is Validator's render is coded in a way that you're always expecting to return html. Parser functions from the very start were intended to output WikiText, not html. Straight to html kills possibilities like [[{{#fn:...}}]] and subst:, you end up creating isolated sections where html is rendered out of scope with the rest of the document, which has the potential to not play well with markup outside of the block (partial tables?), and now you're making multiple recursive calls to the portions of code handling links, tables, other bits of html and whatnot, instead of doing that all at once. Now you have extensions that are writing WikiText and having them parsed to html, when in the case of a parser function you could be just expanding that wikitext and outputting it directly instead of going to html and ending up with more efficient and flexible extensions. And you can't simply say (if they called $this->parseWikitext( ... ); and this is a parser function I'll just output wikitext instead of html. Since it's possible that they actually wrote something like `$html .= '<object>...</object><nav>' . $this->parseWikitext( ... ) . '</nav>';` and now you have WikiText and html intermixed. I expect the extension isn't making good use of the new SFH_OBJ_ARGS functionality and how the new parser (well, new if we're back in the 1.12 days) is programmed so that extensions can (and ideally should be) coded so that expensive parser functions and other parts of wikitext are not expanded and run when they're not actually used.

Validator's parser hooks really need a method of differentiating between "I'm building raw html, no way around that, I might have things like <script><object>etc... which don't exist in WikiText because that would be insecure. But I have this chunk of WikiText I'd like to have expanded to mix in with the html." and "I'm building this with WikiText, please do whatever is necessary. Whether that is expanding wikitext and outputting it normally from a parser function, or running recursiveTagParse where raw html is expected." Though what Validator really needs is for one of the few of us that managed to get a good understanding of the parser and still stay (somewhat) insane... whoops... I mean sane... *cough* to be convinced to help out with the extension, probably rewriting half of the parser hook related code and it's api. As it stands it looks like Validator is selling the many flexible options of the parser quite short. And for minimal abstraction. dpl style (other extensions use it to) inline arguments lists (ie: <foo>a=b\nc=d</foo> style) seams like a good candidate that Validator misses out and makes a little tricker to implement in the best way they can (that expansion stuff again).

I haven't even gone into all the variations that can be put on stuff you hook into the parser (and that's without using wgHooks and doing any messy parsing, I'm just talking about MW's own preprocessing and parsing of syntax); Parser functions. Substing. Three different levels of <tag> hooking. The SFH_OBJ_ARGS improvements, ie: $frame->expand on args. The built in handling of named and numeric arguments to parser functions use SFH_OBJ_ARGS (aye, with SFH_OBJ_ARGS {{#foo:bar|a=b|c=d}} the foo parser function gets $args in a format where the = is already processed and it can get key => "a" value => "b" style information [going back to expansion again], and to top it off it's done in a way such that {{#foo:bar|{{{1}}}=asdf}} will not break if {{{1}}} just happens to contain a = (which it potentially would if you used old style parsing or expanded and looked for the = yourself)). Frame expansion also can selectively expand in a way that omits templates, or variables, or recovers comments, recovers source, etc... You can also implode arguments going back to wikitext you can output (like how the parser outputs syntax when running into some transclusion errors or supports nowiki: transclusions and some cases of subst:). You already know parser functions can output wikitext and optionally html, you can also set found=false, which I believe can have an effect similar to transclusion errors (ie:outputting the original source). There are some others too. Depending on what you're doing there's also link armoring and two strip states, etc...

-- ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Jeroen De Dauw

3:58 p.m.

Hey,

That makes it rather clear I'm not familiar enough with the parser to implement a wrapper class such as ParserHook. Most extension developers will likely not know a lot more then me about it, so I think there definitely is a need to provide some abstraction for the most common use cases, which is one of the reasons why I created this class in the first place. I'd be totally awesome if someone that does know the parser well enough could fix up the parsing related methods (or help me do that (which will probably take more time though)).

Either way, the ParserHook class is not a core component of Validator, and can simply be left out if it blocks inclusion into core.

Cheers

-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. --

Platonides

5:36 p.m.

Daniel Friesen wrote:

...

setHook (old style tag hooks), and setFunctionTagHook (new style function tag hooks).

setHook and setFunctionTagHook both set <tag> style hooks. Originally we just had setHook, it had one short argument list. Later on that argument list was changed to add $frame to it. the newer setFunctionTagHook uses a different argument order, which appears to be the same set of arguments, just the first two and last two seam to be flipped.

...

I'm not exactly sure of the difference, but setFunctionTagHook definitely exists because of preprocessor improvements and because applying the new parsing behaviors to setHook would result in different behavior for tags or code bugs... perhaps attr="{{{1}}}"'s are expanded in setFunctionTagHook's because they are parsed by the preprocessor.

setFunctionTagHook was added so that you could expand variables inside tags. Then setHook() got added $frame and the need for such hook type disappeared. I'd like to kill it, although it has been there for some time (if any list reader uses it, please stand up).

...

or what the setFunctionHook style $flags are for

It's unused.

...

The primary difference is that setHook <tag> hooks are parsed in an old style, iirc pre-1.12 style. setFunctionTagHook are parsed by the preprocessor that parses parser functions, the contents aren't preprocessed but it's still parsed by the preprocessor.

The preprocessor doesn't mind about it. The preprocessor makes one block with everything between <tags> whereas it enters into braces blocks. No matter which kind of function tag is done, the extension itself need to recursiveTagParse it.

Daniel Friesen

9:29 p.m.

On 11-01-31 09:36 AM, Platonides wrote:

...

Daniel Friesen wrote:

...
setHook (old style tag hooks), and setFunctionTagHook (new style function tag hooks).

setHook and setFunctionTagHook both set<tag> style hooks. Originally we just had setHook, it had one short argument list. Later on that argument list was changed to add $frame to it. the newer setFunctionTagHook uses a different argument order, which appears to be the same set of arguments, just the first two and last two seam to be flipped.

...
I'm not exactly sure of the difference, but setFunctionTagHook definitely exists because of preprocessor improvements and because applying the new parsing behaviors to setHook would result in different behavior for tags or code bugs... perhaps attr="{{{1}}}"'s are expanded in setFunctionTagHook's because they are parsed by the preprocessor.

setFunctionTagHook was added so that you could expand variables inside tags. Then setHook() got added $frame and the need for such hook type disappeared. I'd like to kill it, although it has been there for some time (if any list reader uses it, please stand up).

...
or what the setFunctionHook style $flags are for

It's unused.

...
The primary difference is that setHook<tag> hooks are parsed in an old style, iirc pre-1.12 style. setFunctionTagHook are parsed by the preprocessor that parses parser functions, the contents aren't preprocessed but it's still parsed by the preprocessor.

The preprocessor doesn't mind about it. The preprocessor makes one block with everything between<tags> whereas it enters into braces blocks. No matter which kind of function tag is done, the extension itself need to recursiveTagParse it.

I think there's a little more difference between setHook and setFunctionTagHook than you mention. At the very least, extensionSubstitution outputs a function tag hook directly, while putting a normal tag hook into the general strip state and outputting a marker.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

-- ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Platonides

11:36 p.m.

Daniel Friesen wrote:

...

I think there's a little more difference between setHook and setFunctionTagHook than you mention. At the very least, extensionSubstitution outputs a function tag hook directly, while putting a normal tag hook into the general strip state and outputting a marker.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Good point. I should check if there's any benefit of doing it there instead of a recursive parse.

5081

Age (days ago)

5101

Last active (days ago)

wikitech-l@lists.wikimedia.org

21 comments

6 participants

tags (0)

participants (6)

Brion Vibber
Daniel Friesen
Ilmari Karonen
Jeroen De Dauw
Platonides
Robert Leverington