Enabling some string functions


Aryeh Gregor

26 Jun 2009

1:03 a.m.

A while ago, StringFunctions got merged in with ParserFunctions. Tim disabled them by default before scapping, with the following comment:

/**
 * Enable string functions.
 *
 * Set this to true if you want your users to be able to implement their own
 * parsers in the ugliest, most inefficient programming language known to man:
 * MediaWiki wikitext with ParserFunctions.
 *
 * WARNING: enabling this may have an adverse impact on the sanity of your users.
 * An alternative, saner solution for embedding complex text processing in
 * MediaWiki templates can be found at: http://www.mediawiki.org/wiki/Extension:Lua
 */

I'm sure we all agree that wikitext is terrible syntax. But some of the string functions already are at least partially replicated (with horrifying inefficiency, and significant limitations in some cases) on enwiki anyway. Specifically:

* #len is implemented by [[Template:Str len]]. Running {{str len}} on a string of 250 a's gives preprocessor node count 152, post-expand include size 4597 bytes, template argument size 7430 bytes.
* #pos is implemented by [[Template:Str find]]. Trying to find b in a string of 250 a's gives preprocessor node count 1354, post-expand include size 5740 bytes, template argument size 50320 bytes.
* #substr is implemented by [[Template:Str sub]]. Using the same string of a's, with start 30 and length 20, gives preprocessor node count 1534, post-expand include size 13400 bytes, template argument size 44578 bytes.

Is there any good reason not to enable these three string functions, at least?
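
For readers unfamiliar with the three functions, here is a rough Python model of what each one computes on the test strings above. It is only a sketch: the function names are hypothetical stand-ins, zero-based indexing is assumed, and the extension's real edge-case behaviour (negative offsets, the value returned when nothing is found) is not modelled.

```python
from typing import Optional


def str_len(s: str) -> int:
    """Rough model of #len: number of characters in s."""
    return len(s)


def str_pos(haystack: str, needle: str) -> Optional[int]:
    """Rough model of #pos: index of the first match, or None if absent.

    Returning None for "not found" is an assumption of this sketch.
    """
    index = haystack.find(needle)
    return index if index >= 0 else None


def str_sub(s: str, start: int, length: int) -> str:
    """Rough model of #substr: `length` characters starting at `start`."""
    return s[start:start + length]


sample = "a" * 250                 # the 250-character test string
length_of_sample = str_len(sample)
position_of_b = str_pos(sample, "b")
slice_30_20 = str_sub(sample, 30, 20)
```

Each of these is a single built-in operation in a general-purpose language, which is the contrast with the node counts listed above.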


Tim Starling

26 Jun

3:33 a.m.

Aryeh Gregor wrote:

...

#len is implemented by [[Template:Str len]]. Running {{str len}} on a string of 250 a's gives preprocessor node count 152, post-expand include size 4597 bytes, template argument size 7430 bytes.

#pos is implemented by [[Template:Str find]]. Trying to find b in a string of 250 a's gives preprocessor node count 1354, post-expand include size 5740 bytes, template argument size 50320 bytes.

#substr is implemented by [[Template:Str sub]]. Using the same string of a's, with start 30 and length 20, gives preprocessor node count 1534, post-expand include size 13400 bytes, template argument size 44578 bytes.

Is there any good reason not to enable these three string functions, at least?

Those templates can be defeated by reducing the functionality of padleft/padright, and I think that would be a better course of action than enabling the string functions.

The set of string functions you describe are not the most innocuous ones, they're the ones I most want to keep out of Wikipedia, at least until we have a decent server-side scripting language in parallel.

-- Tim Starling

Brian

3:38 a.m.

Speaking of which, Lua looks like a wonderful solution. We just need a more native integration into MediaWiki that hides the fact that you're actually using Lua. It doesn't look like Extension:Lua even works right now (per the talk page, not my own testing). What do you think, Tim?


Robert Rohde

4:01 a.m.

On Thu, Jun 25, 2009 at 8:33 PM, Tim Starling tstarling@wikimedia.org wrote:

...

Aryeh Gregor wrote:

...

#len is implemented by [[Template:Str len]]. Running {{str len}} on a string of 250 a's gives preprocessor node count 152, post-expand include size 4597 bytes, template argument size 7430 bytes.

#pos is implemented by [[Template:Str find]]. Trying to find b in a string of 250 a's gives preprocessor node count 1354, post-expand include size 5740 bytes, template argument size 50320 bytes.

#substr is implemented by [[Template:Str sub]]. Using the same string of a's, with start 30 and length 20, gives preprocessor node count 1534, post-expand include size 13400 bytes, template argument size 44578 bytes.

Is there any good reason not to enable these three string functions, at least?

Those templates can be defeated by reducing the functionality of padleft/padright, and I think that would be a better course of action than enabling the string functions.

The set of string functions you describe are not the most innocuous ones, they're the ones I most want to keep out of Wikipedia, at least until we have a decent server-side scripting language in parallel.

Could you offer a bit more beyond "I don't like it"? A few devs, and you in particular, have expressed dismay over what string functions would do to wiki template code. However, most devs are rarely if ever involved with writing wiki templates.

By contrast, the community of people who do work on such templates have been asking for these functions for literally years and don't seem the least bit afraid that the marginal impact of adding a few more parser functions will bring the house down.

It is hard for me to figure why this case is so peculiar that the devs should block the wishes of the community. Nor do I see why the existence of basic string functionality should be dependent on someone overhauling or replacing the template coding scheme.

-Robert Rohde

Brian

4:07 a.m.

They want the functionality and they are willing to sacrifice usability and quality of implementation in order to get it, plain and simple. ParserFunctions combined with StringFunctions is flat-out unreadable. We should not facilitate the writing of unreadable code.

As an example, yesterday I wrote some code that basically says, "check the doi and http template parameters to make sure they begin with http, and if not, add it." Any reasonable sort of language lends itself to a reasonable implementation of that. But not Parser and String Functions.

#[[{{{1}}}]]. {{#if:{{{4}}}|[|{{#if:{{{5}}}|[}}}}{{#if:{{#pos:{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|http|}}|{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|{{#if:{{{4}}}| http://dx.doi.org/{{{4}}}|{{#if:{{{5}}}|http... {{#if:{{{2}}}| {{{2}}}}}{{#if:{{{4}}}|]|{{#if:{{{5}}}|]}}}} {{#ifexist: File:{{{1}}}.pdf |[{{filepath:{{{1}}}.pdf}} (PDF)]|}} {{#if:{{{3}}}| ''{{{3}}}.''}}

There is some extra stuff in there, but you get my point. Just because a few people really, really want extra functionality at any cost doesn't mean much.
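
For comparison, the prefix check described above might look roughly like this in a general-purpose language. This is a minimal sketch, not a translation of the template: the function names, the preference for the DOI parameter over the URL parameter, and the "http://" prefix for bare URLs are all assumptions, since part of the wikitext is truncated.

```python
def ensure_prefix(value: str, prefix: str) -> str:
    """Return value unchanged if it already starts with 'http', else prepend prefix."""
    value = value.strip()
    if not value:
        return ""
    return value if value.startswith("http") else prefix + value


def citation_link(doi: str = "", url: str = "") -> str:
    """Build the link target, preferring the DOI parameter (assumed order)."""
    if doi.strip():
        # Hypothetical rule: a bare DOI is resolved through dx.doi.org.
        return ensure_prefix(doi, "http://dx.doi.org/")
    # Hypothetical rule: a bare URL gets a plain http:// scheme.
    return ensure_prefix(url, "http://")
```

Called as citation_link(doi="10.1000/182"), the sketch yields a dx.doi.org address; a value that already begins with http passes through untouched.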


Robert Rohde

4:20 a.m.

On Thu, Jun 25, 2009 at 9:07 PM, Brian Brian.Mingus@colorado.edu wrote:

...

They want the functionality and they are willing to sacrifice usability and quality of implementation in order to get it, plain and simple. ParserFunctions combined with StringFunctions is flat-out unreadable. We should not facilitate the writing of unreadable code.

As an example, yesterday I wrote some code that basically says, "check the doi and http template parameters to make sure they begin with http, and if not, add it." Any reasonable sort of language lends itself to a reasonable implementation of that. But not Parser and String Functions.

#[[{{{1}}}]]. {{#if:{{{4}}}|[|{{#if:{{{5}}}|[}}}}{{#if:{{#pos:{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|http|}}|{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|{{#if:{{{4}}}| http://dx.doi.org/{{{4}}}|{{#if:{{{5}}}|http... {{#if:{{{2}}}| {{{2}}}}}{{#if:{{{4}}}|]|{{#if:{{{5}}}|]}}}} {{#ifexist: File:{{{1}}}.pdf |[{{filepath:{{{1}}}.pdf}} (PDF)]|}} {{#if:{{{3}}}| ''{{{3}}}.''}}

There is some extra stuff in there, but you get my point. Just because a few people really, really want extra functionality at any cost doesn't mean much.

Yes, template code can suck, and that's a fine example. But how is adding or not adding string functions going to make a significant difference to that? How is it different from {{#expr:}}, or from creating {{#if:}} to replace {{qif}}, etc.?

I don't see why the fact that template code is a mess should bear on the orthogonal question of providing string functionality to the community. I'm sure that if someone ever does create a better template coding system then many people will quickly migrate to it, but why should that need to come first?

-Robert Rohde

Gregory Maxwell

5:03 a.m.

On Fri, Jun 26, 2009 at 12:20 AM, Robert Rohde rarohde@gmail.com wrote:

...

...
#[[{{{1}}}]]. {{#if:{{{4}}}|[|{{#if:{{{5}}}|[}}}}{{#if:{{#pos:{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|http|}}|{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|{{#if:{{{4}}}| http://dx.doi.org/{{{4}}}|{{#if:{{{5}}}|http... {{#if:{{{2}}}| {{{2}}}}}{{#if:{{{4}}}|]|{{#if:{{{5}}}|]}}}} {{#ifexist: File:{{{1}}}.pdf |[{{filepath:{{{1}}}.pdf}} (PDF)]|}} {{#if:{{{3}}}| ''{{{3}}}.''}}

There is some extra stuff in there, but you get my point. Just because a few people really, really want extra functionality at any cost doesn't mean much.

Yes, template code can suck, and that's a fine example. But how is adding or not adding string functions going to make a significant difference to that? How is it different from {{#expr:}}, or from creating {{#if:}} to replace {{qif}}, etc.?

I don't see why the fact that template code is a mess should bear on the orthogonal question of providing string functionality to the community. I'm sure that if someone ever does create a better template coding system then many people will quickly migrate to it, but why should that need to come first?

Because "backwards compatibility is forever", for one thing. The relevant performance metric here is as much the worst-case performance as it is the average. That an expensive feature is rarely used doesn't help when you have to maintain it to sanely display old versions, and if one of those pages gets a lot of traffic, things blow up.

The performance problems today happened in spite of it being below the daily peak overall request rate: http://www.nedworks.org/~mark/reqstats/reqstats-daily.png ...just a thundering herd on a few expensive operations.

Enabling more features increases investment in the old system and will make migration slower and more difficult to accomplish.

Additional advanced features will also encourage additional 'expensive' usage.

The message here is that wikimarkup is not intended to be a programming language. If you find yourself asking for more programming language features the shortcoming is in your expectations not in the software. It's already gone too far … What we've got now is the moral equivalent of brainfuck but without the elegance.

People have been asking for these features for years. True. But that's also good evidence that they can live a while longer without them.

Dmitriy Sintsov

5:35 a.m.

* Gregory Maxwell gmaxwell@gmail.com [Fri, 26 Jun 2009 01:03:01 -0400]:

...

The message here is that wikimarkup is not intended to be a programming language. If you find yourself asking for more programming language features the shortcoming is in your expectations not in the software. It's already gone too far … What we've got now is the moral equivalent of brainfuck but without the elegance.

It's already a programming language, though a somewhat limited and not very readable one.

...

People have been asking for these features for years. True. But that's also good evidence that they can live a while longer without them.

A good built-in language would expand the usage of MediaWiki beyond Wikipedia (for example with the SMW/SF extensions), though it is probably of no use to the Wikimedia Foundation.

Dmitriy

Tim Starling

5:35 a.m.

Robert Rohde wrote:

...

How is it different from {{#expr:}}, or from creating {{#if:}} to replace {{qif}}, etc.?

I think introducing #if was a mistake. I should have taken this stand at the time.

Introducing string functions increases the domain of problems which can be solved by wikitext templates. That expanded domain includes some particularly complex problems which Wikipedians are incentivised to solve, such as natural language processing. Bug 6455 already has proposals from Wikipedians for merging multiple template parameters into a parsed configuration field.

In an earlier post:

...

By contrast, the community of people who do work on such templates have been asking for these functions for literally years and don't seem the least bit afraid that the marginal impact of adding a few more parser functions will bring the house down.

The community of people who work on such templates is an extremely small, self-selected subset of the community of editors. It is that tiny segment of the community that can code in this accidental programming language, who are not deterred by its density, inconsistency or performance limitations.

The issue with complex templates is that they deter contributions not only from the majority of editors, but even from the majority of technically-inclined editors, who know a programming language or two. So edits to this important subset of Wikipedia are left to a small elite.

While some template authors might attempt to make their templates accessible, the nature of Wikipedia is such that less-accessible contributions tend to accumulate.

Once we introduce these string functions, the accumulation of complex and inaccessible templates that use them will begin. Introducing a scripting language will not make those accumulated contributions disappear. The task of deciphering them, and converting them to a more accessible form, will remain.

-- Tim Starling

Robert Rohde

6:10 a.m.

On Thu, Jun 25, 2009 at 10:35 PM, Tim Starling tstarling@wikimedia.org wrote: <snip>

...

The community of people who work on such templates is an extremely small, self-selected subset of the community of editors. It is that tiny segment of the community that can code in this accidental programming language, who are not deterred by its density, inconsistency or performance limitations.

There is some truth to this. However, I believe the community of people who would like to see string functions is much, much larger than just the community of template coders. Most Wikipedians can use templates even if they don't feel comfortable creating them, and many of them have at one time or another encountered practical problems that could be solved with basic string functionality.

<snip>

...

Introducing a scripting language will not make those accumulated contributions disappear. The task of deciphering them, and converting them to a more accessible form, will remain.

Do you actually have a plan for introducing a scripting language?

Lua, which seems to be your favored strategy, was recently LATER-ed on bugzilla by Brion, and suffers from several serious problems. For example, the dependency on compiled binaries is highly undesirable. The relative power of a full programming language would require limiting its resources to avoid bad code consuming all memory or flooding MediaWiki with output, and that is only the starting point for considering the risks of malicious or overtaxing code. Not to mention that the comments at Extension talk:Lua suggest several people have failed in attempts to get the extension working at all.

Even if one gets past that, Lua brings its own grammar, set of function keywords, and methodologies, which will again create a high barrier to participation for people wanting to work with it.

Frankly Lua feels like it creates at least as many usability and portability problems as it solves, and is still a long ways off.

Werdna's suggestion to adapt the AbuseFilter parser into a home-grown MediaWiki scripting language feels a lot more natural in terms of control and ability to effect an integrated presentation, but that would also seem quite distant.

If one is going to say "no string functions until the template coding problem is solved", then I'd like to know if there is really a serious strategy for doing that.

-Robert Rohde

Stephen Bain

8:44 a.m.

On Fri, Jun 26, 2009 at 2:07 PM, Brian Brian.Mingus@colorado.edu wrote:

...

As an example, yesterday I wrote some code that basically says, "check the doi and http template parameters to make sure they begin with http, and if not, add it." Any reasonable sort of language lends itself to a reasonable implementation of that. But not Parser and String Functions.

#[[{{{1}}}]]. {{#if:{{{4}}}|[|{{#if:{{{5}}}|[}}}}{{#if:{{#pos:{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|http|}}|{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|{{#if:{{{4}}}| http://dx.doi.org/{{{4}}}|{{#if:{{{5}}}|http... {{#if:{{{2}}}| {{{2}}}}}{{#if:{{{4}}}|]|{{#if:{{{5}}}|]}}}} {{#ifexist: File:{{{1}}}.pdf |[{{filepath:{{{1}}}.pdf}} (PDF)]|}} {{#if:{{{3}}}| ''{{{3}}}.''}}

On Fri, Jun 26, 2009 at 3:35 PM, Tim Starling tstarling@wikimedia.org wrote:

...

While some template authors might attempt to make their templates accessible, the nature of Wikipedia is such that less-accessible contributions tend to accumulate.

In the good old days someone would have solved the same problem by mentioning in the template's documentation that the parameter should use full URLs. Both the template and instances of it would be readable.

Template programmers are not going to create accessible templates because they have a programming mindset, and set out to solve problems in ways like Brian's code above.

-- Stephen Bain stephen.bain@gmail.com

Roan Kattouw

10:33 a.m.

2009/6/26 Stephen Bain stephen.bain@gmail.com:

...

In the good old days someone would have solved the same problem by mentioning in the template's documentation that the parameter should use full URLs. Both the template and instances of it would be readable.

Template programmers are not going to create accessible templates because they have a programming mindset, and set out to solve problems in ways like Brian's code above.

Maybe it's the mindset that should be changed then? For one thing, {{link}} used to use {{substr}} to check if the first argument started with http:// , https:// or ftp:// and produced an internal link if not, despite the fact that the documentation for {{link}} clearly states that it creates an *external* link, which means people shouldn't be using it to create internal links. If people try to use a template for something it's not intended for, they should be told to use a different template; currently, it seems like the template is just extended with new functionality, leading to unnecessary {{#if: , {{#switch: and {{substr}} uses that serve only the users' laziness.

To get back to {{cite}}: the template itself contains no more than some logic to choose between {{Citation/core}} and {{Citation/patent}} based on the presence/absence of certain parameters, and {{Citation/core}} does the same thing to choose between books and periodicals. What's wrong with breaking up this template into, say, {{cite patent}}, {{cite book}} and {{cite periodical}}? Similarly, other multifunctional templates could be broken up as well.

The reason I believe breaking up templates improves performance is this: they're typically of the form {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}} . The preprocessor will see that this is a parser function call with three arguments, and expand all three of them before it runs the #if hook. This means both {{foo}} and {{bar}} get expanded, one of which in vain. Of course this is even worse for complex systems of nested #if/#ifeq statements and/or #switch statements, in which every possible 'code' path is evaluated before a decision is made. In practice, this means that for every call to {{cite}}, which seems to have three major modes, the preprocessor will spend about 2/3 of its time expanding stuff it's gonna throw away anyway.

To fix this, control flow parser functions such as #if could be put in a special class of parser functions that take their arguments unexpanded. They could then call the parser to expand their first argument and return a value based on that. Whether these functions are expected to return expanded or unexpanded wikitext doesn't really matter from a performance standpoint. (Disclaimer: I'm hardly a parser expert, Tim is; he should of course be the judge of the feasibility of this proposal.)

As an aside, lazy evaluation of #if statements would also improve performance for stuff like:

{{#if:{{{param1|}}}|Do something with param1 {{#if:{{{param2|}}}|Do something with param2 ... {{#if:{{{param9|}}}|Do something with param9}}}}}}}}}}}}}}}}}}
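
The difference between the two evaluation strategies described above can be sketched with a toy model. This is not MediaWiki code: expand, if_eager and if_lazy are hypothetical names, and the list simply records which templates were expanded.

```python
from typing import Callable, List

expansions: List[str] = []  # names of templates expanded so far


def expand(name: str) -> str:
    """Stand-in for expanding a template; records that the work was done."""
    expansions.append(name)
    return f"<{name}>"


def if_eager(test: str, then: str, otherwise: str) -> str:
    """Both branches were already expanded by the caller before this runs."""
    return then if test.strip() else otherwise


def if_lazy(test: str, then: Callable[[], str], otherwise: Callable[[], str]) -> str:
    """Branches arrive unexpanded; only the chosen one is expanded."""
    return then() if test.strip() else otherwise()


expansions.clear()
eager_result = if_eager("someparam", expand("foo"), expand("bar"))
eager_count = len(expansions)  # both foo and bar were expanded

expansions.clear()
lazy_result = if_lazy("someparam", lambda: expand("foo"), lambda: expand("bar"))
lazy_count = len(expansions)   # only foo was expanded
```

Both calls return the same text; the eager version performs one expansion whose result is thrown away. With deeper nesting the discarded work grows with every unused branch.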

Roan Kattouw (Catrope)

Nikola Smolenski

11:19 a.m.

Roan Kattouw wrote:

...

To get back to {{cite}}: the template itself contains no more than some logic to choose between {{Citation/core}} and {{Citation/patent}} based on the presence/absence of certain parameters, and {{Citation/core}} does the same thing to choose between books and periodicals. What's wrong with breaking up this template in, say, {{cite patent}}, {{cite book}} and {{cite periodical}}? Similarly, other multifunctional templates could be broken up as well.

While this is not a comment on the merits of string functions in general, the following things are wrong with that approach:

- It is easier for users to remember the name of just a single template.

- Multiple templates that are separately maintained will diverge over time; for example, the same parameters might end up being named differently.

- A new feature in one template can't be easily applied to another template.

Aryeh Gregor

2:16 p.m.

On Thu, Jun 25, 2009 at 11:33 PM, Tim Starling tstarling@wikimedia.org wrote:

...

Those templates can be defeated by reducing the functionality of padleft/padright, and I think that would be a better course of action than enabling the string functions.

The set of string functions you describe are not the most innocuous ones, they're the ones I most want to keep out of Wikipedia, at least until we have a decent server-side scripting language in parallel.

Well, then at least let's be consistent and cripple padleft/padright.

Also, while I disagree with Robert's skepticism about the comparative usability of a real scripting language, I'd be interested to hear what your ideas are for actually implementing that.

Come to think of it, the easiest scripting language to implement would be . . . PHP! Just run it through the built-in PHP parser, carefully sanitize the tokens so that it's safe (possibly banning things like function definitions), and eval()! We could even dump the scripts into lots of little files and use includes, so APC can cache them. That would probably be the easiest thing to do, if we need to keep pure PHP support for the sake of third parties. It's kind of horrible, of course . . .

How much of Wikipedia is your random shared-hosted site going to be able to mirror anyway, though? Couldn't we at least require working exec() to get infoboxes to work? People on shared hosting could use Special:ExpandTemplates to get a copy of the article with no dependencies, too (albeit with rather messy source code).

On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw roan.kattouw@gmail.com wrote:

...

The reason I believe breaking up templates improves performance is this: they're typically of the form {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}} . The preprocessor will see that this is a parser function call with three arguments, and expand all three of them before it runs the #if hook.

I thought this was fixed ages ago with the new preprocessor.

Roan Kattouw

2:27 p.m.

...

On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw roan.kattouw@gmail.com wrote:

...
The reason I believe breaking up templates improves performance is this: they're typically of the form {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}} . The preprocessor will see that this is a parser function call with three arguments, and expand all three of them before it runs the #if hook.

I thought this was fixed ages ago with the new preprocessor.

I asked Domas whether it was and he said no; Tim, can you chip in on this?

Roan Kattouw (Catrope)

Domas Mituzas

3:36 p.m.

...

...
I asked Domas whether it was and he said no; Tim, can you chip in on this?

where did I say no, and what was my 'no' about?

-- domas

Robert Rohde

3:52 p.m.

On Fri, Jun 26, 2009 at 7:16 AM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:

...

On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw roan.kattouw@gmail.com wrote:

...
The reason I believe breaking up templates improves performance is this: they're typically of the form {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}} . The preprocessor will see that this is a parser function call with three arguments, and expand all three of them before it runs the #if hook.

I thought this was fixed ages ago with the new preprocessor.

My understanding has been that the PREprocessor expands all branches, by looking up and substituting transcluded templates and similar things, but that the actual processor only evaluates the branches that it needs. That's a lot faster than actually evaluating all branches (which is how things originally worked), but not quite as effective as if the dead branches were ignored entirely.

(I could be totally wrong however.)

-Robert Rohde

Roan Kattouw

4:17 p.m.

2009/6/26 Robert Rohde rarohde@gmail.com:

...

My understanding has been that the PREprocessor expands all branches, by looking up and substituting transcluded templates and similar things, but that the actual processor only evaluates the branches that it needs. That's a lot faster than actually evaluating all branches (which is how things originally worked), but not quite as effective as if the dead branches were ignored entirely.

(I could be totally wrong however.)

You're right that dead code never reaches the parser (your "processor"), but ideally the preprocessor wouldn't bother expanding it either. I have a vague recollection that it was fixed with the new preprocessor, as Simetrical said, but I have no idea how much truth there is in that.

Roan Kattouw (Catrope)

Tim Starling

27 Jun

6:02 a.m.

Aryeh Gregor wrote:

...

On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw roan.kattouw@gmail.com wrote:

...
The reason I believe breaking up templates improves performance is this: they're typically of the form {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}} . The preprocessor will see that this is a parser function call with three arguments, and expand all three of them before it runs the #if hook.

I thought this was fixed ages ago with the new preprocessor.

Yes it was fixed in 1.12 (late 2007), as I have repeatedly told this list. The new "if" parser function is passed a placeholder object which can be expanded on demand.

-- Tim Starling

Brian

26 Jun

2:32 p.m.

On Fri, Jun 26, 2009 at 2:44 AM, Stephen Bain stephen.bain@gmail.com wrote:

...

In the good old days someone would have solved the same problem by mentioning in the template's documentation that the parameter should use full URLs. Both the template and instances of it would be readable.

Template programmers are not going to create accessible templates because they have a programming mindset, and set out to solve problems in ways like Brian's code above.

The good old days are long gone. If you believe there is never a valid case for basic programming constructs such as conditionals you should have objected when ParserFunctions were first implemented.

Andrew Garrett

3:49 p.m.

On 26/06/2009, at 3:32 PM, Brian wrote:

...

On Fri, Jun 26, 2009 at 2:44 AM, Stephen Bain stephen.bain@gmail.com wrote:

...
In the good old days someone would have solved the same problem by mentioning in the template's documentation that the parameter should use full URLs. Both the template and instances of it would be readable.

Template programmers are not going to create accessible templates because they have a programming mindset, and set out to solve problems in ways like Brian's code above.

The good old days are long gone. If you believe there is never a valid case for basic programming constructs such as conditionals you should have objected when ParserFunctions were first implemented.

The fact that we, at some stage, made the mistake of adding programming-like functions does not oblige us to complete the job.

If we could make ParserFunctions go away, we would. ParserFunctions is there now, and there's too much code dependent on it to remove it right now. That analysis does not apply to StringFunctions.

-- Andrew Garrett Contract Developer, Wikimedia Foundation agarrett@wikimedia.org http://werdn.us

Platonides

27 Jun

10:43 p.m.

Brian wrote:

...

They want the functionality and they are willing to satisfy usability and quality of implementation in order to get it, plain and simple. ParserFunctions combined with StringFunctions is flat out unreadable. We should not facilitate the writing of unreadable code.

As an example, yesterday I wrote some code that basically says, "check the doi and http template parameters and check to make sure they begin with http, and if not add it." In any reasonable sort of language that lends itself to a reasonable sort of implementation. But not with Parser and String Functions.

#[[{{{1}}}]]. {{#if:{{{4}}}|[|{{#if:{{{5}}}|[}}}}{{#if:{{#pos:{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|http|}}|{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|{{#if:{{{4}}}| http://dx.doi.org/{{{4}}}|{{#if:{{{5}}}|http://dx.doi.org/{{{5}}}}}}}}} {{#if:{{{2}}}| {{{2}}}}}{{#if:{{{4}}}|]|{{#if:{{{5}}}|]}}}} {{#ifexist: File:{{{1}}}.pdf |[{{filepath:{{{1}}}.pdf}} (PDF)]|}} {{#if:{{{3}}}| ''{{{3}}}.''}}

There is some extra stuff in there, but you get my point. Just because a few people really, really want extra functionality at any cost doesn't mean much.
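For comparison, here is a minimal Python sketch of the check described above: take whichever of two link parameters is present and prepend a resolver prefix when the value does not already start with "http". The function name and parameter names are hypothetical, and the sketch assumes parameter 4 takes priority over parameter 5, as the wikitext does. It is an illustration only, not code from MediaWiki or any extension. One difference worth noting: the wikitext tests the value with #pos, which appears to match "http" anywhere in the string, while this sketch follows the prose description and tests only the prefix.

```python
DOI_PREFIX = "http://dx.doi.org/"  # prefix taken from the wikitext example


def link_target(param4: str = "", param5: str = "") -> str:
    """Return the link target, or "" if neither parameter is given.

    param4 and param5 are hypothetical stand-ins for template
    parameters {{{4}}} and {{{5}}}.
    """
    # Parameter 4 wins when it is non-blank; otherwise use parameter 5.
    value = param4.strip() or param5.strip()
    if not value:
        return ""
    # Already a full URL: use it unchanged.
    if value.startswith("http"):
        return value
    # Otherwise treat it as a bare identifier and add the prefix.
    return DOI_PREFIX + value
```

Calling `link_target("10.1000/182")` prepends the prefix, while a value that already begins with "http" is returned as given.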

I have seen this before. People use #if for everything even when there is a better way. Look at what you're doing: {{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}

{{#if:{{{5}}}|{{{5}}}}} means "show parameter 5 if it is set", or "show parameter 5 if it is not blank". In either case, {{{5|}}} would do the job.

The parent #if is similar: parameter 4 if set, else parameter 5. {{{4| {{{5|}}} }}} would do the job.
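The two readings mentioned here, "set" versus "not blank", differ only when a parameter is passed with an empty value. The Python sketch below is a simplified model of that difference, not the parser's actual implementation: a template call is represented as a plain dictionary of the parameters that were passed, and both function names are made up for illustration.

```python
from typing import Mapping


def first_set(params: Mapping[str, str]) -> str:
    """Model of default-parameter fallback: 4 if passed, else 5, else ""."""
    # A default is used only when the parameter was not passed at all,
    # so a blank value for "4" still counts as set.
    if "4" in params:
        return params["4"]
    return params.get("5", "")


def first_nonblank(params: Mapping[str, str]) -> str:
    """Model of an #if-style test: 4 if non-blank, else 5, else ""."""
    # A blank (empty or whitespace-only) value counts as false.
    four = params.get("4", "").strip()
    if four:
        return four
    return params.get("5", "").strip()
```

With `{"4": "", "5": "x"}` the first function returns the empty value of parameter 4, while the second falls through to parameter 5. When parameter 4 is either omitted or non-blank, the two agree.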

Template default parameters were here long before ParserFunctions. But people prefer using ugly #ifs, making the syntax less readable (and pushing pages closer to the preprocessor limits).

Another common abuse is to do: {{#if: {{{Foo}}}| <tr><td>Foo: </td><td>{{{Foo}}} </td></tr> }}

I'd like to have a feature in the parser to mark a section to be skipped if the inner parameter is not set, without having to use #ifs everywhere.

Brian

10:49 p.m.

You seem confused. You seem to think that I care about the proper way to program using templates and parser functions. That's not true, they are an ugly hack and I recognize that. I have absolutely no desire to learn how to use something so hideously inefficient in an efficient manner.

On Sat, Jun 27, 2009 at 4:43 PM, Platonides Platonides@gmail.com wrote:

...

Brian wrote:

...
They want the functionality and they are willing to sacrifice usability and quality of implementation in order to get it, plain and simple. ParserFunctions combined with StringFunctions is flat out unreadable. We should not facilitate the writing of unreadable code.

As an example, yesterday I wrote some code that basically says, "check the doi and http template parameters and check to make sure they begin with http, and if not add it." In any reasonable sort of language that lends itself to a reasonable sort of implementation. But not with Parser and String Functions.

#[[{{{1}}}]]. {{#if:{{{4}}}|[|{{#if:{{{5}}}|[}}}}{{#if:{{#pos:{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|http|}}|{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}|{{#if:{{{4}}}| http://dx.doi.org/{{{4}}}|{{#if:{{{5}}}|http://dx.doi.org/{{{5}}}}}}}}} {{#if:{{{2}}}| {{{2}}}}}{{#if:{{{4}}}|]|{{#if:{{{5}}}|]}}}} {{#ifexist: File:{{{1}}}.pdf |[{{filepath:{{{1}}}.pdf}} (PDF)]|}} {{#if:{{{3}}}| ''{{{3}}}.''}}

There is some extra stuff in there, but you get my point. Just because a few people really, really want extra functionality at any cost doesn't mean much.

I have seen this before. People use #if for everything even when there is a better way. Look at what you're doing: {{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}

{{#if:{{{5}}}|{{{5}}}}} means "show parameter 5 if it is set", or "show parameter 5 if it is not blank". In either case, {{{5|}}} would do the job.

The parent #if is similar: parameter 4 if set, else parameter 5. {{{4| {{{5|}}} }}} would do the job.

Template default parameters were here long before ParserFunctions. But people prefer using ugly #ifs, making the syntax less readable (and pushing pages closer to the preprocessor limits).

Another common abuse is to do: {{#if: {{{Foo}}}| <tr><td>Foo: </td><td>{{{Foo}}} </td></tr> }}

I'd like to have a feature in the parser to mark a section to be skipped if the inner parameter is not set, without having to use #ifs everywhere.

Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Platonides

28 Jun

5:56 p.m.

Brian wrote:

...

You seem confused. You seem to think that I care about the proper way to program using templates and parser functions. That's not true, they are an ugly hack and I recognize that. I have absolutely no desire to learn how to use something so hideously inefficient in an efficient manner.

Then you shouldn't be presenting examples of how it can't be implemented reasonably in template programming.

Almost any _reasonable programming language_ allows you to write ugly code if you so want. That doesn't prove the language is ugly.

Nonetheless... it's ugly :)

Dmitriy Sintsov

29 Jun

10:14 a.m.

...

Brian wrote:

...
You seem confused. You seem to think that I care about the proper way to program using templates and parser functions. That's not true, they are an ugly hack and I recognize that. I have absolutely no desire to learn how to use something so hideously inefficient in an efficient manner.

Then you shouldn't be presenting examples of how it can't be implemented reasonably in template programming.

Almost any _reasonable programming language_ allows you to write ugly code if you so want. That doesn't prove the language is ugly.

Nonetheless... it's ugly :)

Ugly or not, having a kind of scripting inside the pages can be very useful. It extends the flexibility of sites built on top of MediaWiki, which is probably one of the reasons MediaWiki is becoming more popular as a website engine around the world. Maybe a better syntax and restriction to the template namespace would be a good thing, though. I personally liked the idea of a pre-parsed and checked limited subset of PHP operators for performance, though security may be an issue.

Dmitriy

wikitech-l@lists.wikimedia.org

24 comments

12 participants

Andrew Garrett
Aryeh Gregor
Brian
Dmitriy Sintsov
Domas Mituzas
Gregory Maxwell
Nikola Smolenski
Platonides
Roan Kattouw
Robert Rohde
Stephen Bain
Tim Starling