Template ideal behavior


Jim Wilson

3 Jul 2007

8:03 p.m.

Hi all,

I have some questions about the way the templating system would work in an ideal world (not necessarily how it works today). I'm attempting to write a template processor that mimics the ideal behavior of MW's templating engine.

I'm running into difficulties when considering edge cases. Take for example four nested brackets: {{{{a}}}}

I would have thought that this was equivalent to calling the template whose name is output after calling {{a}}. So if Template:A contains just "b", I'd expect {{{{a}}}} to be equivalent to {{b}} - however this is not the case, and the raw text "{{{{a}}}}" is returned.

In the #mediawiki IRC channel, someone helpfully suggested {{{{void}}{{a}}}}, which does what I expected {{{{a}}}} to do. However, this doesn't really help me from a requirements standpoint, as it doesn't answer whether {{{{a}}}}'s behavior is an intentional, conscious design choice or merely a missing feature.

Another thing I've noticed is that template parameters (three brackets) bind more tightly than template calls (two brackets). So five brackets like this {{{{{a}}}}} is similar to {{ {{{a}}} }}. However, six brackets like this {{{{{{a}}}}}} resolves to {{{ {{{a}}} }}} and the inner parameter is replaced leaving {{{a's value}}}. The outer triple is left as plain text.

Generally speaking, as long as there are breaks, it appears the parser can figure out the intent, which is why {{{{void}}{{a}}}} succeeds where {{{{a}}}} fails.

Any advice on how to handle these edge cases (nested brackets beyond 2 or 3) would be much appreciated.

-- Jim R. Wilson (jimbojw)


Minute Electron

3 Jul

8:18 p.m.


I feel that it would be beneficial to MediaWiki (and to this problem) to have a discussion and consultation about rewriting wikitext. Currently there are many ambiguities that a specification would solve. If a specification were created, it would also help experienced users work out why there was a problem with the wikitext they are using. By all means disregard this message (I am sure it has been suggested many times before), but seriously, wikitext is in a really bad state. Thanks, MinuteElectron.

Thomas Dalton

8:30 p.m.

But should {{{{a}}}} be parsed as {{ {{a}} }} or { {{ {a} }} } or { {{{a}}} } or {{{ {a} }}}? Depending on context, any of those could be intended. I think not parsing it at all and leaving it as text is the only option - at least that way it is obvious what has gone wrong.

Jim Wilson

9:19 p.m.

Minute Electron wrote:

...

I feel that it would be beneficial to MediaWiki (and this problem) to have a discussion and consultation about rewriting wikitext. Currently there are many ambiguities that a specification would solve.

Those are both interesting points. I'd like to divide that into two distinct thoughts, if I could:

1) Writing a spec on how wikitext works _today_
2) Writing a spec on how wikitext should work _in the future_

The former has been addressed many times on this list, and although I think it's a valuable endeavor, I hope to set it outside the scope of this discussion. Here, I hope merely to talk about the latter, limited only to template language processing.

Thomas Dalton wrote:

...

But should {{{{a}}}} be parsed as {{ {{a}} }} or { {{ {a} }} } or { {{{a}}} } or {{{ {a} }}}? Depending on context, any of those could be intended.

Those are fair questions - and maybe that's the answer. I tend to think that there should be a defined default case, with the others being achievable by other means.

Of the cases you mention, two seem very unlikely to me: { {{ {a} }} } since it relies on a template with curly braces in the name (not sure if this is allowed) and {{{ {a} }}} since it would require a parameter with a pair of curly braces in its name. I'd think this would be very unlikely - and currently isn't even possible.

The other option: { {{{a}}} } seems like it may have been what was intended in some cases. Especially if you define triples as being especially closely bound.
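One way to make "triples bind most tightly" concrete is to peel groups of three off the inside of a symmetric brace run and see what is left over. The sketch below only illustrates that reading. The names `group_braces` and `describe` are hypothetical, and treating a single leftover brace as literal text is an assumption, not documented MediaWiki behavior.

```python
def group_braces(depth):
    """Split a symmetric run of `depth` braces into groups, innermost first.

    3 = parameter, 2 = template call, 1 = literal brace (assumed reading).
    """
    groups = []
    remaining = depth
    while remaining >= 3:
        groups.append(3)  # triples are taken first, from the inside out
        remaining -= 3
    if remaining:
        groups.append(remaining)  # leftover pair or single brace
    return groups


def describe(depth, name="a"):
    """Render the grouping with spaces between groups, e.g. '{{ {{{a}}} }}'."""
    text = name
    for index, size in enumerate(group_braces(depth)):
        pad = "" if index == 0 else " "
        text = "{" * size + pad + text + pad + "}" * size
    return text


for depth in (4, 5, 6):
    print(depth, describe(depth))
```

Under this reading, four braces come out as a parameter wrapped in literal braces, five as a template call around a parameter, and six as a parameter around a parameter.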

One thing which I had been taking for granted was the idea that there _should_ be a way to interpret most encountered strings - and that the system should tend towards functionality when a reasonable interpretation is available.

The other way to approach the problem is to not make assumptions about what the text means and throw up our hands when ambiguity arises. It would seem this is the way the system works today. Is this the right way?

...

I think not parsing it at all and leaving it as text is the only option - at least that way it is obvious what has gone wrong.

Well, not as obvious as a nice fat parse error would be. :)

-- Jim


Thomas Dalton

10:22 p.m.

...

The other option: { {{{a}}} } seems like it may have been what was intended in some cases. Especially if you define triples as being especially closely bound.

I think that's the most likely intention. I can't think of any time when you would want to call a template with the name coming from another template. It doesn't seem very unreasonable to have a parameter being displayed in curly brackets.

...

Well, not as obvious as a nice fat parse error would be. :)

Ah, now that's a good idea: If you don't know what someone means by something, ask them.

Platonides

4 Jul

2:29 a.m.

Thomas Dalton wrote:

...

...
The other option: { {{{a}}} } seems like it may have been what was intended in some cases. Especially if you define triples as being especially closely bound.

I think that's the most likely intention. I can't think of any time when you would want to call a template with the name coming from another template. It doesn't seem very unreasonable to have a parameter being displayed in curly brackets.

I am sure to have seen templates displaying the result from a template.

...

...
Well, not as obvious as a nice fat parse error would be. :)

Ah, now that's a good idea: If you don't know what someone means by something, ask them.

Fat parse errors are ugly for Wikipedia readers (especially when you would be breaking existing code). Though I'd support a more verbose mode to highlight parser warnings.

Rob Church

2:41 a.m.

On 03/07/07, Platonides Platonides@gmail.com wrote:

...

Fat parse errors are ugly for Wikipedia readers (especially when you would be breaking existing code). Though I'd support a more verbose mode to highlight parser warnings.

However, his first post made it quite clear, I think, that this is a separate solution. No-one has proposed to break anything on Wikipedia.

Rob Church

Thomas Dalton

3:21 a.m.

...

I am sure to have seen templates displaying the result from a template.

Templates which transclude templates are another matter entirely, and not one I believe there are any major problems with. We're talking about articles which call a template, the name of which is the result of another template. I've never seen that.

Phil Boswell

1:47 p.m.

Thomas Dalton wrote:

...

...
I am sure to have seen templates displaying the result from a template.

Templates which transclude templates are another matter entirely, and not one I believe there are any major problems with. We're talking about articles which call a template, the name of which is the result of another template. I've never seen that.

Happens all the time in Infoboxes and the like, where you want to be able to parameterise almost all of the display: colours and so forth. Also the various little flag icons you see around the place are IIRC implemented in such a fashion.

Bear in mind that a template which transcludes another template isn't actually doing much unless it's in an article, so that's the context in which most of this discussion should be viewed.

HTH HAND

-- Phil

Mark Clements

6:36 p.m.

"Phil Boswell" phil.boswell@gmail.com wrote in message news:11426360.post@talk.nabble.com...

...

Thomas Dalton wrote:

...
...
I am sure to have seen templates displaying the result from a template.

Templates which transclude templates are another matter entirely, and not one I believe there are any major problems with. We're talking about articles which call a template, the name of which is the result of another template. I've never seen that.

Happens all the time in Infoboxes and the like, where you want to be able to parameterise almost all of the display: colours and so forth. Also the various little flag icons you see around the place are IIRC implemented in such a fashion.

Bear in mind that a template which transcludes another template isn't actually doing much unless it's in an article, so that's the context in which most of this discussion should be viewed.

Not true. A lot of template documentation is transcluded from e.g. {{/doc}}, and templates in user space and talk space are very common indeed, not to mention the various navigation/administrative templates used in the other namespaces...

- Mark Clements (HappyDog)

Thomas Dalton

8:46 p.m.

...

Happens all the time in Infoboxes and the like, where you want to be able to parameterise almost all of the display: colours and so forth. Also the various little flag icons you see around the place are IIRC implemented in such a fashion.

That's the parameter of the template, rather than the name, being the result of a template. It's a similar issue, but it removes the risk of four braces at the beginning. You could still have them at the end, but that's not ambiguous: {{a|{{b}}}} clearly means {{a| {{b}} }}, there is no alternative, since the last 2 braces have to be paired with the first 2 otherwise it's not valid code. {{{{a}}}} has lots of possible interpretations (2 of which could easily come up).

Daniel Cannon

5:11 a.m.

On 7/3/07, Thomas Dalton thomas.dalton@gmail.com wrote:

...

...
Well, not as obvious as a nice fat parse error would be. :)

Ah, now that's a good idea: If you don't know what someone means by something, ask them.

Perhaps, but it's surely too late for that now. Any reformation of the wikimarkup system must, at all costs, ensure backwards compatibility with the current wikimarkup. MediaWiki is a very widely used piece of software, and even a simple change like redefining {{{{x}}}} in any way--either having it twice-expand the template or having it resolve to a parsing error--could break a multitude of template systems across not just Wikipedia but hundreds of wikis, both on and off Wikimedia.

Any reformation should have as its goal the expansion of the wikimarkup's functionality whilst maintaining both its current functionality and syntax.

-- Daniel Cannon (AmiDaniel) http://amidaniel.com cannon.danielc@gmail.com

Rob Church

5:14 a.m.

On 04/07/07, Daniel Cannon cannon.danielc@gmail.com wrote:

...

Perhaps, but it's surely too late for that now. Any reformation of the wikimarkup system must, at all costs, ensure backwards compatibility with the current wikimarkup. MediaWiki is a very widely used piece of software, and even a simple change like redefining {{{{x}}}} in any way--either having it twice-expand the template or having it resolve to a parsing error--could break a multitude of template systems across not just Wikipedia but hundreds of wikis, both on and off Wikimedia.

Sooner or later, we are going to have to define the parser behaviour once and for all. When that happens, there will, no doubt, be some backwards-incompatible changes in edge cases. This is unavoidable.

Rob Church

Thomas Dalton

5:16 a.m.

...

Sooner or later, we are going to have to define the parser behaviour once and for all. When that happens, there will, no doubt, be some backwards-incompatible changes in edge cases. This is unavoidable.

Indeed. And the longer we wait, the more there will be.

Jim Wilson

9:25 a.m.

Thanks everyone for weighing in - specifically:

Thanks Brion for explaining how template braces are interpreted - that was very helpful.

Thanks Rob for reiterating that at this time I'm not proposing any changes to the way things work now - just trying to get a feel for how things may operate in a world free of BC worries.

It isn't my intent to create a new template-language specification, the wikitext template language is familiar to a lot of people and reasonably succinct. Its use of consecutive curly braces is convenient since these are not used (afaik) in other popular light-markup languages, so there's low risk of a clash.

I'll get started using the principle Brion explained - that the longest pair sticks, with tightest pairs after that. The "spec" that I'm going to use for my templating adventure is as follows:

* Parameter definition: {{{param|default}}}
* Template call definition: {{template|arg|named_arg=val}}
* Function call definition: {{#function:arg1|arg2|arg3}}
* Nesting rule: Longest, tightest pair wins

Regarding the call for a specification - maybe this task might be easier if it were broken into chunks? For example:

* Template and parameter syntax
* Bullets and numbering
* Tables
* Links (internal and external)

I shudder to continue because such a list can get very long very fast (I had to stop myself, for example, from making a level under Links for Images and Categories). But you get the idea. IMO, a partial spec in any of the above areas would be better than none at all (and in several of those cases, a spec may already exist)

One final observation: template calls which are built from the results of other calls are not themselves executed.

Consider:

* [[Template:left_braces]] = {{
* [[Template:right_braces]] = }}
* [[Some Page]] = {{left braces}}a{{right braces}}

The resulting content on [[Some Page]] will be {{a}}, not the result of calling Template:A. This effect is similar to what makes the {{!}} template work, which makes it possible to supply table definitions inside template arguments.

The decision not to reparse restricts what's possible with template inclusion, but the alternative would appear to be worse. Recursion becomes a nasty problem if you allow the rendered results to be re-rendered, so I'm going to follow MW's lead and not re-parse text following template inclusion.
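The no-re-parse rule can be sketched as a single substitution pass whose output is never scanned again. This is a toy illustration, not MediaWiki's implementation: the `TEMPLATES` table, `normalize`, and `expand_once` are hypothetical, template bodies are inserted verbatim, and folding names to lower case is a simplification.

```python
import re

# Hypothetical template store mirroring the example above.
TEMPLATES = {
    "left braces": "{{",
    "right braces": "}}",
    "a": "b",
}

# A bare {{name}} call with no arguments and no nested braces.
CALL = re.compile(r"\{\{([^{}|]+)\}\}")


def normalize(name):
    """Treat underscores as spaces and ignore case (a simplification)."""
    return name.strip().replace("_", " ").lower()


def expand_once(text, templates=TEMPLATES):
    """Replace each call with its template body in one left-to-right pass.

    re.sub never rescans text it has already substituted, so braces that
    only appear in the output are left alone.
    """
    def substitute(match):
        body = templates.get(normalize(match.group(1)))
        return match.group(0) if body is None else body

    return CALL.sub(substitute, text)


print(expand_once("{{left braces}}a{{right braces}}"))  # {{a}}
```

Because the pass never looks at its own output, the `{{a}}` it produces stays as text even though a template named "a" exists in the table.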

The reason I bring this up is that it's probably worth mentioning in the Template Spec - when such a document becomes available.

I realize that parts of this discussion may have moved outside the realm of applicability to the list, and I appreciate everyone's input. :)

-- Jim


Platonides

6:31 p.m.

Jim Wilson wrote:

...

The resulting content on [[Some Page]] will be {{a}}, not the result of calling Template:A. This effect is similar to what makes the {{!}} template work, which makes it possible to supply table definitions inside template arguments.

Maybe the {{!}} template should be put into core as {{|}} Just to avoid the need of an extra template to trick the parser. Languages without escape characters are always tricky at the best.

Simetrical

5 Jul

12:59 a.m.

On 7/4/07, Platonides Platonides@gmail.com wrote:

...

Jim Wilson wrote:

...
The resulting content on [[Some Page]] will be {{a}}, not the result of calling Template:A. This effect is similar to what makes the {{!}} template work, which makes it possible to supply table definitions inside template arguments.

Maybe the {{!}} template should be put into core as {{|}} Just to avoid the need of an extra template to trick the parser. Languages without escape characters are always tricky at the best.

Even better if we just had string delimiters.

Thomas Dalton

3:52 a.m.

...

Even better if we just had string delimiters.

For strings as parameters in template calls? That would lead to far more mistakes being made. The only symbols which cause any actual ambiguity are = and |; I'm sure we can think of a better solution than delimiting strings.

Jim Wilson

10:21 a.m.

Platonides wrote:

...

Maybe the {{!}} template should be put into core as {{|}} Just to avoid the need of an extra template to trick the parser. Languages without escape characters are always tricky at the best.

As long as we're talking about adding core features ...

Most programming languages use backslash as an escape character for delimiters. Maybe this would work for MW as well?

So with a template having a table as an argument, you may want to do this:

{{some template|
{|
| hi
| there
|}
}}

But due to pipe interpretation, you end up with this:

{{some template|
{{{!}}
{{!}} hi
{{!}} there
{{!}}}
}}

With backslash, it would be this (slightly easier on the eyes):

{{some template|
{\|
\| hi
\| there
\|}
}}
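The two encodings can be compared side by side with a pair of string rewrites. This is a minimal sketch. The function names are hypothetical, the backslash form is only the proposal being floated here, and literal backslashes already present in the input are not handled.

```python
def escape_with_bang_template(wikitext):
    """Current workaround: replace every pipe with a call to Template:!."""
    return wikitext.replace("|", "{{!}}")


def escape_with_backslash(wikitext):
    """Proposed alternative: prefix every pipe with a backslash.

    Existing backslashes in the input are left untouched (not handled).
    """
    return wikitext.replace("|", "\\|")


table = "{| | hi | there |}"
print(escape_with_bang_template(table))
print(escape_with_backslash(table))
```

The first line printed is the `{{!}}` form and the second is the backslash form of the same table.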

Just a thought. :)

-- Jim


Thomas Dalton

8:16 p.m.

...

As long as we're talking about adding core features ...

Most programming languages use backslash as an escape character for delimiters. Maybe this would work for MW as well?

If you want to use \ as an escape character you have to replace all literal uses of \ with \\, which destroys backwards compatibility. It might be worth it anyway (we're not going to be able to fix everything without some problems occurring somewhere), of course. It does mean that people are going to get very confused when they try to use backslashes without knowing they have to escape them.

GerardM

8:20 p.m.

Hoi, Again, even \\ is used for content so it is not safe to use it. Thanks, GerardM


Thomas Dalton

8:23 p.m.

...

Again, even \\ is used for content so it is not safe to use it.

A literal \\ would need to be replaced by \\\\, i.e. each of the \'s becomes \\

GerardM

8:52 p.m.

Hoi, When people write a text, they should be able to use the text as it is. It is not safe to expect people to use four slashes because of your technical requirements. It may be a technical solution sure, it is not really an acceptable solution. Thanks, GerardM


Platonides

9:17 p.m.

...

On 7/5/07, Thomas Dalton wrote:

...
...
Again, even \\ is used for content so it is not safe to use it.

a literal \\ would need to be replaced by \\\\, i.e. each of the \'s becomes \\

That's the problem with adding escape characters. Currently the only delimiters we have are | (and the [[ ]] and {{ }} pairs). That's why I only talked about {{|}}, which will never clash with a real template (as | is an invalid title).

However, Jim's proposal wouldn't need that if the \ is only treated as an escape character when followed by | (it would still break some current code in templates and parser functions, but much less).

Thomas Dalton

9:26 p.m.

...

However, Jim's proposal wouldn't need that if the \ is only treated as an escape character when followed by | (it would still break some current code in templates and parser functions, but much less).

How would you write it when you actually want the \, then? Any time you have an escape character, you need some way to escape the escape character when you want to use it literally. You can't just use \\|, since what happens if you want \\ literally? If \ is going to be an escape character anywhere, it needs to be one everywhere. It's the only way that works (and, as already said, it doesn't work very well).

Platonides

6 Jul

1:09 a.m.

Thomas Dalton wrote:

...

...
However, Jim's proposal wouldn't need that if the \ is only treated as an escape character when followed by | (it would still break some current code in templates and parser functions, but much less).

How would you write it when you actually want the \, then? Any time you have an escape character, you need some way to escape the escape character when you want to use it literally. You can't just use \\|, since what happens if you want \\ literally? If \ is going to be an escape character anywhere, it needs to be one everywhere. It's the only way that works (and, as already said, it doesn't work very well).

\\| and \\\\| Or \ | if you're dealing with parserfunctions.

My point is that it doesn't need to be an escape character everywhere, only when followed by a pipe (and I'm not too fond of the backslash being THE escape character).

Thomas Dalton

3:49 a.m.

...

...
How would you write it when you actually want the \, then? Any time you have an escape character, you need some way to escape the escape character when you want to use it literally. You can't just use \\|, since what happens if you want \\ literally? If \ is going to be an escape character anywhere, it needs to be one everywhere. It's the only way that works (and, as already said, it doesn't work very well).

\\| and \\\\| Or \ | if you're dealing with parserfunctions.

My point is that it doesn't need to be an escape character everywhere, only when followed by a pipe (and I'm not too fond of the backslash being THE escape character).

Ok, let me try and get straight what you are suggesting. In the following "->" means "parses to" and "*" means a divider (ie. what | parses to without being escaped).

| -> *
\| -> |
\\| -> \*
\\\\| -> \\*

Is that right? In which case, what parses to "\|"? The only thing I can think of is "\\\|". That boils down to:

an even number of slashes parses to half that number of slashes followed by a divider
an odd number of slashes parses to half that number (rounded down) of slashes followed by a pipe

I guess that might work, but it seems extremely confusing to me.

Simetrical

4:07 a.m.

On 7/5/07, Thomas Dalton thomas.dalton@gmail.com wrote:

...

Is that right? In which case, what parses to "\|"? The only thing I can think of is "\\\|". That boils down to:

an even number of slashes parses to half that number of slashes followed by a divider
an odd number of slashes parses to half that number (rounded down) of slashes followed by a pipe

I guess that might work, but it seems extremely confusing to me.

It's completely standard in programming languages:

>>> print('hello')
hello
>>> print('hello\'')
hello'
>>> print('hello\\')
hello\
>>> print('hello\\\'')
hello\'
>>> print('hello\\\\')
hello\\

Also in regex. You just have to think about it like this: a backslash escapes the character after it, and is itself not displayed. When you escape a delimiter, you get a literal |. When you escape a backslash, you get a literal \. So when looking at \\\|, you read it from left to right, character by character:

1) Backslash. Do not output this, instead escape the next character.
2) Backslash. This is escaped, so output it rather than giving it special meaning.
3) Backslash. Do not output this, instead escape the next character.
4) Vertical bar. This is escaped, so output it rather than giving it special meaning.

Whereas \\| gets parsed like:

1) Backslash. Do not output this, instead escape the next character.
2) Backslash. This is escaped, so output it rather than giving it special meaning.
3) Vertical bar. Do not output this, it's a delimiter.
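The left-to-right reading described above can be written as a small scanner that splits template arguments on unescaped pipes. This is a sketch of the conventional rule only. `split_args` is a hypothetical helper, not anything in MediaWiki, and emitting a trailing lone backslash literally is an assumption.

```python
def split_args(text, delimiter="|", escape="\\"):
    """Split on unescaped delimiters; an escape makes the next char literal."""
    parts = []
    current = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == escape and i + 1 < len(text):
            # Drop the escape itself and output the next character verbatim.
            current.append(text[i + 1])
            i += 2
        elif ch == delimiter:
            # Unescaped delimiter: close the current argument.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            # Ordinary character (or a trailing lone escape, kept as-is).
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


print(split_args(r"a\\\|b"))  # three backslashes: one argument
print(split_args(r"a\\|b"))   # two backslashes: two arguments
```

With three backslashes the pipe stays inside a single argument. With two, the argument ends in a literal backslash and the pipe splits.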

Thomas Dalton

4:41 a.m.

...

It's completely standard in programming languages:

It's completely standard for a backslash to escape whatever character comes after it. This suggestion is for a backslash to only escape delimiters or other backslashes only if they are followed by a delimiter. That's a significant amount of added confusion.
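For contrast, the restricted rule can be sketched so that a run of backslashes is special only when a pipe follows it directly, and is left alone everywhere else. This is an illustration under that assumption. `split_args_restricted` is a hypothetical name, and pairing backslashes from the left is one convention among several.

```python
import re

# A (possibly empty) run of backslashes immediately followed by a pipe.
RUN_BEFORE_PIPE = re.compile(r"(\\*)\|")


def split_args_restricted(text):
    """Split on pipes; backslashes matter only directly before a pipe."""
    parts = []
    current = []
    pos = 0
    for match in RUN_BEFORE_PIPE.finditer(text):
        current.append(text[pos:match.start()])  # untouched text, backslashes included
        slashes = len(match.group(1))
        current.append("\\" * (slashes // 2))    # each pair yields one backslash
        if slashes % 2:
            current.append("|")                  # odd run: the pipe is literal
        else:
            parts.append("".join(current))       # even run: the pipe is a divider
            current = []
        pos = match.end()
    current.append(text[pos:])
    parts.append("".join(current))
    return parts


print(split_args_restricted(r"C:\dir|x"))
print(split_args_restricted(r"a\\\|b"))
```

A backslash that is not part of a run ending in a pipe, as in the first call, passes through unchanged, which is what limits the breakage to existing text.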

Simetrical

4:55 a.m.

On 7/5/07, Thomas Dalton thomas.dalton@gmail.com wrote:

...

...
It's completely standard in programming languages:

It's completely standard for a backslash to escape whatever character comes after it. This suggestion is for a backslash to only escape delimiters or other backslashes only if they are followed by a delimiter. That's a significant amount of added confusion.

I was referring to what happens when multiple consecutive backslashes are followed by an escapable character, which you characterized as confusing.

Thomas Dalton

5:34 a.m.

...

I was referring to what happens when multiple consecutive backslashes are followed by an escapable character, which you characterized as confusing.

It is confusing. We're programmers, we're used to dealing with these kinds of obscure syntaxes. Typical MediaWiki users are not. (And I still maintain that a character which acts as an escape character in some contexts and not others is even more confusing than a standard escape character.)

Jim Wilson

5:38 a.m.

...

What parses to `| ? Now you have two escape characters, which just makes it more confusing, if you ask me.

In the unlikely event that you want `|, the two strings are (depending on whether | is a literal or a param delimiter):

\`| => ` followed by a param delimiter
\``| => `|

I don't think that's so bad :) Especially considering that it's an edge case.

I stand by my new (hypothetical) proposal.

-- Jim


Thomas Dalton

5:55 a.m.

...

In the unlikely event that you want `|, the two strings are (depending on whether | is a literal or a param delimiter):

\`| => ` followed by a param delimiter
\``| => `|

I don't think that's so bad :) Especially considering that it's an edge case.

I stand by my new (hypothetical) proposal.

So \` always parses to `, regardless of whether or not it is followed by a |? Ok. I still think it's too confusing for a regular user to understand.

Platonides

7 Jul

2:32 a.m.

This is getting more and more complicated. Someone trying to write an EBNF for wikicode will kill us for sure ;-) I just noticed we don't need to escape the escaper, only the pipe; we can separate the two with null code.

| -> *
`| -> |
`<nowiki></nowiki>| -> `*
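The three rules above fit in a short scanner. This is a sketch only: `split_args_backtick` is a hypothetical name, and dropping every empty `<nowiki></nowiki>` pair, rather than only one placed between a backtick and a pipe, is an assumption.

```python
NULL_CODE = "<nowiki></nowiki>"  # empty pair used purely as a separator


def split_args_backtick(text):
    """Split on pipes; a backtick directly before a pipe makes it literal."""
    parts = []
    current = []
    i = 0
    while i < len(text):
        if text.startswith("`|", i):
            current.append("|")        # escaped pipe, escape character dropped
            i += 2
        elif text.startswith(NULL_CODE, i):
            i += len(NULL_CODE)        # null code outputs nothing
        elif text[i] == "|":
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])    # includes a backtick not followed by a pipe
            i += 1
    parts.append("".join(current))
    return parts


print(split_args_backtick("a|b"))
print(split_args_backtick("a`|b"))
print(split_args_backtick("a`<nowiki></nowiki>|b"))
```

No rule for escaping the backtick itself is needed, because the null code breaks up the only sequence in which the backtick is special.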

Jim Wilson

10 Jul

1:25 a.m.

...

I just noticed we don't need to escape the escaper, only the pipe, we can separate both with null code.

Good catch Platonides! The `| case looks long-ish, but I like that everything else is simpler.

-- Jim


Thomas Dalton

1:32 a.m.

...

I just noticed we don't need to escape the escaper, only the pipe, we can separate both with null code.

Does ASCII (or unicode, I guess) have a null character? If so, it would be perfect.

Simetrical

1:48 a.m.

On 7/9/07, Thomas Dalton thomas.dalton@gmail.com wrote:

...

...
I just noticed we don't need to escape the escaper, only the pipe, we can separate both with null code.

Does ASCII (or unicode, I guess) have a null character? If so, it would be perfect.

Uh, yes, U+0000, but it's a control character and we kind of don't want it in page text. In fact I'm fairly sure we explicitly strip control characters (other than the obvious linebreaks, tabs, etc.). It tends to be invisible (GNOME at least has a very nice display for unprintable characters, but Windows doesn't) and it terminates C strings, so it's usually not allowed in various things because a) it's confusing and b) it can break things.

GerardM

2:02 a.m.

Hoi, The pipe is used in words that are considered to be English, like Hai||om (a language in Namibia). When a special meaning is given to characters, there are bound to be situations where things break. The '' is, as you know, another pair of characters that breaks standard text for the Neapolitan language. Thanks, GerardM

On 7/9/07, Simetrical Simetrical+wikilist@gmail.com wrote:

...

On 7/9/07, Thomas Dalton thomas.dalton@gmail.com wrote:

...
...
I just noticed we don't need to escape the escaper, only the pipe, we can separate both with null code.

Does ASCII (or unicode, I guess) have a null character? If so, it would be perfect.

Uh, yes, U+0, but it's a control character and we kind of don't want it in page text. In fact I'm fairly sure we explicitly strip control characters (other than the obvious linebreaks, tabs, etc.). It tends to be invisible (GNOME at least has a very nice display for unprintable characters, but Windows doesn't) and it terminates C strings, so it's usually not allowed in various things because a) it's confusing and b) it can break things.


Simetrical

6 Jul 6 Jul

8:43 a.m.

On 7/5/07, Thomas Dalton thomas.dalton@gmail.com wrote:

...

...
I was referring to what happens when multiple consecutive backslashes are followed by an escapable character, which you characterized as confusing.

It is confusing. We're programmers, we're used to dealing with these kinds of obscure syntaxes. Typical MediaWiki users are not.

Oh, I agree with that. But then, if a user is delving deep enough into templates to have any use for escape characters, they're ipso facto more or less programmers already (at least scripters).

Thomas Dalton

5:05 p.m.

...

Oh, I agree with that. But then, if a user is delving deep enough into templates to have any use for escape characters, they're ipso facto more or less programmers already (at least scripters).

I don't know about that. It's not that unreasonable to expect a non-expert to want to pass a table as a parameter to a template. Consider an infobox in an article where multiple things should go under a particular heading: the user decides to put them in a table and does so in the obvious way. Everything breaks. That's a bad thing. (Of course, that's how it is now, but implementing escape characters isn't really an improvement, since most users won't know how to use them anyway.)

Thomas Dalton

5 Jul 5 Jul

9:21 p.m.

...

When people write a text, they should be able to use the text as it is. It is not safe to expect people to use four slashes because of your technical requirements. It may be a technical solution sure, it is not really an acceptable solution.

That's pretty much what I said.

Tim Starling

9:34 p.m.

Jim Wilson wrote:

...

Platonides schrieb:

...
Maybe the {{!}} template should be put into core as {{|}}, just to avoid the need of an extra template to trick the parser. Languages without escape characters are always tricky at best.

As long as we're talking about adding core features ...

Most programming languages use backslash as a delimiter delimiter. Maybe this would work for MW as well?

The WikiCreole proposal is to use ~ (tilde) as an escape character:

http://www.wikicreole.org/wiki/Creole1.0#section-Creole1.0-EscapeCharacter

Unfortunately this conflicts with our signature shortcuts.

-- Tim Starling

Platonides

6 Jul 6 Jul

1:11 a.m.

...

...
Most programming languages use backslash as a delimiter delimiter. Maybe this would work for MW as well?

The WikiCreole proposal is to use ~ (tilde) as an escape character:

http://www.wikicreole.org/wiki/Creole1.0#section-Creole1.0-EscapeCharacter

Unfortunately this conflicts with our signature shortcuts.

-- Tim Starling

Also a bad choice for escaping, given URLs, where it's quite common. What about ¬ ?

Bryan Tong Minh

2:27 a.m.

On 7/5/07, Platonides Platonides@gmail.com wrote:

...

Also a bad choice for escaping, given URLs, where it's quite common. What about ¬ ?

How do I type that?

Bryan

Simetrical

2:35 a.m.

On 7/5/07, Bryan Tong Minh bryan.tongminh@gmail.com wrote:

...

On 7/5/07, Platonides Platonides@gmail.com wrote:

...
Also a bad choice for escaping, given URLs, where it's quite common. What about ¬ ?

How do I type that?

Depends on your keyboard setup and possibly operating system. In Windows it's typically Alt-0172, numbers from the Num Pad with Num Lock on (or maybe it's off, I forget). In Gnome you do Ctrl-Shift-uAC. There are probably, however, keyboard setups for which it's possible to get it by just hitting a key normally.

Platonides

3:31 a.m.

Bryan Tong Minh wrote:

...

On 7/5/07, Platonides wrote:

...
Also a bad choice for escaping, given URLs, where it's quite common. What about ¬ ?

How do I type that?

Seeing your answers, I guess it's not on en-us keyboards. Then disregard. It's near 6 on mine, just as simple to type as ~

Thomas Dalton

3:43 a.m.

...

...
...
Also a bad choice for escaping, given URLs, where it's quite common. What about ¬ ?

How do I type that?

Seeing your answers, I guess it's not on en-us keyboards. Then disregard. It's near 6 on mine, just as simple to type as ~

On my keyboard ¬ is Shift-`, it's the key next to the 1. It has `, ¬ and (with Alt Gr) ¦ on it.

Simetrical

2:27 a.m.

On 7/5/07, Platonides Platonides@gmail.com wrote:

...

What about ¬ ?

Not part of ASCII, therefore difficult to type for most. Not really acceptable.

Michael Daly

4:14 a.m.

New subject: [SPAM] Re: Template ideal behavior

Platonides wrote:

...

What about ¬ ?

I haven't seen one of those since the days of the 029 keypunch!

Mike

Jim Wilson

4:52 a.m.

New subject: [SPAM] Re: Template ideal behavior

Tim has a good point - there needs to be a way to put in a literal delimiter delimiter (\). And, it should be as BC as possible.

So I'd like to revise my suggestion (which is only a hypothetical musing anyway)

What about using backtick (`) with a literal backtick being backslash backtick (\`)?

So:

|   => param delimiter (as it is today)
`|  => literal pipe
`a  => `a (when anything other than a pipe follows a backtick, keep the backtick and the char)
``  => `` (same as above, effectively)
\`| => ` followed by param delimiter
\a  => \a (backslash only has special meaning when preceding a backtick)
\\  => \\ (same as above)

The advantage of this proposal is that it will only break BC for cases where a backtick immediately precedes a pipe (`|) or where a backslash immediately precedes a backtick (\`). I suspect these two constructions are rare - though I have no statistical evidence to support this claim. :)

-- Jim
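The proposed rules can be sketched as a small left-to-right scanner. This is only an illustration in Python, not anything that exists in MediaWiki: the name split_params is hypothetical, and the sketch assumes each rule is tried at the current position, one step at a time. The proposal does not pin down overlapping sequences such as ``| (here the second backtick is taken to escape the pipe).

```python
def split_params(text: str) -> list[str]:
    """Split template arguments on unescaped pipes (hypothetical helper)."""
    params: list[str] = []
    current: list[str] = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "\\`":
            # Backslash-backtick: a literal backtick that does NOT escape what follows.
            current.append("`")
            i += 2
        elif pair == "`|":
            # Backtick-pipe: a literal pipe.
            current.append("|")
            i += 2
        elif text[i] == "|":
            # Bare pipe: parameter delimiter, as today.
            params.append("".join(current))
            current = []
            i += 1
        else:
            # Everything else, including a lone backtick or backslash, is kept as-is.
            current.append(text[i])
            i += 1
    params.append("".join(current))
    return params
```

Under these assumptions, "a`|b" stays a single parameter containing a literal pipe, while a backslash, backtick, pipe sequence between "a" and "b" yields two parameters, the first ending in a literal backtick.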

On 7/5/07, Michael Daly michaeldaly@kayakwiki.org wrote:

...

Platonides wrote:

...
What about ¬ ?

I haven't seen one of those since the days of the 029 keypunch!

Mike


Thomas Dalton

5:32 a.m.

New subject: [SPAM] Re: Template ideal behavior

...

|   => param delimiter (as it is today)
`|  => literal pipe
`a  => `a (when anything other than a pipe follows a backtick, keep the backtick and the char)
``  => `` (same as above, effectively)
\`| => ` followed by param delimiter
\a  => \a (backslash only has special meaning when preceding a backtick)
\\  => \\ (same as above)

What parses to '| ? Now you have two escape characters, which just makes it more confusing, if you ask me.

Mark Clements

6:12 a.m.

New subject: [SPAM] Re: Template ideal behavior

"Thomas Dalton" thomas.dalton@gmail.com wrote in message news:a4359dff0707051702t21a0b072w519cd99c5aafefda@mail.gmail.com...

...

...
|   => param delimiter (as it is today)
`|  => literal pipe
`a  => `a (when anything other than a pipe follows a backtick, keep the backtick and the char)
``  => `` (same as above, effectively)
\`| => ` followed by param delimiter
\a  => \a (backslash only has special meaning when preceding a backtick)
\\  => \\ (same as above)

What parses to '| ? Now you have two escape characters, which just makes it more confusing, if you ask me.

\''|

...I think!

* Backslash not followed by tick: backslash
* Backslash followed by tick: tick
* Tick followed by pipe: literal pipe symbol

- Mark Clements (HappyDog)

Mark Clements

4 Jul 4 Jul

6:49 p.m.

"Jim Wilson" wilson.jim.r@gmail.com wrote in message news:ac08e8d0707032055p558d7747rfcf1afdb1937792e@mail.gmail.com... [SNIP]

...

I shudder to continue because such a list can get very long very fast (I had to stop myself, for example, from making a level under Links for Images and Categories). But you get the idea. IMO, a partial spec in any of the above areas would be better than none at all (and in several of those cases, a spec may already exist)

I remember seeing a proposal for modifying the link/inclusion syntax so that [[ ]] was always a link to a page and {{ }} was always an inclusion.

Therefore to link to an image description page you would use [[Image:foo]], and to include an image on a page you would use {{Image:foo}}. It also dropped the single/double square-bracket distinction between internal and external links (all links used same method), and made a few other suggestions for improvements.

I can't remember where the page is, and couldn't find it on Google, so if anyone else remembers please chime in. I remember the proposal being, on the whole, very good except for that one little sticking point: backward compatibility ;-).

However, if you're not worried about backward compatibility, then I suggest you check this out as a first port of call.

- Mark Clements (HappyDog)

Platonides

5 Jul 5 Jul

5:23 p.m.

It was on this list, in a subthread [1] of "Thoughts on Image:, Media: and Download:" [2] However, now it's too late.

1-http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/31235
2-http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/31229

Mark Clements

8:07 p.m.

"Platonides" Platonides@gmail.com wrote in message news:f6im2o$e7g$1@sea.gmane.org...

...

It was on this list, in a subthread [1] of "Thoughts on Image:, Media: and Download:" [2] However, now it's too late.

1-http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/31235
2-http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/31229

Actually, Christian S. Neubauer found the link and e-mailed it to me.

Here it is: http://meta.wikimedia.org/wiki/Wikimark2. It's not all relevant, and some parts I disagree with, but the idea of standardising the meaning of the link/inclusion syntax is good.

- Mark Clements (HappyDog)

Brion Vibber

3 Jul 3 Jul

10:15 p.m.


Jim Wilson wrote:

...

Any advice on how to handle these edge cases (nested brackets beyond 2 or 3) would be much appreciated.

Ewwwww :)

Generally these sort of things get handled by what closes first, with longest match first.

So for quadruple {{{{a}}}}

we have:
- starting sequence string: {{{{
- text: a
- found possible ending sequence: }}}
- found starting match: {{{, leaving { prefix
- found leftover }

Thus splitting up into:
- text: {
- template param start sequence: {{{
  - text: a
  - template param end sequence: }}}
- text: }

I dunno if that always gives the intuitively "right" answer, though.
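The "closes first, longest match first" rule can be sketched as a small stack-based scanner. This is a simplified Python illustration, not MediaWiki's actual parser: the names parse_braces and _append and the node labels "param" and "template" are hypothetical, and pipes, links and everything other than brace runs are ignored.

```python
def _append(nodes, item):
    """Append item to nodes, merging adjacent plain-text strings."""
    if isinstance(item, str):
        if not item:
            return
        if nodes and isinstance(nodes[-1], str):
            nodes[-1] += item
            return
    nodes.append(item)


def parse_braces(text):
    """Return a list of nodes: plain strings or (kind, children) tuples."""
    root = []
    stack = []  # each frame: [number of unmatched "{", child nodes]

    def current():
        return stack[-1][1] if stack else root

    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch not in "{}":
            _append(current(), ch)
            i += 1
            continue
        j = i
        while j < n and text[j] == ch:
            j += 1
        run = j - i  # length of this run of identical braces
        if ch == "{":
            if run >= 2:
                stack.append([run, []])
            else:
                _append(current(), "{")
            i = j
        elif run < 2 or not stack:
            _append(current(), "}" * run)  # nothing to close: plain text
            i = j
        else:
            opened, children = stack.pop()
            # Longest match first: prefer a triple when both sides allow it.
            size = 3 if min(opened, run) >= 3 else 2
            node = ("param" if size == 3 else "template", children)
            leftover = opened - size
            if leftover >= 2:
                stack.append([leftover, [node]])  # remaining "{" may still match
            else:
                _append(current(), "{" * leftover)  # stray "{" becomes text
                current().append(node)
            i += size  # only the matched "}" are consumed; the rest is rescanned
    while stack:  # unclosed runs fall back to plain text
        opened, children = stack.pop()
        _append(current(), "{" * opened)
        for child in children:
            _append(current(), child)
    return root
```

For {{{{a}}}} this sketch yields the text "{", a parameter node containing "a", and the text "}", matching the split above. Five braces come out as a template wrapping a parameter, and six as a parameter wrapping a parameter, in line with the behavior described at the start of the thread.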

-- brion vibber (brion @ wikimedia.org)


wikitech-l@lists.wikimedia.org

55 comments

14 participants

participants (14)

Brion Vibber
Bryan Tong Minh
Daniel Cannon
GerardM
Jim Wilson
Mark Clements
Michael Daly
Minute Electron
Phil Boswell
Platonides
Rob Church
Simetrical
Thomas Dalton
Tim Starling