Table rows:
{|
|-
|You parse and parse and parse and read and read and you have no idea whether this is a table cell or a style property for the cell, until you hit either a | or a ||. Oops, it was just a style property, better go back and parse it again.
|}
That's kind of evil. For big table rows, that could get very expensive to parse.
Steve
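To make the ambiguity concrete, here is a minimal Python sketch. It is not MediaWiki's code: `split_cell` and `parse_row_line` are hypothetical names, and the sketch assumes that "||" separates cells on a line and that the first single "|" inside a cell separates attributes from content. The point is that the text before the first "|" cannot be classified until the whole cell has been scanned.

```python
from __future__ import annotations


def split_cell(cell: str) -> tuple[str, str]:
    """Split one cell's text into (attributes, content).

    Simplifying assumption: the first single "|" separates attributes
    from content; "||" has already been used to split the row into cells.
    """
    attrs, sep, content = cell.partition("|")
    if not sep:
        # No "|" anywhere: everything scanned so far was content after all.
        return "", cell
    return attrs.strip(), content


def parse_row_line(line: str) -> list[tuple[str, str]]:
    """Parse a line such as '|style="x"|a||b' into (attributes, content) pairs."""
    body = line[1:]  # drop the leading "|" that opens the first cell
    return [split_cell(cell) for cell in body.split("||")]
```

A tokenizer that commits early has to backtrack in exactly the case Steve describes; splitting first and classifying afterwards avoids the re-parse at the cost of scanning each cell once.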
Have you also noticed the evil trim() in doTableStuff()? It comes right before it starts to parse the line, which means you have to look ahead whenever a line starts with whitespace or NUL.
Jared
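A rough Python illustration of why the trim() matters, under stated assumptions: `opens_table` is a hypothetical helper, and the character set is the one PHP's trim() strips by default (space, tab, newline, carriage return, NUL, vertical tab). Because the line is trimmed before its first characters are inspected, a table can start after any amount of leading whitespace or NULs.

```python
# Characters stripped by PHP's trim() when called without a second argument.
PHP_TRIM_CHARS = " \t\n\r\0\x0b"


def opens_table(line: str) -> bool:
    """True if the line opens a table once it has been trimmed PHP-style."""
    return line.strip(PHP_TRIM_CHARS).startswith("{|")
```

A parser that wants to match this behaviour without trimming has to skip an arbitrary run of these characters before it can decide what kind of line it is looking at.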
-----Original Message----- From: wikitext-l-bounces@lists.wikimedia.org [mailto:wikitext-l-bounces@lists.wikimedia.org] On Behalf Of Steve Bennett Sent: 04 February 2008 12:11 To: Wikitext-l Subject: [Wikitext-l] So, the hardest wikitext construct to parse?
Wikitext-l mailing list Wikitext-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/wikitext-l
On 2/4/08, Jared Williams jared.williams1@ntlworld.com wrote:
Have you also noticed the evil trim() in doTableStuff() ? Right before it starts to parse the line. Means have to lookahead if a line starts with whitespace or NUL.
Ouch, that's almost obnoxious. I just tried it, and sure enough, this kind of thing works:
  {|
  |-
  | fooooo
  |}
This looks like an example where being too permissive is actually harmful. There's no real benefit in being able to left-indent the table and no one does it.
Another interesting aspect of table parsing that I've noticed is that malformed tables often disappear, rather than being rendered literally. I think we decided that a replacement parser doesn't have to mimic the current one on malformed input but there are still issues to consider...
Steve
On Feb 4, 2008 1:02 PM, Steve Bennett stevagewp@gmail.com wrote:
Ouch, that's almost obnoxious. I just tried it, and sure enough, this kind of thing works:
  {|
  |-
  | fooooo
  |}
This looks like an example where being too permissive is actually harmful. There's no real benefit in being able to left-indent the table and no one does it.
At least that peculiar bit wasn't my fault, IIRC ;-) I was quite surprised when I saw indented cells one day on Wikipedia. Apparently, it is used like this:
{|
 |
because the "|" lines up nicely this way in edit mode...
Magnus
Another little feature: the table row token is the regexp |-+
So
{|
|-------------------------
| foo
|}
and
{|
|-
| food
|}
are equivalent.
Jared
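The equivalence can be checked with a few lines of Python. This is only a sketch: `ROW_TOKEN`, `is_row_start`, and `normalise` are hypothetical names, and the pipe is written as `\|` because a bare `|` means alternation in regex syntax.

```python
import re

# A literal "|" followed by one or more "-", anchored at the start of the line.
ROW_TOKEN = re.compile(r"^\|-+")


def is_row_start(line: str) -> bool:
    """True if the line begins with a table row token."""
    return ROW_TOKEN.match(line) is not None


def normalise(table: str) -> list[str]:
    """Rewrite every row token to the canonical "|-", leaving other lines alone."""
    return [ROW_TOKEN.sub("|-", line) for line in table.splitlines()]
```

With the same cell text, a table written with a long run of hyphens normalises to the same lines as one written with a single hyphen.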
On 06/02/2008, Jared Williams jared.williams1@ntlworld.com wrote:
Another little feature, is the table row token is the regexp |-+
And don't forget row styling:
|-style="background:orange"
(which I didn't know existed until I saw someone using it on our intranet today)
- d.
On 2/7/08, David Gerard dgerard@gmail.com wrote:
And don't forget row styling:
|-style="background:orange"
(which I didn't know existed until I saw someone using it on our intranet today)
Yeah, that syntax is actually fine - there's no ambiguity at all. What I think happens is the parser gets to |- and parses the rest of the line as an XHTML row attribute. If it's junk (eg, ----------------------), then it's just stripped out.
There are lots of rather useless possibilities which are permitted, however:
{|
|-style="background: blue"
|-row-span=3
|foo...
|}
The first "style" row is totally ignored.
Steve
Yeah, that syntax is actually fine - there's no ambiguity at all. What I think happens is the parser gets to |- and parses the rest of the line as an XHTML row attribute. If it's junk (eg, ----------------------), then it's just stripped out.
There is a preg_replace that removes the token from the beginning of the line
$line = preg_replace( '#^\|-+#', '', $line );
Jared
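A Python analogue of that step, as a sketch rather than a port: `row_attributes` and `row_attributes_unescaped` are hypothetical names, and `re.sub` stands in for preg_replace. The leading pipe has to be escaped as `\|`; the second function shows what an unescaped pipe would do, since `^|-+` reads as "start of line, or a run of hyphens".

```python
import re


def row_attributes(line: str) -> str:
    """Strip the leading row token and return whatever follows it."""
    return re.sub(r"^\|-+", "", line).strip()


def row_attributes_unescaped(line: str) -> str:
    """Deliberately wrong variant: the bare "|" is alternation, not a literal pipe."""
    return re.sub(r"^|-+", "", line)
```

With the escaped pattern, a row of pure hyphens leaves an empty attribute string, which fits the observation that the junk simply disappears. The unescaped variant never removes the pipe and eats hyphens anywhere in the line, including those inside attribute values.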
On 07/02/2008, Steve Bennett stevagewp@gmail.com wrote:
There are lots of rather useless possibilities which are permitted, however:
{|
|-style="background: blue"
|-row-span=3
|foo...
|}
The first "style" row is totally ignored.
Rilly? That's arguably a bug in the present parser and should be filed as one.
- d.
On Thu, Feb 07, 2008 at 10:09:47AM +0000, David Gerard wrote:
Rilly? That's arguably a bug in the present parser and should be filed as one.
It's not a bug. The style applies to a row which is never rendered because it has no data in it. IMO.
Cheers, -- jra
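A toy renderer makes this reading easy to test. Everything here is an assumption for illustration: `render_rows` is a hypothetical function, it handles only one cell per line, and it does no attribute sanitising or HTML escaping. It is not how the MediaWiki parser is implemented; it only models the rule "a row with no cells is never emitted".

```python
import re


def render_rows(wikitext: str) -> list[str]:
    """Return one <tr> string per row that actually contains cells."""
    rows = []  # each entry: [row attributes, list of cell texts]
    for raw in wikitext.splitlines():
        line = raw.strip()
        if line.startswith("{|") or line.startswith("|}"):
            continue  # table open/close markers carry no row data here
        if line.startswith("|-"):
            # New row: keep whatever follows the row token as its attributes.
            rows.append([re.sub(r"^\|-+", "", line).strip(), []])
        elif line.startswith("|"):
            if not rows:
                rows.append(["", []])  # implicit first row
            rows[-1][1].append(line[1:].strip())

    html = []
    for attrs, cells in rows:
        if not cells:
            continue  # an empty row is dropped, and its attributes with it
        open_tag = f"<tr {attrs}>" if attrs else "<tr>"
        html.append(open_tag + "".join(f"<td>{c}</td>" for c in cells) + "</tr>")
    return html
```

Under this model the first "style" row is not ignored by accident: it opens a row that never receives a cell, so nothing is rendered for it.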
On 07/02/2008, Jay R. Ashworth jra@baylink.com wrote:
It's not a bug. The style applies to a row which is never rendered because it has no data in it. IMO.
ah, d'oh! Yes.
- d.
On 2/8/08, Jay R. Ashworth jra@baylink.com wrote:
It's not a bug. The style applies to a row which is never rendered because it has no data in it. IMO.
This is the unfortunate kind of reasoning that arises from having syntax which is a thin layer on top of someone else's syntax (HTML).
So, we treat this:
{|
|-
|-
|foo
|}
as valid syntax not because it's useful or meaningful, but because it produces syntactically valid HTML. That goes pretty much against the philosophy of most other aspects of wikitext: we generally go with what's useful and meaningful.
Well, I can't criticise too much. It does work, after all :)
Steve
Steve Bennett wrote:
I have successfully parsed my first nested table. It's 3 in the morning but I'm quite happy :)
Congratulations! :D
Steve Bennett wrote:
So, we treat this as valid syntax not because it's useful or meaningful, but because it produces syntactically valid HTML. That goes pretty much against the philosophy of most other aspects of wikitext: we generally go with what's useful and meaningful.
Well, I can't criticise too much. It does work, after all :)
Steve
Just like any other language.
int foo(int number) {
    int i,j=0;
    while (3*14==72) {
        i++;
        while (number > 0) {
            for (i=0; i > 85; i++) {
                number++;
            }
            number = number - ++i;
            j += 1;
        } while (number > 0);
    return j;
}
It's a perfectly valid C function. However, it's crap*. The compiler will need to optimize it, removing all the useless stuff, and may well throw a bunch of warnings, but the syntax is right. Useless but valid. Note that I'm in favour of wikitext warnings.
*I'll refrain from giving here the equivalent oneliner ;)
It's a perfectly valid C function.
Is it? I count 4 {'s and only 3 }'s... ;)
Thomas Dalton wrote:
Is it? I count 4 {'s and only 3 }'s... ;)
That's er... a protection against illegal copies :P
Now, if you add a } at the end it *does* compile; this time I have checked it. Hey, with an int main(){} it'd even link!