Hello,
There is currently a bad bug with templates: http://sourceforge.net/tracker/?func=detail&atid=411192&aid=965725&a...
Basically, we use | as the delimiter between parameters, which prevents users from putting renamed (piped) links such as [[foo|blabla]] inside a parameter:
{{template|para1=[[foo|bla]]|para2=oooo}}
This will not be parsed correctly.
The bug report proposes using |CR (a pipe followed by a line break) as the delimiter instead:
{{template para1=[[bla|something]]| para2=ooo| }}
The parser would then be much simpler, and |CR doesn't conflict with any other syntax.
The fix in the code seems trivial, but users will have to update their existing templates (but hey, we are still in beta).
I've replied at Sourceforge already, but to reiterate: I think enforcing <CR> is a bad idea, because it forces layout which may not be appropriate in all circumstances. Better to use something equally unique, but less obtrusive, such as "||":
{{template || para1=[[bla|something]] || para2=ooo }}
Rowan Collins wrote:
I've replied at Sourceforge already, but to reiterate: I think enforcing <CR> is a bad idea, because it forces layout which may not be appropriate in all circumstances. Better to use something equally unique, but less obtrusive, such as "||":
{{template || para1=[[bla|something]] || para2=ooo }}
Neither of your suggestions would fix something like this:
{{template1||para1={{template2||para=hah, you're screwed!}}}}
(or would it?)
Better to fix the parser properly...
Timwi
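A rough way to check the "(or would it?)" above: the sketch below assumes the parser simply splits the template body on the delimiter, which is an assumption about the approach and not a description of Parser.php. The function name split_on_double_pipe is made up for illustration.

```python
def split_on_double_pipe(template_call: str) -> list[str]:
    """Split the body of a {{...}} call on every '||', ignoring nesting."""
    body = template_call.strip()[2:-2]  # drop the outer {{ and }}
    return [part.strip() for part in body.split("||")]


# A piped link inside a parameter survives, because it only contains a single '|'.
flat = split_on_double_pipe("{{template || para1=[[bla|something]] || para2=ooo }}")
print(flat)
# ['template', 'para1=[[bla|something]]', 'para2=ooo']

# A nested template call is cut in half at its own '||'.
nested = split_on_double_pipe(
    "{{template1||para1={{template2||para=hah, you're screwed!}}}}"
)
print(nested)
# ['template1', 'para1={{template2', "para=hah, you're screwed!}}"]
```

Under that assumption, a more unusual delimiter helps with piped links but not with templates nested inside parameters, since the inner call uses the same delimiter.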
--- Rowan Collins rowan.collins@gmail.com wrote:
There is a bad bug currently with templates : http://sourceforge.net/tracker/?func=detail&atid=411192&aid=965725&a...
I've replied at Sourceforge already, but to reiterate: I think enforcing <CR> is a bad idea, because it forces layout which may not be appropriate in all circumstances. Better to use something equally unique, but less obtrusive, such as "||":
{{template || para1=[[bla|something]] || para2=ooo }}
Sorry if this is obvious, but maybe something could be learned from the way the parser handles wikilinks in image captions. For example, wikilinks with alternative titles work fine, even though image properties are separated by pipes:
[[Image:George-Washington.jpg|thumb|200px|[[George Washington|Washington]] was...]]
-- David Iberri
On Wed, 28 Jul 2004 10:41:18 -0700 (PDT), David Iberri diberri@yahoo.com wrote:
Sorry if this is obvious, but maybe something could be learned by the way the parser handles wikilinks in image captions. For example, wikilinks with alt. titles work fine, even though image properties are separated by pipes
So they do... Question is, does anyone know how that *does* work? [Parser.php is more than a little, um, opaque...]
On Wed, 2004-07-28 at 23:30 +0100, Timwi wrote:
Yes, by calling replaceInternalLinks twice ;-)
Ouch.
Not necessarily. It's more or less the fastest way to do this with a regex-based parser like the current MediaWiki one. The second call is not expensive, although it does search the entire text.
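To illustrate why running a regex-based link pass twice can handle one level of nesting, here is a minimal sketch. It is not MediaWiki's replaceInternalLinks; the pattern, the function names and the HTML output are all invented for illustration. The assumption is that each pass only matches links containing no further brackets, so the first pass resolves the inner link and the second pass resolves the image link around it.

```python
import re

# Matches only links whose content has no further [ or ], i.e. innermost links.
INNERMOST_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def render_link(match: re.Match) -> str:
    """Turn one matched link into (hypothetical) HTML."""
    target, rest = match.group(1), match.group(2)
    if target.startswith("Image:"):
        # Image options are pipe-separated; treat the last one as the caption.
        filename = target[len("Image:"):]
        caption = rest.split("|")[-1] if rest else ""
        return f'<img src="{filename}" alt=""><p>{caption}</p>'
    label = rest if rest is not None else target
    return f'<a href="{target}">{label}</a>'


def replace_links(text: str) -> str:
    """One pass over the whole text, replacing every innermost link."""
    return INNERMOST_LINK.sub(render_link, text)


source = "[[Image:George-Washington.jpg|thumb|200px|[[George Washington|Washington]] was...]]"
first_pass = replace_links(source)        # resolves the link inside the caption
second_pass = replace_links(first_pass)   # resolves the image link itself
print(first_pass)
print(second_pass)
```

Each pass scans the full text, which matches the remark that the second call is cheap but still searches everything. Deeper nesting would need more passes.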
I think this is one of the bugs with the highest priority. People are currently using all sorts of hacks to get around it:
1. Using things like {| {{infobox_args}} ... in tables
2. Just making incorrect links and creating redirects for them
3. Manually creating thumbnails to use as images
Isn't this just down to the kind of parser currently in use? It is possible to do something like this in the bash shell, for example:
$ echo $(($((2+2))+1+$(echo 1)))
6
On Thu, 29 Jul 2004 01:21:09 +0200, Gabriel Wicke lists@wikidev.net wrote:
On Wed, 2004-07-28 at 23:30 +0100, Timwi wrote:
Yes, by calling replaceInternalLinks twice ;-)
Ouch.
Not necessarily. It's more or less the fastest way to do this with a regex-based parser like the current MediaWiki one. The second call is not expensive, although it does search the entire text.
-- Gabriel Wicke
Wikitech-l mailing list Wikitech-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l
On Wed, 28 Jul 2004 20:52:24 +0100, Rowan Collins rowan.collins@gmail.com wrote:
So they do... Question is, does anyone know how that *does* work? [Parser.php is more than a little, um, opaque...]
I haven't looked at it (and don't have time at this exact moment) but I imagine it's something like this: Keep track of how deep you are nested within the [['s, and treat the | characters differently according to context. If you're at the top level, a | character is a delimiter for that level, but if you're already nested inside another [[ level (i.e. you're inside a link) then treat the | character according to the piped-link interpretation. Look for ]]'s to know when you should pop back out to the previous level/context.
-Bill Clark
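The depth-tracking idea described above could be sketched as follows. This is only an illustration of the approach, not code from Parser.php; the function name split_top_level is made up, and counting {{ }} as well as [[ ]] is an extra assumption added so that nested template calls are covered too.

```python
def split_top_level(body: str, delimiter: str = "|") -> list[str]:
    """Split on the delimiter only where the bracket nesting depth is zero."""
    parts: list[str] = []
    current: list[str] = []
    depth = 0
    i = 0
    while i < len(body):
        pair = body[i:i + 2]
        if pair in ("[[", "{{"):
            depth += 1                 # entering a nested link or template
            current.append(pair)
            i += 2
        elif pair in ("]]", "}}") and depth > 0:
            depth -= 1                 # popping back out to the previous level
            current.append(pair)
            i += 2
        elif body[i] == delimiter and depth == 0:
            parts.append("".join(current))   # top-level delimiter: end of a parameter
            current = []
            i += 1
        else:
            current.append(body[i])    # ordinary character, or a '|' inside a nested level
            i += 1
    parts.append("".join(current))
    return parts


print(split_top_level("template|para1=[[foo|bla]]|para2=oooo"))
# ['template', 'para1=[[foo|bla]]', 'para2=oooo']
print(split_top_level("template1|para1={{template2|para=x}}"))
# ['template1', 'para1={{template2|para=x}}']
```

The sketch takes the template body without its outer {{ }} and does not try to handle awkward cases such as three consecutive braces.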
Hello,
That would be the way to do it using a tokenizer. Unfortunately, the tokenizer is disabled because of performance issues.
So in other words the code exists but is not currently enabled?
On Thu, 29 Jul 2004 19:52:39 +0200, Ashar Voultoiz thoane@altern.org wrote:
Hello,
That would be the way to do it using a tokenizer. Unfortunately, the tokenizer is disabled because of performance issues.
-- Ashar Voultoiz
On Thu, 2004-07-29 at 19:59 +0000, Ævar Arnfjörð Bjarmason wrote:
So in other words the code exists but is not currently enabled?
The code exists in PHP, but it is a few times slower than the current parser. That tokenizer also handles only a few of the (more than 20) passes.
I'm currently writing a new parser using BisonGen (which builds both a C Python module and a pure Python parser) that handles the entire parsing in one step. The C version also performs very well (0.014 seconds vs. 0.17 seconds for the pure Python version). The output will be a DOM object tree; includes and the like will be handled by manipulating that tree before dumping it as [insert your favourite format here]. Where feasible, this parser also supports the current Moin syntax in addition to the MW one. It's intended to work with Moin, of course (which has a relatively clean design and benefits from the Python infrastructure). Some more details at http://moinmoin.wikiwikiweb.de/NewWikiParser.
Gabriel Wicke wrote:
I'm currently writing a new parser using BisonGen (builds both a C python module and a pure python parser) that handles the entire parsing in one step.
Oooh. Very good. I was wondering when someone would start doing this and whether I would have to do it.
I would like to know: how are you going to describe the syntax? Is it going to be extensible enough to allow for new syntax elements later (or, if a bug creeps into the grammar, will it be easy to fix without rewriting the entire parser)?
Thanks! Timwi
wikitech-l@lists.wikimedia.org