On 11/17/07, Steve Bennett <stevagewp@gmail.com> wrote:
The problem I have here is with the image options: you'd like the word "thumbnail" to be a token, but then in a case like:

 [[image:finger.jpg|Note the impressive thumbnails.]] 

the lexer produces a single "thumbnail" token in the middle of the caption, rather than the letter tokens "t", "h", etc. that plain text should lex to.

Solutions I can think of so far:
1) Explicitly make the match for text to be 'a'..'z' | 'A'..'Z' | MW_img_thumbnail | ...
2) Make tokens for individual letters (Aa, Bb...) then make the parser recognise a pattern like Tt + Hh + Uu + Mm...
3) Make a token which is '|thumbnail', then use some trick to distinguish '|thumbnailblah' from '|thumbnail|'.
4) Like 1), but use a localised lexer so that those words are only tokens in this specific context.
5) Just match text, then use special markup at the parser level to look into the text that was matched.


Omg it's so much easier than that.
6) Use a syntactic predicate:

option : (magicword '|') => magicword
| caption;

magicword
: 'magicword';

Translation: If the next two tokens are a magic word followed by a pipe, then match the magic word. Otherwise, treat the input as a caption.
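
To spell out what that predicate does, here is a rough hand-rolled equivalent in Python. It is only a sketch, not what ANTLR generates: the MAGIC_WORDS list, the token names and the lex/parse_options helpers are all made up for illustration, and it only looks at the option text after the file name.

```python
import re

# Assumed magic words, for illustration only; the real list lives in MediaWiki.
MAGIC_WORDS = ("thumbnail", "frame", "left", "right")

# The lexer greedily emits a MAGIC token wherever a magic word appears,
# even inside a longer word such as "thumbnails". Everything else is a
# PIPE or a one-character TEXT token.
TOKEN_RE = re.compile(
    "(?P<MAGIC>%s)|(?P<PIPE>\\|)|(?P<TEXT>.)" % "|".join(MAGIC_WORDS),
    re.DOTALL,
)


def lex(s):
    """Return a list of (token_type, text) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(s)]


def parse_options(s):
    """Classify each '|'-separated option as a magic word or a caption."""
    tokens = lex(s)
    options = []
    i = 0
    while i < len(tokens):
        # The predicate: (magicword '|') => magicword
        # Peek two tokens ahead without consuming anything.
        if (tokens[i][0] == "MAGIC"
                and i + 1 < len(tokens)
                and tokens[i + 1][0] == "PIPE"):
            options.append(("magicword", tokens[i][1]))
            i += 1
        else:
            # Fallback alternative: swallow everything up to the next pipe
            # as caption text, including any stray MAGIC tokens.
            start = i
            while i < len(tokens) and tokens[i][0] != "PIPE":
                i += 1
            options.append(("caption", "".join(t[1] for t in tokens[start:i])))
        i += 1  # skip the separating pipe
    return options


print(parse_options("thumbnail|Note the impressive thumbnails."))
```

Note that, read literally, the predicate treats a magic word with no pipe after it as a caption, so parse_options("thumbnail") comes back as a caption here.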

That was easy. Woot. I thought things were a lot more complicated because ANTLRWorks sneakily doesn't support predicates in its Interpreter mode, only in its Debugger mode. I say "sneakily" because the error it reports looks like an error in your code...

>But: if it can produce a parser in *any* language, then we have
>something to run the test suite against, with a little harness
>rewiring, which makes it easier to sell both the retargeting work and
>the switch-MW-to-this work.

Oh, that's a good benefit too: we can regression test the new *grammar* against the old *parser*. Obviously it won't all work, and will require hacks to get all those magic words and stuff into the grammar. Perhaps someone could look into creating some tests that don't require the preprocessor (no templates, no magic variables) and that focus on specific language features...or maybe they already exist, I haven't looked.
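
Something like the following is what I have in mind for the harness, as a very rough Python sketch. Everything in it is hypothetical: old_parser and new_parser are stand-ins for "the existing MediaWiki parser" and "whatever the grammar generates", and the test cases are invented rather than taken from the real test suite.

```python
def compare_parsers(cases, old_parser, new_parser):
    """Run both parsers over each case and collect the disagreements.

    cases is a list of (name, wikitext) pairs. The old parser's output is
    treated as the expected result. Returns (name, expected, actual) tuples.
    """
    failures = []
    for name, wikitext in cases:
        expected = old_parser(wikitext)
        try:
            actual = new_parser(wikitext)
        except Exception as exc:
            # A crash in the new parser counts as a mismatch, not a harness error.
            actual = "ERROR: %s" % exc
        if actual != expected:
            failures.append((name, expected, actual))
    return failures


# Stand-in parsers, purely for demonstration.
def old_parser(text):
    return "<p>" + text + "</p>"


def new_parser(text):
    if "[[" in text:
        raise ValueError("links not supported yet")
    return "<p>" + text + "</p>"


cases = [("plain text", "hello"), ("link", "[[Main Page]]")]
failures = compare_parsers(cases, old_parser, new_parser)
for name, expected, actual in failures:
    print("FAIL %s: expected %r, got %r" % (name, expected, actual))
```

Keeping the cases free of templates and magic variables means a mismatch points at the grammar rather than at the missing preprocessor.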

Steve