On 12/11/07, David Gerard <dgerard(a)gmail.com> wrote:
I'm about to head off for a week and a half, so here's a quick
progress stop. My ANTLR grammar so far is here:
http://www.mediawiki.org/wiki/User:Stevage/ANTLR
It does many features, but most aren't really complete.
I offer this up just for curiosity's sake - no one should try and hack on it ;)
[hrm, on closer inspection, that's not the latest version of that
file. oh well.]
You should link the above from the ANTLR page and include this email
at the top of it.
It's a wiki isn't it? Feel free. :)
This is still very much work in progress and hasn't been tidied up at
all. I would be interested to hear whether anyone finds this ANTLR
grammar readable and meaningful at all. If the grammar is not
expressive and readable, there's not much point having it.
I'm especially troubled by the syntactic predicates which seem to be
required to suppress warnings by the ANTLR compiler. These are the
ones that look like:
    rule:
        (option1) => option1
      | (option2) => option2;
Most of the time this behaves exactly the same as:
    rule:
        option1
      | option2;
but if option1 and option2 can match the same input, then ANTLR will
generate a warning if the syntactic predicates aren't there. However,
with the syntactic predicates it ends up parsing the text twice (I
think) - once to check whether the predicate will succeed, then once
for real. It's a pretty annoying trade-off: readability and
performance vs no warnings and certainty of execution path.
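
To make the "parsing twice" worry concrete, here is a minimal sketch in Python of what I understand a syntactic predicate to do: run the alternative speculatively, rewind, then run it again for real. This is a hand-written toy, not ANTLR output. The Parser class, the two alternatives and the token names are all made up for illustration, and real ANTLR-generated code may well avoid or memoise some of this work.

```python
class Parser:
    """Hand-written toy parser; a stand-in, not ANTLR-generated code."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.matches = 0  # every successful token match, speculative or real

    def match(self, expected):
        if self.pos < len(self.tokens) and self.tokens[self.pos] == expected:
            self.pos += 1
            self.matches += 1
            return True
        return False

    def option1(self):
        # Hypothetical alternative: '' text ''
        return all(self.match(t) for t in ("''", "text", "''"))

    def option2(self):
        # Hypothetical alternative that starts the same way: '' text
        return all(self.match(t) for t in ("''", "text"))

    def predicate(self, option):
        # (option) => ... : run the option speculatively, then rewind.
        start = self.pos
        ok = option()
        self.pos = start
        return ok

    def rule_with_predicates(self):
        # rule: (option1) => option1 | (option2) => option2;
        for option in (self.option1, self.option2):
            if self.predicate(option):
                option()  # second pass over the same tokens, for real
                return option.__name__
        return None

    def rule_without_predicates(self):
        # rule: option1 | option2;  commits to the first alternative
        # on one token of lookahead, with no way back if it fails later.
        if self.pos < len(self.tokens) and self.tokens[self.pos] == "''":
            return "option1" if self.option1() else None
        return None


tokens = ["''", "text", "''"]

with_pred = Parser(tokens)
print(with_pred.rule_with_predicates(), with_pred.matches)

without_pred = Parser(tokens)
print(without_pred.rule_without_predicates(), without_pred.matches)
```

In this toy the predicate version matches each token twice, while the plain version matches it once. The other side of the trade-off shows up on the input ["''", "text"]: the predicate version falls through to option2, whereas the plain version has already committed to option1 and simply fails.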
I'm also a bit concerned about the eventual performance of this thing.
Already parsing a page of wikitext seems to take a very, very long
time (e.g., 10 seconds), but I don't know how much of that is caused by
the environment (the JVM), the debugger, etc. And of course my grammar
is pretty inefficient in many ways.
Steve