On 1/23/08, David Gerard dgerard@gmail.com wrote:
> Steve (and others): What needs to be done for the ANTLR grammar that can be parallelised, so that the many people desperately after reliable independent parsing of wikitext can contribute to the effort?
I can currently see two relatively independent tasks that are required here:

1) Analysis of wikitext: understanding how the current parser works, negotiation over which features are required, how borderline features should operate, etc.

2) Production of a useful, functional, efficient, readable ANTLR grammar.
I was doing well on (1); I've gotten bogged down in (2).
Suggestions for ways to help:

- recruit an ANTLR expert who could help fix my grammar, clean it up, and make it readable
- people to add some of the still-missing features (notably tables and HTML tags, also <ref> tags, though I'm not sure where they're best handled)
- general assistance expanding out the various features
- assistance with some of the nitty-gritty, like character classes, which I haven't really delved into (precise definitions of letter, punctuation, etc. that work for all languages...)
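To make the character-class problem concrete: in an ANTLR v3 lexer you'd end up writing fragment rules enumerating Unicode ranges by hand. The following is only an illustrative sketch (the rule names and ranges are mine, not from the actual grammar, and the ranges are deliberately incomplete):

```
// Hypothetical ANTLR v3 lexer fragments sketching what a
// Unicode-aware "letter" class might look like. The ranges
// here are illustrative only -- a real grammar would need to
// cover far more scripts than Latin.
fragment LETTER
    : 'a'..'z' | 'A'..'Z'         // basic Latin
    | '\u00c0'..'\u00d6'          // Latin-1 supplement (partial)
    | '\u00d8'..'\u00f6'
    | '\u0100'..'\u017f'          // Latin Extended-A
    ;

fragment DIGIT : '0'..'9' ;

WORD : LETTER+ ;
```

Maintaining such ranges by hand for every script is exactly the sort of grunt work that could be parallelised.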
> Also: how to speed up ANTLR-generated PHP, so this has half a chance of being implemented?
Ahem. There is no such thing as ANTLR-generated PHP. So for there to be even a quarter of a chance of such a thing being implemented, someone would first need to write a PHP target for ANTLR.
Based on my experience so far, I really don't like the chances of simply generating PHP out of the box with ANTLR and dropping it in. The Java code that's being generated so far is humongous and has a lot of problems, and ANTLR itself has bugs and unpleasant behaviour.
However, we do need a spec, and I don't know of a better way to specify the wikitext language than an ANTLR grammar. Whether or not the spec can automagically generate a working parser is a separate question... I think. Opinions welcome.
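To illustrate what "grammar as spec" might look like, here is a toy sketch of a few wikitext constructs in ANTLR-style notation. None of this is from the actual grammar; the rule names and structure are hypothetical, and real wikitext is far messier than this suggests (apostrophe handling alone is notoriously ambiguous):

```
// Illustrative-only sketch; not the real wikitext grammar.
bold_text
    : '\'\'\'' inline+ '\'\'\''
    ;

italic_text
    : '\'\'' inline+ '\'\''
    ;

internal_link
    : '[[' link_target ('|' link_text)? ']]'
    ;
```

Even if no working parser is ever generated from it, a grammar in this form would pin down behaviour precisely enough for independent implementations to agree on.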
Steve