In any case, Eric, have you thought of any ideas to simplify the syntax while maintaining backward compatibility (except perhaps in highly unlikely corner cases)? For instance, what if "==Text" translated to "<h2>Text</h2>", while "==Text==" continued to do so as well? That would eliminate the need for unlimited lookahead for headings, reducing it to one-character lookahead for the first five levels and zero-character lookahead for the sixth. In fact, it would probably cause little breakage if the same were done with opening wikilinks, template calls, and so on. Any other thoughts?
The context-sensitivity of apostrophes probably isn't avoidable, unfortunately.
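To make the heading idea concrete, here is a rough sketch of how a line could be classified by looking only at its leading "=" characters. The names heading_level and render_line are made up for illustration and are not MediaWiki functions, and the handling of trailing "=" (strip at most as many as there were opening ones) is an assumption, not a description of the current parser.

```python
import html

MAX_LEVEL = 6  # HTML defines h1 through h6


def heading_level(line):
    """Count leading '=' characters, stopping at MAX_LEVEL.

    Each step inspects exactly one character; once six have been
    seen, no further character needs to be examined.
    """
    level = 0
    while level < MAX_LEVEL and line[level:level + 1] == "=":
        level += 1
    return level


def render_line(line):
    """Render one line as a heading or as escaped plain text."""
    level = heading_level(line)
    if level == 0:
        return html.escape(line)
    text = line[level:].rstrip()
    # Assumption: closing '=' are optional, and at most `level`
    # of them are removed so "==Text==" keeps working.
    closers = 0
    while closers < level and text.endswith("="):
        text = text[:-1]
        closers += 1
    return "<h{0}>{1}</h{0}>".format(level, html.escape(text.strip()))
```

With this sketch, render_line("==Text") and render_line("==Text==") both give "<h2>Text</h2>", and the decision that the line is a heading never depends on what comes at the end of it.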
On 8/17/06, Jay R. Ashworth <jra@baylink.com> wrote:
> It wouldn't help. The problems are semantic, not syntactical.
> I think. :-)
Nope, we're talking 100% syntax. '''hi''hello'''hi'''hello''hi''' isn't ambiguous if it's stored as <b>hi<i>hello</i></b><i>hi<b>hello</b></i><b>hi</b> to begin with, and you don't need unlimited lookahead to know that <h2> is a header tag and not a literal string. Every problem Eric is having would be eliminated if we switched to XML for internal storage, because all Eric is doing is trying to write a formal grammar, and a formal XML grammar is part of the official XML specification.
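As a quick illustration of the point about the stored form, the markup above can be handed to any stock XML parser once it is wrapped in a single root element. The sketch below uses Python's standard xml.etree.ElementTree; the <root> wrapper and the outline helper are made up for this example and say nothing about how MediaWiki would actually store text.

```python
import xml.etree.ElementTree as ET

# The stored form from the example above.
STORED = "<b>hi<i>hello</i></b><i>hi<b>hello</b></i><b>hi</b>"


def outline(fragment):
    """Return the nesting of a fragment as (tag, text, children) tuples.

    The fragment is wrapped in a hypothetical <root> element because
    an XML document needs exactly one top-level element. Text that
    follows a child element (its "tail") is ignored for brevity.
    """
    root = ET.fromstring("<root>" + fragment + "</root>")

    def walk(node):
        return (node.tag, node.text or "", [walk(child) for child in node])

    return [walk(child) for child in root]
```

Calling outline(STORED) yields three top-level elements (b, i, b), with the nesting read directly from the tags rather than inferred from runs of apostrophes.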