Re: [Wikitech-l] EBNF grammar project status?

9 Nov 2007


On 11/8/07, Steve Sanbeg ssanbeg@ask.com wrote:
> ...
> I think that's true, if you tokenize correctly, that would go a long way.
> Unfortunately, there are a few constructs that make tokenization tricky.
> Apostrophe is the most obvious case; but {'s, and to a lesser extent ['s
> could have similar problems, since they would require substantial
> lookahead to tokenize.

According to the flex documentation, it's perfectly happy to accept any
regex for tokens, and will use unlimited lookahead and backtracking if
necessary.  It provides debug info that lets you check for and
eliminate backtracking if you want to speed it up, but that's
optional.  Clearly it's not possible to tokenize MW markup with
one-character lookahead: you can't even tell a second-level heading
from a sixth-level one that way, and that's ignoring stuff like ISBN
handling, which is less basic and more disposable.
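
To make the lookahead point concrete, here is a rough sketch in Python
(not flex) of a longest-match tokenizer.  The token names and patterns
are made up for illustration and are not MediaWiki's actual rules; the
only point is that the scanner has to consume the whole run of '=' or
apostrophes before it knows which token it is looking at.

```python
import re

# Illustrative token patterns only (assumed, not the real MW grammar).
TOKEN_RE = re.compile(
    r"(?P<EQUALS>=+)"          # run of '=': its length decides the heading level
    r"|(?P<QUOTES>'{2,})"      # run of 2+ apostrophes: italic/bold markers
    r"|(?P<TEXT>[^='\n]+|')"   # plain text, or a lone apostrophe
    r"|(?P<NEWLINE>\n)"
)


def tokenize(text):
    """Return (kind, lexeme) pairs, always taking the longest run."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Every character is covered by one of the alternatives above,
        # so the match cannot fail.
        m = TOKEN_RE.match(text, pos)
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for sample in ("== A ==\n", "====== B ======", "it's ''x''"):
        print(tokenize(sample))
```

With one character of lookahead, "== A ==" and "====== B ======" look
identical after the first '='; the run length only becomes known once
the scanner has read past the last '='.  The same goes for telling a
lone apostrophe from the start of an italic or bold marker.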
