Re: [Wikitech-l] EBNF grammar project status?

9 Nov 2007


On 11/9/07, Simetrical Simetrical+wikilist@gmail.com wrote:
> ...
> I suspect a major problem might arise if there are constructs that
> require more than one-token lookahead.  There probably are, and
> apparently bison et al. can't parse those.  But again, I would defer

I think it would be a good idea to formalise and improve the grammar
so that wasn't the case. Does any sane grammar need more than one
token look ahead?

> ...
> # If there is an odd number of both bold and italics, it is likely
> # that one of the bold ones was meant to be an apostrophe followed
> # by italics. Which one we cannot know for certain, but it is more
> # likely to be one that has a single-letter word before it.
This is a good example. There is no grammar, therefore no spec,
therefore the parser can do whatever it wants. However, it tries to
guess.
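
To make the guessing concrete, here is a minimal Python sketch of the
heuristic that comment describes. It is a simplification for
illustration, not the parser's actual code: the name guess_quotes, the
fallback to the first bold run, and the handling of runs of four or
more than five apostrophes are all assumptions on my part.

```python
import re


def guess_quotes(line):
    """Return the line split into text (even indices) and apostrophe runs (odd)."""
    # A capturing group makes re.split keep the apostrophe runs in the result.
    parts = re.split(r"('{2,})", line)

    # Assumption: runs of 4, or of more than 5, shed their extra
    # apostrophes into the preceding text so only 2, 3 and 5 remain.
    for i in range(1, len(parts), 2):
        n = len(parts[i])
        if n == 4 or n > 5:
            keep = 3 if n == 4 else 5
            parts[i - 1] += "'" * (n - keep)
            parts[i] = "'" * keep

    runs = range(1, len(parts), 2)
    n_italic = sum(len(parts[i]) in (2, 5) for i in runs)
    n_bold = sum(len(parts[i]) in (3, 5) for i in runs)

    # The guess: with an odd number of both, one bold run is re-read
    # as a literal apostrophe followed by an italics marker.
    if n_italic % 2 == 1 and n_bold % 2 == 1:
        bolds = [i for i in runs if len(parts[i]) == 3]

        def after_single_letter_word(i):
            before = parts[i - 1]
            if not before or before[-1] == " ":
                return False
            return len(before) == 1 or before[-2] == " "

        # Prefer a bold run that follows a single-letter word; otherwise
        # fall back to the first bold run (hypothetical tie-break).
        chosen = next((i for i in bolds if after_single_letter_word(i)),
                      bolds[0] if bolds else None)
        if chosen is not None:
            parts[chosen - 1] += "'"
            parts[chosen] = "''"

    return parts
```

With this sketch, guess_quotes("L'''amour''") moves one apostrophe into
the text and leaves a balanced pair of italics markers around "amour".
Balanced input is returned untouched.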
No one has ever really defined the answer to the question: what does
the following string represent? '''''
There are many answers to the question, depending on the context. It's
horrible. It shouldn't be like that. There are solutions:
- Distinct sequences for italics and bold (**this** being the obvious
choice for bold)
- Specific tokens for bold, italics, and bold-italics, so that this:
'''''Some''' word'' is no longer valid. Instead you would write
'''''Some''''' ''word''.
- Strong escaping mechanisms such that the parser deliberately gives up very
early on, and if you want bold-italic apostrophes, you're going to have to
escape them. Making ''''''foo'''''' deliberately render as bold-italics
'foo' is madness*. Cute for LOLCODE or INTERCAL, but for MediaWiki?
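
As a rough illustration of the second option, here is a Python sketch
of a strict reader in which '', ''' and ''''' are three separate
tokens, each closed only by the same token, with no guessing. The
function name render_strict, the HTML output, and the decision to
forbid nesting and reject every other run length are assumptions made
for the example, not a proposal for the real grammar.

```python
import re

# Each token has exactly one meaning: (opening markup, closing markup).
TOKENS = {
    2: ("<i>", "</i>"),
    3: ("<b>", "</b>"),
    5: ("<b><i>", "</i></b>"),
}


def render_strict(line):
    """Render one line, raising ValueError instead of guessing."""
    out = []
    open_len = None  # length of the currently open token, if any
    for idx, part in enumerate(re.split(r"('{2,})", line)):
        if idx % 2 == 0:
            # Even indices are plain text (single apostrophes included).
            out.append(part)
            continue
        n = len(part)
        if n not in TOKENS:
            raise ValueError(f"run of {n} apostrophes is not a token")
        if open_len is None:
            out.append(TOKENS[n][0])
            open_len = n
        elif open_len == n:
            out.append(TOKENS[n][1])
            open_len = None
        else:
            # Assumption: no nesting, so a different token cannot
            # appear while one is still open.
            raise ValueError("token closed by a different token")
    if open_len is not None:
        raise ValueError("unclosed token at end of line")
    return "".join(out)
```

Under these rules '''''Some''''' ''word'' renders as bold-italic "Some"
followed by italic "word", while '''''Some''' word'' is rejected. A
reader like this only ever looks at the current token, and a run of six
apostrophes is an error that has to be escaped some other way.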

Steve

* Well, it would be logical madness if it actually rendered like that.
For some reason the first apostrophe renders as neither bold nor
italics. So it's illogical madness :)
