Re: [Wikitech-l] Parser practicum

14 Nov 2007

Steve Bennett wrote:
...
  On 11/14/07, Brion Vibber <brion(a)wikimedia.org> wrote:
  I would recommend against considering this at this time (if ever).

 Hopping around changing basic syntax is probably not the thing to do
 when in the middle of changing the parser mechanics.

  I would say this:

 Some text with '''bold''' and some ''italics'' and even some '''''bold
 italics'''''.

 is basic syntax. We're not changing that.

 This:

 Some text with '''''bold italics''' then just italics''.  Oh and did I
 mention ''''bold preceded by apostrophes'''' and who knows, some
 ''''''random'''' combinations''''' of '''' apostrophes ''''''''' and
 bold/italics''' that no one ''''' can ''''''''''predict the behaviour
 '''of'...

 is not basic syntax. It can't be EBNF'ed. It can't be translated exactly
 according to the whims of the current parser.

Note that EBNF is not necessarily desired or desirable; if EBNF can't
describe the grammar of the language, then it's not a suitable tool.

Note also that it *is* a requirement to have sane behavior with this
sort of construction:

L'''idée'' <- apostrophe followed by italics
L''''idée''' <- apostrophe followed by bold

That's a *requirement* to continue to properly handle French and Italian
text. The current apostrophe pass handler uses, I believe, a lookahead
and then goes backwards, which is a fairly sane way of doing this. If EBNF
can't handle it, then forget EBNF.
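
To make the requirement concrete, here is a minimal sketch of one way such a pass could work: scan every apostrophe run on the line first, then go back and reinterpret one bold marker as apostrophe-plus-italics when both counts come out odd, as in the heuristic quoted later in this message. It is not the actual parser code; the function names (normalize_quotes, render_quotes) and the <i>/<b> output are assumptions made for illustration only.

```python
import re

def normalize_quotes(line):
    """Split a line into text and apostrophe runs, then resolve ambiguous runs."""
    # Even indices hold plain text; odd indices hold runs of 2+ apostrophes.
    parts = re.split(r"(''+)", line)
    num_italic = num_bold = 0
    for i in range(1, len(parts), 2):
        n = len(parts[i])
        if n == 4:
            # Four: one literal apostrophe followed by bold.
            parts[i - 1] += "'"
            parts[i] = "'''"
        elif n > 5:
            # More than five: the surplus becomes literal apostrophes.
            parts[i - 1] += "'" * (n - 5)
            parts[i] = "'''''"
        n = len(parts[i])
        if n in (2, 5):
            num_italic += 1
        if n in (3, 5):
            num_bold += 1
    if num_italic % 2 == 1 and num_bold % 2 == 1:
        # Go back over the bold markers: one of them was probably an
        # apostrophe followed by italics. Prefer one after a one-letter word.
        single = multi = space = None
        for i in range(1, len(parts), 2):
            if parts[i] != "'''":
                continue
            before = parts[i - 1]
            if before == "" or before.endswith(" "):
                space = i if space is None else space
            elif len(before) == 1 or before[-2] == " ":
                single = i
                break
            elif multi is None:
                multi = i
        candidates = [c for c in (single, multi, space) if c is not None]
        if candidates:
            pick = candidates[0]
            parts[pick] = "''"
            parts[pick - 1] += "'"
    return parts

def _toggle(tag, stack, out):
    """Open `tag`, or close it (closing and reopening anything nested inside)."""
    if tag in stack:
        idx = stack.index(tag)
        reopen = stack[idx + 1:]
        for t in reversed(stack[idx:]):
            out.append(f"</{t}>")
        del stack[idx:]
        for t in reopen:
            out.append(f"<{t}>")
            stack.append(t)
    else:
        out.append(f"<{tag}>")
        stack.append(tag)

def render_quotes(line):
    """Render one line of quote markup as <i>/<b> tags (illustrative only)."""
    out, stack = [], []
    for i, part in enumerate(normalize_quotes(line)):
        if i % 2 == 0:
            out.append(part)
        elif part == "''":
            _toggle("i", stack, out)
        elif part == "'''":
            _toggle("b", stack, out)
        else:
            # Five apostrophes: close whatever is open, innermost first,
            # then open whatever is not.
            order = list(reversed(stack)) + [t for t in ("i", "b") if t not in stack]
            for t in order:
                _toggle(t, stack, out)
    for t in reversed(stack):
        out.append(f"</{t}>")  # close anything left open at end of line
    return "".join(out)

print(render_quotes("L'''idée''"))    # L'<i>idée</i>
print(render_quotes("L''''idée'''"))  # L'<b>idée</b>
```

The point of the sketch is that the decision about the first marker depends on what appears later in the line, which is why a count-then-revisit pass is a natural fit here.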

...
  I could accept that the first sentence of my second part is "basic
  syntax". But not this kind of madness:
 # If there is an odd number of both bold and italics, it is likely
 # that one of the bold ones was meant to be an apostrophe followed
 # by italics. Which one we cannot know for certain, but it is more
 # likely to be one that has a single-letter word before it.

 That's why we're proposing *adding* ** and //, to provide alternative
 mechanisms for these complicated situations.

Let me be very very clear here.

Whether or not we ever add ** and // as bold and italic syntax is
completely unrelated to the actual task of rebuilding the parser or
speccing out a grammar for the wiki syntax.

If you want to play with alternate syntax (adding different markup such
as "**" or "//" or "$*^#&*^"), feel free to do so on your own, but
please don't mix it into any discussion or work or planning or
decision-making about the parser.

New alternates aren't even needed; old alternates already exist (<b> and
<i>; use of <nowiki></nowiki> as a hidden separator, etc.). Other sorts
of magic characters might also be neat additions, but they should not be
considered at this time because it's just going to sidetrack things.

Don't complicate the situation by tossing in new stuff. Then the
conversation goes from something manageable (does this proposed parser
technically accomplish the job?) to something unmanageable (should we
make a large number of changes to markup?) and we'll never get anywhere.

-- brion vibber (brion @ wikimedia.org)
