Re: [Wikitech-l] Parsing italics/bold

13 Nov 2007

On 11/13/07, Steve Bennett <stevagewp(a)gmail.com> wrote:
...

 What's the best way to approach parsing a long string of formatted text:

 1) Treat each incidence of ''' or '' as an element to be translated into
 <b>, </b>, <i>, or </i>, using state ("context"?) to determine which
 2) Have a rule that treats an entire run of '''........''' as a single
 element, to be transformed into <b>........</b>

 To answer my own question, I don't think 2) is possible, due to the
legitimacy of constructs like:

Here is some ''italics with a [[link|that switches ''off]] the italics.

I think '' and ''' will have to be parsed as rather ambiguous "toggle state
of bold/italics" tokens, whose meaning can be made clearer by walking the
AST afterwards.
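
A minimal sketch of that two-pass idea, in Python, might look like the following. It is only an illustration, not the MediaWiki parser: the names tokenize and resolve and the token shapes are hypothetical, a flat token list stands in for the AST, and it ignores the real apostrophe rules (runs such as ''''' and unbalanced quotes).

```python
import re

# Hypothetical token pattern: ''' (bold toggle) is tried before '' (italic toggle).
TOKEN_RE = re.compile(r"(?P<bold>''')|(?P<italic>'')")


def tokenize(text):
    """First pass: emit ("text", s) and ambiguous ("toggle", "b" | "i") tokens."""
    tokens = []
    pos = 0
    for m in TOKEN_RE.finditer(text):
        if m.start() > pos:
            tokens.append(("text", text[pos:m.start()]))
        tokens.append(("toggle", "b" if m.group("bold") else "i"))
        pos = m.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens


def resolve(tokens):
    """Second pass: use state to turn each toggle into an open or close tag."""
    is_open = {"b": False, "i": False}
    out = []
    for kind, value in tokens:
        if kind == "text":
            out.append(value)
        else:
            out.append("</%s>" % value if is_open[value] else "<%s>" % value)
            is_open[value] = not is_open[value]
    return "".join(out)


sample = "Here is some ''italics with a [[link|that switches ''off]] the italics."
print(resolve(tokenize(sample)))
```

The tokenizer never decides whether a '' opens or closes anything, so the italics that start outside the link and end inside it cause no trouble at that stage. This sketch does not repair overlapping bold and italic runs or close toggles left open at the end of the input; a real second pass would have to handle both.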

It's a pity, because the existing work on the EBNF assumed that they could
be treated as blocks. http://www.mediawiki.org/wiki/Markup_spec (was at
meta)

Unless someone wants to jump in and claim that the above construct is a
mistake and that ''..'' *should* be a block of some kind.

Steve
PS http://www.usemod.com/cgi-bin/mb.pl?ConsumeParseRenderVsMatchTransform is
useful for describing the parser transformation we're trying to achieve.
Apparently we're trying to convert a "match-transform" parser into a
"consume-parse-render" parser.
