Re: [Wikitech-l] WYSIWYG (or WYSIWYM or WYSIWYM) status?

12 Feb 2007


      Jared Williams wrote:
...
...
Just one example - probably of the 5% very hard category:
'''''hello''' hi''
vs.
'''''hi'' hello'''
Rendered in HTML, the first reads <i><b>hello</b> hi</i>, and the second
reads <b><i>hi</i> hello</b>. The problem is that the meaning of the
first 5 quotes changes based on the order in which the bold and italic
regions close - which is not determined while scanning left-to-right.
Another example:
'''hello ''hi''' there''
MediaWiki renders this as <b>hello <i>hi</i></b><i> there</i>, properly
handling overlapping formatting.
There are ways to deal with these... putting off the resolution until a
later pass is the only way I know of that deals with the first one, and
it's a bit touchy. Manageable, but touchy.
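To make the "later pass" idea concrete, here is a minimal Python sketch. It is not MediaWiki's actual algorithm; the function names (tokenize, toggle, render) are made up for illustration. The first pass only splits the text into quote runs and plain text, leaving every run undecided. The second pass resolves each run, and for a leading run of five quotes it peeks at the next run to see whether bold or italic closes first. Run lengths other than 2, 3 and 5 are assumed to be literal text here.

```python
import re

def tokenize(text):
    """Pass 1: split into plain text and runs of 2+ apostrophes, undecided."""
    return [tok for tok in re.split(r"('{2,})", text) if tok]

def toggle(tag, stack, out):
    """Open `tag`, or close it (closing and reopening any tags nested inside)."""
    if tag not in stack:
        stack.append(tag)
        out.append(f"<{tag}>")
        return
    reopen = []
    while stack:
        top = stack.pop()
        out.append(f"</{top}>")
        if top == tag:
            break
        reopen.append(top)
    for inner in reversed(reopen):
        stack.append(inner)
        out.append(f"<{inner}>")

def render(text):
    """Pass 2: resolve each quote run now that later runs are visible."""
    tokens = tokenize(text)
    out, stack = [], []
    for pos, tok in enumerate(tokens):
        if not tok.startswith("''"):
            out.append(tok)
        elif len(tok) == 2:
            toggle("i", stack, out)
        elif len(tok) == 3:
            toggle("b", stack, out)
        elif len(tok) == 5 and not stack:
            # Deferred decision: whichever style closes first must be innermost.
            later = [len(t) for t in tokens[pos + 1:] if t.startswith("''")]
            first_close = later[0] if later else 2
            for tag in (["i", "b"] if first_close == 3 else ["b", "i"]):
                toggle(tag, stack, out)
        elif len(tok) == 5:
            # Five quotes with something open: toggle both, innermost first.
            for tag in list(reversed(stack)) + [t for t in "bi" if t not in stack]:
                toggle(tag, stack, out)
        else:
            out.append(tok)  # assumption: other run lengths stay literal
    while stack:
        out.append(f"</{stack.pop()}>")  # close anything left open
    return "".join(out)

print(render("'''''hello''' hi''"))        # <i><b>hello</b> hi</i>
print(render("'''''hi'' hello'''"))        # <b><i>hi</i> hello</b>
print(render("'''hello ''hi''' there''"))  # <b>hello <i>hi</i></b><i> there</i>
```

This reproduces the three renderings quoted above, but it does nothing sensible with unbalanced input such as '''italics'', which is where the touchy part starts.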
I think the easiest method (and the closest to keeping it a single pass)
is to use a DOM. It guarantees valid XML output always, which I believe the
MediaWiki parser doesn't always do.
It also makes it easy to go back and fix up the DOM tree if the parser has made
an initial wrong choice. For example:
'''italics''
It might start out as <b>italics</b>, but on seeing '' it can be corrected to
'<i>italics</i>.
Jared
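As a rough illustration of the fix-up Jared describes, the following Python sketch uses the standard library's xml.etree.ElementTree as a stand-in for a DOM. The wrapping <p> element and the helper name demote_bold_to_italic are assumptions made for the example, not anything from the MediaWiki parser. The parser first guesses <b>italics</b>, then, on meeting the closing '', retags the node and puts the surplus apostrophe back as literal text in front of it.

```python
import xml.etree.ElementTree as ET

def demote_bold_to_italic(parent, node):
    """Correct an initial <b> guess to <i>, restoring one literal apostrophe."""
    node.tag = "i"
    index = list(parent).index(node)
    if index == 0:
        # Text before the first child lives in the parent's .text
        parent.text = (parent.text or "") + "'"
    else:
        # Text before later children lives in the previous sibling's .tail
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + "'"

def demo():
    root = ET.Element("p")            # hypothetical container element
    guess = ET.SubElement(root, "b")  # initial guess after reading '''
    guess.text = "italics"
    before = ET.tostring(root, encoding="unicode")
    demote_bold_to_italic(root, guess)  # the closing '' was seen
    after = ET.tostring(root, encoding="unicode")
    return before, after

print(demo())  # ('<p><b>italics</b></p>', "<p>'<i>italics</i></p>")
```

Because the output is serialised from a tree, it stays well-formed whatever the fix-up does, which is the property being argued for; deciding when a fix-up is needed is still up to the parser.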
Keeping an abstract tree as an intermediate representation helps with, but
does not fix, this problem. Dealing with things like '''italics'' is
non-trivial in any case: if we're going to retain this behavior, no
context-free grammar (at least with fixed lookahead) can possibly suffice.
Whatever ends up handling this will have to be at a separate stage
from the original parsing. What remains is a question of how many extra
stages we will need.
- Eric
