What I'm suggesting is simply storing it in the tree as a toggle -
neither "bold on" nor "bold off", but just "bold toggle". Then a
secondary stage walks the tree and matches them. And obviously in the
walking it would only walk within a paragraph block.
But that wouldn't be a tree. There is no way of storing toggles in a
tree, at least not conceptually. You would end up with something like
this:
             wikitext
    ____________|____________
    |           |           |
    B          text         B
Where "B" means a bold toggle, and "text" is arbitrary text. Things at
the same level of a tree shouldn't depend on each other; that's how
trees work (and it's why you can use CSS to move HTML div tags anywhere
you like on the screen, regardless of the order they appear in the
source). Your method would probably work, but it's just as much of a mess
as my idea. And they can't be stored as bold toggles and italic
toggles; they'll have to be stored as "x apostrophes" in order for
more complicated combinations to work. Your final walk of the tree is
going to end up just as complicated as my first pass through the
wikitext. (It's easy to exclude the few places where bold and italics
aren't parsed: it's just pre and nowiki as far as I know, so the code
just needs to be exploded in the right places, in the same way the
current parser works.)
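
To make the "x apostrophes" point concrete, here is a rough Python sketch of the two-stage approach: the first stage stores each run of apostrophes as a count, and the second stage walks one paragraph and flips bold/italic state. The function names and token shapes are made up for illustration, and the rules are deliberately simplified assumptions (2 = italic, 3 = bold, 5 = both, any other run length kept as literal text); the real parser's handling of awkward combinations is more involved than this.

```python
import re

def tokenize(paragraph):
    """Stage 1: split one paragraph into ("text", str) and ("quotes", n) tokens.

    Runs of two or more apostrophes are stored only as a count, not as
    "bold" or "italic"; single apostrophes (as in "it's") stay in the text.
    """
    tokens = []
    # With a capturing group, re.split keeps the separators at odd indices.
    for i, part in enumerate(re.split(r"('{2,})", paragraph)):
        if i % 2 == 1:
            tokens.append(("quotes", len(part)))
        elif part:
            tokens.append(("text", part))
    return tokens

def resolve(tokens):
    """Stage 2: walk the tokens of one paragraph and match up the toggles.

    Returns a list of (text, bold, italic) spans. State starts off at the
    beginning of every paragraph, so an unmatched toggle cannot leak out.
    """
    spans = []
    bold = italic = False
    for kind, value in tokens:
        if kind == "text":
            spans.append((value, bold, italic))
        elif value == 2:          # simplified assumption: italic toggle
            italic = not italic
        elif value == 3:          # simplified assumption: bold toggle
            bold = not bold
        elif value == 5:          # simplified assumption: both toggles
            bold, italic = not bold, not italic
        else:                     # other run lengths: keep as literal text
            spans.append(("'" * value, bold, italic))
    return spans
```

Calling resolve(tokenize(p)) separately for each paragraph p keeps the matching local to that paragraph. Returning flat spans rather than nested tags sidesteps the question of how overlapping bold and italic should nest, which is where most of the real mess lives.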
In summary, the syntax is a complete mess, so both our solutions are
complete messes. I'm really not sure which is better, but I don't
think there's much in it. My idea does allow for saving the tidied
version if people want (I'd prefer it to be an option, rather than
happening automatically as someone else suggested), which would be a
nice feature, but far from a vital one. It also allows for tidying
more than just bold and italics if we find anything else that needs
similar treatment (lists, perhaps). Does your idea have any similar
added benefits?