Here's one that uses lex plus a single pass over the tokens, and can generate valid XHTML. It's a proof-of-concept program, not a drop-in replacement for the current parser.
=== How it works ===
Wiki syntax is line-based and uses a state-transition model: a token changes the state from anything to X. HTML is free-form, and its states can nest.
This parser maintains a stack of "inline" elements. Every time it finds </X>, it checks whether <X> is on the stack; if it is, it pops and closes every element down to and including <X>, otherwise it prints a raw </X>.
If it finds <X>, it checks whether it conflicts with something on the stack, and acts accordingly (in the example below, a <b> inside an open <b> or <strong> is printed raw).
When the "paragraph state" has to change, it closes all open inline tags.
It doesn't preserve whitespace unless necessary (<pre> and wiki pre, for now also <nowiki>).
Example: <<END
Ala <b>ma kota
Ala <b>ma <b>kota
Ala </b>ma kota
Ala <strong> <b>ma kota
Ala <i> <b>ma kota
END
Output (with \ns inserted): <<END
<p>Ala <b>ma kota </b></p>
<p>Ala <b>ma <b>kota </b></p>
<p>Ala </b>ma kota </p>
<p>Ala <strong> <b>ma kota </strong></p>
<p>Ala <i> <b>ma kota </b></i></p>
END
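The inline-stack rules described above can be sketched in a few lines of Python. This is only an illustration, not the lex code itself: the names `render_paragraph` and `CONFLICTS` are made up, the conflict table is guessed from the example output, tags are assumed to be lowercase and attribute-free, and raw tags are emitted unescaped just as the listing shows them.

```python
import re

# Assumed conflict table (inferred from the example output, not taken from
# the real parser): opening KEY is refused while any tag in VALUE is open.
CONFLICTS = {
    "b": {"b", "strong"},
    "strong": {"b", "strong"},
    "i": {"i", "em"},
    "em": {"i", "em"},
}

TAG = re.compile(r"<(/?)([a-z]+)>")


def render_paragraph(line):
    """Render one line as a paragraph, balancing inline tags with a stack."""
    stack = []          # open inline elements, innermost last
    out = ["<p>"]
    pos = 0
    for m in TAG.finditer(line):
        out.append(line[pos:m.start()])   # text before the tag
        pos = m.end()
        closing, name = m.group(1) == "/", m.group(2)
        if closing:
            if name in stack:
                # Pop and close everything down to and including <name>.
                while True:
                    top = stack.pop()
                    out.append(f"</{top}>")
                    if top == name:
                        break
            else:
                out.append(m.group(0))    # unmatched </X>: print it raw
        elif CONFLICTS.get(name, {name}) & set(stack):
            out.append(m.group(0))        # conflicting <X>: print it raw
        else:
            stack.append(name)
            out.append(m.group(0))
    out.append(line[pos:])
    # The paragraph state changes here, so close all open inline tags.
    while stack:
        out.append(f"</{stack.pop()}>")
    out.append("</p>")
    return "".join(out)


print(render_paragraph("Ala <strong> <b>ma kota"))
# <p>Ala <strong> <b>ma kota</strong></p>
```

Apart from the trailing space that the real output keeps from the newline, the five example lines come out the same way under these assumptions.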
Because it has to support both HTML and wiki paragraph control, the code is quite ugly.
Example: <<END <ul> <li> Ala </li> <li> Ma <li> Kota <ul> i <li> Psa </ul> END
Output: <<END <ul> <li> Ala </li> <li> Ma </li><li> Kota </li><ul> <li>i </li><li> Psa </li></ul> </ul> END
As you can see above, the unclosed <li> elements were closed automatically, and an <li> was opened automatically in the nested list.
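The list repair shown above can be approximated with one flag per open list. The following Python sketch is a guess at the rules, reconstructed from that single example rather than from the parser source; `fix_lists` and `TOKEN` are hypothetical names, only <ul> and <li> are handled, and whitespace is ignored by working on a token list.

```python
import re

# Tags we care about, or a run of non-space text without "<".
TOKEN = re.compile(r"</?(?:ul|li)>|[^<\s]+")


def fix_lists(source):
    """Return the token list with <li> elements opened and closed as needed."""
    lists = []   # one flag per open <ul>: is an <li> currently open in it?
    out = []

    def close_item():
        # Close the <li> of the innermost list, if one is open.
        if lists and lists[-1]:
            out.append("</li>")
            lists[-1] = False

    for tok in TOKEN.findall(source):
        if tok == "<ul>":
            close_item()              # a nested list ends the current item
            lists.append(False)
            out.append(tok)
        elif tok == "</ul>" and lists:
            close_item()
            lists.pop()
            out.append(tok)
        elif tok == "<li>" and lists:
            close_item()              # a new item implicitly ends the old one
            lists[-1] = True
            out.append(tok)
        elif tok == "</li>":
            close_item()              # stray </li> is dropped in this sketch
        else:
            # Text (or a stray tag): inside a list it needs an open <li>.
            if lists and not lists[-1]:
                out.append("<li>")
                lists[-1] = True
            out.append(tok)
    # End of input: close whatever is still open.
    while lists:
        close_item()
        lists.pop()
        out.append("</ul>")
    return out


print(" ".join(fix_lists("<ul> <li> Ma <li> Kota <ul> i </ul>")))
# <ul> <li> Ma </li> <li> Kota </li> <ul> <li> i </li> </ul> </ul>
```

Fed the example input above, it yields the same sequence of tags and words as the output shown, whitespace aside.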
=== More examples of magic ===
Example: <<END
=== Foo ===
Bar
END
Output: <<END <h3> Foo </h3><p>Bar </p> END
An unterminated heading is closed automatically as well:
Example: <<END
=== Foo
Bar
END
Output: <<END <h3> Foo </h3><p>Bar </p> END
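Both heading examples can be reproduced by making the closing run of = signs optional. This is a small Python sketch with an assumed regular expression and a hypothetical `render_line` helper, not the actual lex rules; it treats every non-heading line as its own paragraph and copies the spacing of the output above.

```python
import re

# Leading run of 1-6 "=", the title, then an optional closing run of "=".
HEADING = re.compile(r"^(={1,6})(.*?)=*\s*$")


def render_line(line):
    """Render one wiki line as a heading or as a paragraph."""
    m = HEADING.match(line)
    if m:
        level = len(m.group(1))
        title = m.group(2).strip()
        # The heading is closed whether or not the source line closed it.
        return f"<h{level}> {title} </h{level}>"
    return f"<p>{line} </p>"


for text in ("=== Foo ===\nBar", "=== Foo\nBar"):
    print("".join(render_line(l) for l in text.splitlines()))
# <h3> Foo </h3><p>Bar </p>
# <h3> Foo </h3><p>Bar </p>
```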
=== '''-magic ===
It reopens a quote element if necessary, so that the output stays properly nested.
Example (Quotes.txt from the test suite): <<END
Wikipedia quoting tests:
(1) normal '''bold''' normal
(2) normal ''italic'' normal
(3) normal '''''bold italic''''' normal
(4) normal '''bold ''bold italic'' bold''' normal
(5) normal ''italic '''bold italic''' italic'' normal
(6) normal '''''bold italic'' bold''' normal
(7) normal '''''bold italic''' italic'' normal
(8) normal ''italic '''bold italic''''' normal
(9) normal '''bold ''bold italic''''' normal
(10) normal '''bold's''' normal
(11) normal ''italic's'' normal
(12) normal ''italic's '''bold's italic''' italic's'' normal
(13) normal '''''bold's italic'' bold's''' normal
(14) normal ''italic''' normal
(15) normal ''''bold''' normal
(16) normal ''italic'' normal ''italic'' normal
(17) normal ''italic'' normal '''bold''' normal
(18) normal '''bold''' normal '''bold''' normal
(19) normal '''bold''' normal ''italic'' normal
END
Output (with \ns inserted): <<END
<p>Wikipedia quoting tests: </p>
<p>(1) normal <b>bold</b> normal </p>
<p>(2) normal <i>italic</i> normal </p>
<p>(3) normal <b><i>bold italic</i></b> normal </p>
<p>(4) normal <b>bold <i>bold italic</i> bold</b> normal </p>
<p>(5) normal <i>italic <b>bold italic</b> italic</i> normal </p>
<p>(6) normal <b><i>bold italic</i> bold</b> normal </p>
<p>(7) normal <b><i>bold italic</i></b><i> italic</i> normal </p>
<p>(8) normal <i>italic <b>bold italic</b></i> normal </p>
<p>(9) normal <b>bold <i>bold italic</i></b> normal </p>
<p>(10) normal <b>bold's</b> normal </p>
<p>(11) normal <i>italic's</i> normal </p>
<p>(12) normal <i>italic's <b>bold's italic</b> italic's</i> normal </p>
<p>(13) normal <b><i>bold's italic</i> bold's</b> normal </p>
<p>(14) normal <i>italic<b> normal </b></i></p>
<p>(15) normal <b>'bold</b> normal </p>
<p>(16) normal <i>italic</i> normal <i>italic</i> normal </p>
<p>(17) normal <i>italic</i> normal <b>bold</b> normal </p>
<p>(18) normal <b>bold</b> normal <b>bold</b> normal </p>
<p>(19) normal <b>bold</b> normal <i>italic</i> normal </p>
END
Case (7) is not optimal (<i><b>bold italic</b> italic</i> would be shorter) but still 100% correct. Case (14) is interpreted differently.
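The quote handling in the test above can be modelled as toggling <i> and <b> on a stack and reopening whatever had to be closed on the way down. The Python sketch below is an approximation under assumptions of my own, not the lex implementation: `render_quotes` is a hypothetical name, a run of two apostrophes toggles italic, three toggles bold, five toggles both, a single apostrophe is literal, and any surplus apostrophes (as in case 15) are emitted after the tags.

```python
import re

QUOTES = re.compile(r"'+")


def render_quotes(line):
    """Render one line as a paragraph, turning '' and ''' into <i> and <b>."""
    stack, out = [], ["<p>"]

    def toggle(tag):
        if tag not in stack:
            stack.append(tag)
            out.append(f"<{tag}>")
            return
        # Close down to `tag`, then reopen the elements closed on the way.
        reopened = []
        while True:
            top = stack.pop()
            out.append(f"</{top}>")
            if top == tag:
                break
            reopened.append(top)
        for t in reversed(reopened):
            stack.append(t)
            out.append(f"<{t}>")

    pos = 0
    for m in QUOTES.finditer(line):
        out.append(line[pos:m.start()])
        pos = m.end()
        n = len(m.group())
        if n == 1:
            out.append("'")            # plain apostrophe, as in "bold's"
            continue
        extra = 0
        if n == 4:
            extra, n = 1, 3            # bold plus one literal apostrophe
        elif n > 5:
            extra, n = n - 5, 5
        if n == 2:
            tags = ["i"]
        elif n == 3:
            tags = ["b"]
        else:
            # Five quotes: close open elements innermost first, then open
            # whichever of <b>, <i> is not open yet (bold before italic).
            tags = stack[::-1] + [t for t in ("b", "i") if t not in stack]
        for t in tags:
            toggle(t)
        out.append("'" * extra)
    out.append(line[pos:] + " ")       # the newline becomes a single space
    while stack:                       # end of paragraph: close everything
        out.append(f"</{stack.pop()}>")
    out.append("</p>")
    return "".join(out)


print(render_quotes("(7) normal '''''bold italic''' italic'' normal"))
# <p>(7) normal <b><i>bold italic</i></b><i> italic</i> normal </p>
```

Case (7) shows the reopening step: closing the bold forces the inner italic closed first, and the italic is then reopened for the rest of the text.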