On 21/02/2008, DanTMan <dan_the_man(a)telus.net> wrote:
There's no way to convert WikiText directly into XML in the way you
imply without a proper grammar; and if such a grammar were created, it
would already defeat the point of converting over. Nor can you use the
converted XHTML to come up with a good way of representing the input
data.
Next in line is the notion of ditching WikiText without converting it.
Honestly, is that even sane? There are over ... articles on Wikipedia
alone, not to mention other large sites like Wikia. Ditching a language
used that widely is like removing every definition of the Chinese
language and telling its speakers that Chinese can no longer be used.
You get riots...
Now if you say that those sites can stick with WikiText, then there's
little point in creating a new language at all: a new parser language
matters little if Wikipedia is not using it. That's like telling the
Wikimedia Foundation to cut off Wikipedia's upgrades because it's
getting too old.
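The point about converting without a proper grammar can be made
concrete. A hypothetical sketch (not any existing converter): a naive
regex pass handles flat markup, but has no defined behaviour for
nesting or unbalanced quotes, which is exactly what a grammar would
pin down.

```python
import re

# Hypothetical, naive WikiText-to-XML pass, for illustration only.
# The element names (<bold>, <italic>, <link>) are made up.  It works
# on flat, well-formed markup, but with no grammar behind it there is
# no principled answer for nested or unbalanced quote runs -- the
# cases where reproducing MediaWiki's actual output gets hard.
def naive_wikitext_to_xml(text):
    text = re.sub(r"'''(.+?)'''", r"<bold>\1</bold>", text)
    text = re.sub(r"''(.+?)''", r"<italic>\1</italic>", text)
    text = re.sub(r"\[\[(.+?)\]\]", r"<link>\1</link>", text)
    return text

print(naive_wikitext_to_xml("'''bold''' and [[Main Page]]"))
# → <bold>bold</bold> and <link>Main Page</link>

# Unbalanced input: the regexes silently produce something wrong,
# and there is no spec to say what "right" would even be.
print(naive_wikitext_to_xml("'''''both?'' odd"))
```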
The actual state of things is:
1. Wikitext is literally defined as "whatever the present software
does." This is bad.
2. There have been several attempts to write a grammar. The latest one
is looking promising for completeness (though ANTLR is slow and
buggy).
3. A replacement grammar can be used for third-party implementations
(WYSIWYG, XML, etc) with perfect fidelity.
4. Any replacement grammar will only replace the present
implementation if it (a) covers present behaviour sufficiently and
(b) is fast enough.
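Point 3 is the payoff of writing the grammar down at all. A toy
illustration (not the ANTLR grammar discussed here, and with invented
node names): a tiny WikiText subset parsed against an explicit
grammar, whose parse tree can then feed any backend, XHTML, XML
export, or a WYSIWYG editor.

```python
# Toy grammar for a WikiText subset, in EBNF-ish comments:
#
#   text  := ( bold | plain )*
#   bold  := "'''" plain "'''"
#   plain := any run of characters not containing "'''"
#
# Crucially, even the unbalanced case has a *defined* fallback,
# instead of being "whatever the present software does".

def parse(src):
    """Parse the toy grammar into a list of (kind, text) nodes."""
    nodes, i = [], 0
    while i < len(src):
        if src.startswith("'''", i):
            end = src.find("'''", i + 3)
            if end == -1:                 # unbalanced: defined fallback
                nodes.append(("plain", src[i:]))
                break
            nodes.append(("bold", src[i + 3:end]))
            i = end + 3
        else:
            nxt = src.find("'''", i)
            stop = nxt if nxt != -1 else len(src)
            nodes.append(("plain", src[i:stop]))
            i = stop
    return nodes

def to_xml(nodes):
    # One of several possible backends over the same parse tree.
    return "".join(f"<b>{t}</b>" if k == "bold" else t for k, t in nodes)

print(to_xml(parse("plain '''bold''' more")))   # → plain <b>bold</b> more
```

The same `parse` output could just as well drive a WYSIWYG widget;
that separation of parse tree from rendering is what "third-party
implementations with perfect fidelity" depends on.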
Minh Lê Ngọc's reasoning is quite correct, but we need a replacement
that (a) is sufficiently complete (whatever "sufficient" is) and (b)
works *better* than the present one. Then we can keep wikitext in all
its kludgy glory *and* do XML versions and good WYSIWYG editors that
aren't just WYSIAYG, and so forth.
Current status of ANTLR-based parser: somewhere between promising
vapourware and unreleased early alpha.
- d.