On 21/02/2008, DanTMan <dan_the_man@telus.net> wrote:
> There's no way to convert WikiText directly into XML in the way you imply without a proper grammar, and if such a grammar were created, it would already defeat the point of converting. And you can't use the converted XHTML to come up with a good way of representing the input data.
>
> Next in line is the notion of ditching WikiText without converting it. Honestly, is that even sane? There are over ... articles on Wikipedia alone, not to mention other large sites like Wikia. Ditching a language used that widely is like removing all definition of the Chinese language and telling its speakers that Chinese can no longer be used: you get riots.
>
> Now, if you say those sites can stick with WikiText, then there's little point in creating a new language, because there's little point in a new parser language if Wikipedia is not using it. That's like telling the Wikimedia Foundation to cut off Wikipedia's upgrades because it's getting too old.
The actual state of things is:
1. Wikitext is literally defined as "whatever the present software does." This is bad.

2. There have been several attempts to write a grammar. The latest one is looking promising for completeness (though ANTLR is slow and buggy).

3. A replacement grammar can be used for third-party implementations (WYSIWYG, XML, etc.) with perfect fidelity.

4. Any replacement grammar will only replace the present implementation if it (a) covers present behaviour sufficiently and (b) is fast enough.
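To make point 3 concrete: once you have a real grammar, a third party can parse wikitext into XML and regenerate the original markup byte-for-byte. Here's a deliberately tiny sketch in Python of that round-trip property, covering only the '''bold''' and ''italic'' subset with regexes standing in for a real grammar. It is an illustration of the fidelity requirement, not MediaWiki's actual parser, and real wikitext is vastly messier than this.

```python
import re
from xml.sax.saxutils import escape, unescape

def wikitext_to_xml(text: str) -> str:
    """Convert a toy wikitext subset to XML.

    Bold (''') must be matched before italic ('') because the
    italic pattern would otherwise eat the bold quote runs.
    """
    out = escape(text)  # protect &, <, > for the XML side
    out = re.sub(r"'''(.+?)'''", r"<b>\1</b>", out)
    out = re.sub(r"''(.+?)''", r"<i>\1</i>", out)
    return "<wikitext>" + out + "</wikitext>"

def xml_to_wikitext(xml: str) -> str:
    """Invert the conversion, recovering the original markup exactly."""
    body = xml[len("<wikitext>"):-len("</wikitext>")]
    body = body.replace("<b>", "'''").replace("</b>", "'''")
    body = body.replace("<i>", "''").replace("</i>", "''")
    return unescape(body)

source = "''Wikipedia'' has '''many''' articles."
xml = wikitext_to_xml(source)
assert xml_to_wikitext(xml) == source  # perfect fidelity on this subset
```

The point of requirement 4 is that the real thing has to do this for *all* of the present software's behaviour, including the edge cases nobody has written down, and do it fast.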
Minh Lê Ngọc's reasoning is quite correct, but we need a replacement that is (a) sufficiently complete (whatever "sufficient" turns out to mean) and (b) works *better* than the present one. Then we can keep wikitext in all its kludgy glory *and* do XML versions and good WYSIWYG editors that aren't just WYSIAYG, and so forth.
Current status of ANTLR-based parser: somewhere between promising vapourware and unreleased early alpha.
- d.