Steve Bennett wrote:
I was just reading this: http://www.riehle.org/wp-content/uploads/2008/01/a5-junghans.pdf
And wondering if there is any desire (let alone plans) to move to a system of storing a different internal representation (eg, XML) and separating the display logic out. One obvious benefit would be making it easier to produce different outputs without having to write multiple parsers. Are there others? Would Wikipedia benefit from supporting an interchange format?
It's entirely impossible as stated, due to the existence of the preprocessing step. Changing a template or variable may radically change the HTML document tree, generating changes distant from the template invocation.
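To make the point concrete, here is a toy sketch in Python, not MediaWiki code. The `expand` and `to_html` functions, the `intro` template name, and the two-rule markup are all assumptions made up for illustration. It shows text that sits outside the template invocation changing its place in the HTML tree when only the template body changes.

```python
import re

def expand(wikitext, templates):
    """Toy preprocessing step: replace each {{name}} with the template text.

    Hypothetical helper; no arguments, no nesting, nothing like the real parser.
    """
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: templates.get(m.group(1), m.group(0)),
        wikitext,
    )

def to_html(wikitext):
    """Toy renderer: lines starting with '*' are list items, others paragraphs."""
    out = []
    in_list = False
    for line in wikitext.split("\n"):
        if line.startswith("*"):
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append("<li>" + line[1:].strip() + "</li>")
        else:
            if in_list:
                out.append("</ul>")
                in_list = False
            if line.strip():
                out.append("<p>" + line.strip() + "</p>")
    if in_list:
        out.append("</ul>")
    return "".join(out)

# The page source is identical in both cases; only the template differs.
page = "{{intro}} first\n* second"

html_a = to_html(expand(page, {"intro": "Some text,"}))
html_b = to_html(expand(page, {"intro": "* zero\n*"}))

print(html_a)
print(html_b)
```

In the first case the word "first" ends up in a paragraph; in the second it becomes a list item, even though the page's own text never changed. A stored tree for the page would have to be rebuilt whenever the template is edited.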
The new preprocessor has an intermediate XML representation for pages before template inclusion, and it would be possible to store it. There's a RECOVER_ORIG mode that allows the original wikitext to be recovered from the XML. The problems with using it as a storage format are:
* It's useless as an interchange format since it still depends on thousands of lines of MediaWiki code to generate HTML from it.
* The XML format, and the details of the transformation, are subject to change.
* Transformation from wikitext to preprocessed XML is relatively fast, and will hopefully get faster with further development, so it can be generated on demand for any application that needs it.
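As a rough illustration of an XML intermediate form from which the original wikitext can be recovered, here is a minimal Python sketch. It is not the MediaWiki preprocessor: the `to_xml` and `recover_orig` functions are hypothetical, the element names (`root`, `template`, `title`, `part`) are assumptions chosen for the example, and only flat, non-nested `{{...}}` invocations are handled.

```python
import re
import xml.etree.ElementTree as ET

def to_xml(wikitext):
    """Build a toy pre-expansion tree: plain text plus <template> elements."""
    root = ET.Element("root")
    last = None  # most recently added <template>, if any
    pieces = re.split(r"(\{\{[^{}]*\}\})", wikitext)
    for i, piece in enumerate(pieces):
        if i % 2 == 1:
            # Odd indices are the captured {{...}} invocations.
            title, *args = piece[2:-2].split("|")
            last = ET.SubElement(root, "template")
            ET.SubElement(last, "title").text = title
            for arg in args:
                ET.SubElement(last, "part").text = arg
        elif last is None:
            root.text = piece   # text before the first template
        else:
            last.tail = piece   # text following a template
    return root

def recover_orig(root):
    """Rebuild the original wikitext from the toy tree."""
    out = root.text or ""
    for tmpl in root:
        fields = [child.text or "" for child in tmpl]
        out += "{{" + "|".join(fields) + "}}" + (tmpl.tail or "")
    return out

source = "Hello {{foo|a|b=c}} world {{bar}}!"
tree = to_xml(source)
print(ET.tostring(tree, encoding="unicode"))
print(recover_orig(tree) == source)
```

Because the tree is cheap to build from the wikitext and the wikitext can be rebuilt from the tree, keeping wikitext as the stored form loses nothing in this sketch; an application that wants the tree can generate it when needed.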
If you just want an open interchange format for fully preprocessed, template-free wikitext, then MediaWiki already has one. It's called XHTML.
-- Tim Starling