-----Original Message-----
From: wikitech-l-bounces@lists.wikimedia.org [mailto:wikitech-l-bounces@lists.wikimedia.org] On Behalf Of Aryeh Gregor
Sent: 25 September 2009 23:01
To: Wikimedia developers
Subject: Re: [Wikitech-l] JS2 design (was Re: Working towards branching MediaWiki 1.16)
On Fri, Sep 25, 2009 at 3:46 PM, Steve Sanbeg ssanbeg@ask.com wrote:
I'm not sure that's entirely accurate. XSLT works on DOM trees, so malformed XML shouldn't really apply. Of course, the standard command line processors create this tree with a standard parser, usually an XML parser. But in PHP, creating the DOM with a parser and transforming it with XSLT are handled separately.
Interesting. In that case, theoretically, you could use an HTML5 parser, which is guaranteed to *always* produce a DOM even on random garbage input (much like wikitext!). Now, who's up for writing an HTML5 parser in PHP whose performance is acceptable? I thought not. :P
libxml2, and therefore PHP, has a tag-soup HTML 4 parser.
DOMDocument::loadHTML()
http://xmlsoft.org/html/libxml-HTMLparser.html
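For illustration, here is a rough sketch of how the two steps could fit together: build the DOM from tag soup with DOMDocument::loadHTML(), then hand that DOM to XSLTProcessor. It assumes the dom and xsl extensions are enabled; the sample markup and the stylesheet are made up for the example, and the exact tree libxml2 builds from the broken input may differ from what you expect.

```php
<?php
// Hypothetical tag-soup input: unclosed <p> and <b>, no <html>/<body> wrapper.
$soup = '<p>Hello <b>world<p>Second paragraph';

// Step 1: build a DOM with libxml2's lenient HTML parser.
$doc = new DOMDocument();
// Collect recoverable markup errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$doc->loadHTML($soup);
libxml_clear_errors();

// Step 2: load an (illustrative) stylesheet that prints the text of every <p>.
$xsl = new DOMDocument();
$xsl->loadXML(<<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="//p">
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL
);

// Step 3: transform the already-built DOM; the XSLT step never sees the raw markup.
$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);
$result = $proc->transformToXml($doc);

echo $result;
```

The point is only that the XSLT step operates on whatever DOM it is given, so the well-formedness of the original markup is the parser's problem, not the transform's.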
Jared