-----Original Message-----
From: wikitech-l-bounces(a)lists.wikimedia.org
[mailto:wikitech-l-bounces@lists.wikimedia.org] On Behalf Of
Aryeh Gregor
Sent: 25 September 2009 23:01
To: Wikimedia developers
Subject: Re: [Wikitech-l] JS2 design (was Re: Working towards
branching MediaWiki 1.16)
On Fri, Sep 25, 2009 at 3:46 PM, Steve Sanbeg <ssanbeg(a)ask.com> wrote:
> I'm not sure that's entirely accurate. XSLT works on DOM trees, so
> malformed XML shouldn't really apply. Of course, the standard command
> line processors create this tree with a standard parser, usually an
> XML parser. But in PHP, creating the DOM with a parser and
> transforming it with XSLT are handled separately.

Interesting. In that case, theoretically, you could use an HTML5
parser, which is guaranteed to *always* produce a DOM even on random
garbage input (much like wikitext!). Now, who's up for writing an
HTML5 parser in PHP whose performance is acceptable? I thought not.
:P

libxml2, and therefore PHP, has a tag-soup HTML 4 parser:
DOMDocument::loadHTML()
http://xmlsoft.org/html/libxml-HTMLparser.html
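
For illustration, here is a minimal sketch of how the two steps could be
kept separate in PHP: build the DOM with the tag-soup parser, then hand
that DOM to XSLT. The input string and the stylesheet are made up for the
example, and it assumes the dom and xsl extensions are both enabled.

```php
<?php
// Hypothetical tag-soup input: no <html>/<body>, and the first <p> is never closed.
$soup = '<p>Hello<p>Second paragraph';

// Step 1: build a DOM with libxml2's HTML parser rather than an XML parser.
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($soup);
libxml_clear_errors();

// Step 2: transform that DOM with XSLT. This stylesheet is an assumption for
// the example; it prints the text of every <p> element, one per line.
$xsl = new DOMDocument();
$xsl->loadXML(<<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="//p">
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL
);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);
$result = $proc->transformToXML($doc);   // string output of the transformation
echo $result;
```

The XSLT step never sees the original markup, only the tree that
loadHTML() produced, so well-formedness of the input is the parser's
problem rather than the stylesheet's.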
Jared