On Wed, Jan 5, 2011 at 7:35 PM, Jay Ashworth <jra(a)baylink.com> wrote:
----- Original Message -----
From: "David Gerard" <dgerard(a)gmail.com>
Many, many bright people have dashed their foreheads against the
problem.
Andreas Jonsson thinks he's largely cracked it:
http://davidgerard.co.uk/notes/2010/08/22/staring-into-the-eye-of-cthulhu/
- and even that required custom patches to ANTLR. The result runs in C
and is of comparable speed to PHP.
I suspect it was Steve Bennett's attack run I was remembering.
Did anyone ever pull statistics about exactly how many instances of that
Last Five Percent there really were, as I suspect I suggested at the time?
Cheers,
-- jra
Expanding on "how many instances..?" -
At some point, for the corner cases, the fix is to change the templates and
pages to match a more sane parser's capabilities or a more standard
specification for the markup, rather than make the parser match the
insanity that's already out there.
If we know what we're looking at, we can assign corner cases to an
on-wiki cleanup "hit squad". Who knows how many of the corners we can
outright assassinate that way, but it's worth a go... The less a corner
case is used and the harder it is to code for, the easier it is for us
to justify taking it out.
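
As a rough illustration of that triage, here is a minimal Python sketch
that counts how many pages contain each corner case and ranks removal
candidates. Everything in it is an assumption made for the example: the
two regular expressions, the names CORNER_CASES, count_corner_cases and
removal_candidates, and the difficulty scores are all hypothetical and
are not taken from any real parser or from actual wiki statistics.

```python
import re
from collections import Counter

# Hypothetical corner-case patterns, for illustration only.
CORNER_CASES = {
    # A list item line that also opens a table.
    "table_in_list": re.compile(r"^[*#]+.*\{\|", re.MULTILINE),
    # A line that opens a heading with "==" but does not end with "=".
    "heading_unbalanced": re.compile(r"^==[^=\n].*[^=\n]$", re.MULTILINE),
}


def count_corner_cases(pages):
    """Map each pattern name to the number of pages that contain it.

    `pages` is a dict of page title -> wikitext.
    """
    counts = Counter()
    for text in pages.values():
        for name, pattern in CORNER_CASES.items():
            if pattern.search(text):
                counts[name] += 1
    return counts


def removal_candidates(counts, difficulty):
    """Order corner cases with the least used first.

    Ties on usage are broken by putting the harder-to-parse case first.
    `difficulty` maps pattern name -> a made-up effort score.
    """
    return sorted(difficulty, key=lambda n: (counts.get(n, 0), -difficulty[n]))
```

Fed a page dump and a set of effort estimates, the first entries of the
ranked list would be the ones to hand to the cleanup squad first.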
--
-george william herbert
george.herbert(a)gmail.com