Tim Starling wrote:
If the only thing missing from JAMWiki was ParserFunctions, that would be
very impressive. ParserFunctions is simple. And indeed, there's a lot of
really impressive code in there, although it's easy to find edge cases
that don't work the same way.
True; it was just one of the first things we ran into with basic
rendering of Wikipedia pages.
For str_repeat("[http://a] ", 1000), it took so long that I gave up
waiting. MediaWiki does either of these things in linear time, on the
order of hundreds of microseconds per loop.
[...]
It's unfortunate that a modern parser generator for a supposedly fast
language like Java can't match hand-optimised PHP for speed. It's not like
we've set a high bar here.
I'm not sure about not having set a high bar... However, we can confirm
the parser generator vs. hand-optimized parser issue. You just showed
that JFlex, the parser generator used by JAMWiki, doesn't scale up
nicely. We found the same for ANTLR, another parser generator for Java,
which also doesn't perform as well as MediaWiki when run against
stripped-down pages (our parser parses Wiki Creole, which at a stripped-down
level is equivalent to MediaWiki syntax) [1]. MediaWiki performed
equally well or better. In general, I think the advantage of a parser
generator is easier maintainability and clarity of the language (you can
view the grammar as a domain-specific language for describing acceptable
syntax), but not performance :-(
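
For anyone who wants to repeat the scaling test quoted above, here is a minimal Python sketch of the idea. It is not MediaWiki or JAMWiki code: render_external_links and time_per_loop are hypothetical names made up for illustration, the scanner handles only bare [http://...] links (no link labels, no HTML escaping), and str_repeat is just a stand-in for the PHP function of the same name. The point is only that a hand-written scanner that never moves backwards through the input should show a roughly constant time per loop as the repeat count grows.

```python
import time


def str_repeat(s, n):
    # Stand-in for PHP's str_repeat(): s repeated n times.
    return s * n


def render_external_links(text):
    # Hypothetical single-pass scanner (an assumption for illustration,
    # not how MediaWiki or JAMWiki actually work).
    # Every find() starts at or after the previous match, so the input
    # is walked once from left to right.
    out = []
    pos = 0
    while True:
        start = text.find("[http://", pos)
        if start == -1:
            break
        end = text.find("]", start)
        if end == -1:
            # No closing bracket later in the text: the rest is literal.
            break
        url = text[start + 1:end]
        out.append(text[pos:start])
        out.append('<a href="{0}">{0}</a>'.format(url))
        pos = end + 1
    out.append(text[pos:])
    return "".join(out)


def time_per_loop(repeats):
    # Seconds spent per repeated "[http://a] " unit.
    text = str_repeat("[http://a] ", repeats)
    t0 = time.perf_counter()
    render_external_links(text)
    return (time.perf_counter() - t0) / repeats


if __name__ == "__main__":
    for n in (1000, 10000, 100000):
        print(n, "%.2f us per loop" % (time_per_loop(n) * 1e6))
```

If the per-loop figure stays roughly flat across the three sizes, the scanner is behaving linearly; a figure that grows with n is the kind of blow-up described for the generated parsers.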
Thanks for your insights!
Dirk
[1] http://www.riehle.org/2008/07/19/a-grammar-for-standardized-wiki-markup/
--
Phone: + 1 (650) 215 3459, Web: http://www.riehle.org