On Fri, May 02, 2003 at 01:02:13PM -0700, David A. Wheeler wrote:
Clearly speeding execution of the PHP scripts would help. One way is to reduce the work they have to do (e.g., caching the HTML). Another is coding the hotspot (e.g., as a loaded C module). But doing it right requires identifying what the hotspot is in the PHP scripts.
I'm actually in the middle of a C project to reduce the wikitext parser to a two-pass parser. It should cut the wikitext handling down to the point where the only things the PHP code will have to do are:
* Handle links / link lookup
* Ignore links in <nowiki> (everything else is done)
* Handle <math> conversion
* ~~~ and ~~~~
* ISBN lookups
Some of that could possibly be moved into the C module in the future; probably everything but the link lookups and link ignoring.
Everything else should be done by the C module underneath. This should be a significant speed-up. Now, I'm only about a third of the way done with the code, but my lexical analyzer is pretty speedy thus far (about 25000 lines/sec). It currently handles:
* ----
* ==
* ===
* ====
* \n\n
* ''
* '''
* ''''' (better than the current code, which has a problem handling ''''')
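To give a rough idea of what a lexer for those tokens might look like, here is a minimal sketch in C. It is not the actual module: the names (TokKind, Token, next_token, lex_all) are made up for illustration, and several rules are assumptions on my part, namely that a run of four apostrophes lexes as bold followed by a literal apostrophe, that a rule needs four or more dashes at the start of a line, and that a run of "=" longer than four is plain text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; the real module may name and split these differently. */
typedef enum {
    TOK_TEXT, TOK_HRULE, TOK_H2, TOK_H3, TOK_H4,
    TOK_PARA, TOK_ITALIC, TOK_BOLD, TOK_BOLD_ITALIC, TOK_END
} TokKind;

typedef struct {
    TokKind kind;
    size_t len;   /* number of input bytes the token covers */
} Token;

/* Count how many times c repeats at the start of s (c must not be NUL). */
static size_t run_length(const char *s, char c)
{
    size_t n = 0;
    while (s[n] == c)
        n++;
    return n;
}

/* Scan a single token starting at s. at_line_start is nonzero when s
 * begins a line, which is the only place "----" counts as a rule here. */
Token next_token(const char *s, int at_line_start)
{
    Token t = { TOK_TEXT, 1 };
    size_t n;

    assert(s != NULL);
    if (*s == '\0') {
        t.kind = TOK_END;
        t.len = 0;
        return t;
    }

    switch (*s) {
    case '\'':
        n = run_length(s, '\'');
        /* Longest match first, so ''''' is one bold+italic token. */
        if (n >= 5)      { t.kind = TOK_BOLD_ITALIC; t.len = 5; }
        else if (n >= 3) { t.kind = TOK_BOLD;        t.len = 3; }
        else if (n == 2) { t.kind = TOK_ITALIC;      t.len = 2; }
        return t;               /* a lone apostrophe stays plain text */
    case '=':
        n = run_length(s, '=');
        t.len = n;
        if (n == 2)      t.kind = TOK_H2;
        else if (n == 3) t.kind = TOK_H3;
        else if (n == 4) t.kind = TOK_H4;
        return t;               /* any other run length stays plain text */
    case '\n':
        n = run_length(s, '\n');
        if (n >= 2) { t.kind = TOK_PARA; t.len = n; }
        return t;               /* a single newline stays plain text */
    case '-':
        n = run_length(s, '-');
        if (at_line_start && n >= 4) {
            t.kind = TOK_HRULE;
            t.len = n;
            return t;
        }
        break;                  /* otherwise fall through to plain text */
    default:
        break;
    }

    /* Plain text runs up to the next character that could start markup. */
    t.len = strcspn(s, "'=\n");
    return t;
}

/* Lex src into out[0..max-1]; returns the token count, excluding TOK_END. */
size_t lex_all(const char *src, Token *out, size_t max)
{
    size_t count = 0;
    int at_line_start = 1;

    while (count < max) {
        Token t = next_token(src, at_line_start);
        if (t.kind == TOK_END)
            break;
        out[count++] = t;
        src += t.len;
        at_line_start = (src[-1] == '\n');
    }
    return count;
}
```

Matching the longest apostrophe run first is what keeps ''''' from being split into ''' plus '' in this sketch; a second pass over the token array would then pair up opening and closing markers.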
I still need to do:
* Lists
* Manual formatting
* <nowiki> conversion