On Fri, Aug 13, 2004 at 09:38:34PM +0200, Ashar Voultoiz wrote:
> Brion Vibber wrote:
>> Magnus Manske wrote:
>>> I therefore suggest a new structure:
>>> - Preprocessor
>>> - Wiki markup to XML
>>> - XML to (X)HTML
>> This doesn't actually solve any of the issues with the current parser, since it merely has it produce a different output format.
>> The main problems are that we have a mess of regexps that stomp on each other all the time.
>> -- brion vibber (brion @ pobox.com)
> Can't we switch back to the tokenizer parser and try to optimize it? The token approach seems much easier to maintain.
Character-by-character string parsing in PHP is slow because the per-character interpreter overhead is too high. Tokenizing would probably have to be done in a C(++) extension.
Another point where the tokenizer was slow was the byte-by-byte composition of the result string. I've been told that appending small strings to an array and joining them at the end is much faster; worth a try.
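As a rough illustration of the two strategies (a hypothetical micro-benchmark sketch, not code from the actual tokenizer; whether the array/implode variant really wins depends on the PHP version):

```php
<?php
// Two ways of composing a result string from many small fragments.
// buildByConcat() appends piece by piece to a growing string;
// buildByImplode() collects the pieces in an array and joins them
// once at the end with implode().

function buildByConcat(array $fragments): string {
    $result = '';
    foreach ($fragments as $f) {
        $result .= $f;   // repeated append to a growing string
    }
    return $result;
}

function buildByImplode(array $fragments): string {
    $parts = [];
    foreach ($fragments as $f) {
        $parts[] = $f;   // cheap array push
    }
    return implode('', $parts);  // single join at the end
}

// Both produce identical output; only the composition strategy differs.
$fragments = array_fill(0, 100000, 'token ');
$a = buildByConcat($fragments);
$b = buildByImplode($fragments);
assert($a === $b);
```

Timing each function with `microtime(true)` around the calls would show which approach is faster on a given PHP build.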
JeLuF