Re: [Wikitech-l] Slow parser

13 Feb 2002

On mer, 2002-02-13 at 07:36, Magnus Manske wrote:
...
  I just ran "ab n=10" for an article (with cache turned off) and deactivated
 some functions to see where the slow parts are.

 Full rendering : 4.99 sec
 removeHTMLtags turned off : 3.319 sec

How much time does parseContents() take?

...
  It seems removeHTMLtags is responsible for 1/3 of the *total* runtime, which
 includes apache, php calling, and a thousand other things that can't be
 avoided.

Well, it can be made much more efficient... As Jan has hinted, explode()
is a killer, and I can take that out.
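
To illustrate the kind of change meant here, a rough sketch of a tag filter
that walks the text in a single regex pass instead of splitting it into
pieces first. It is written in Python rather than PHP, and it is not the
actual removeHTMLtags code: the name remove_html_tags, the ALLOWED_TAGS set
and the pattern are all assumptions made up for the example. It only touches
text that is shaped like a tag; a stray "<" is left alone.

```python
import re

# Hypothetical whitelist; the real list of permitted tags is not given here.
ALLOWED_TAGS = {"b", "i", "table", "tr", "td", "th", "br"}

# Matches an opening or closing tag and captures its name in group 2.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^<>]*)>")


def remove_html_tags(text):
    """Escape every tag that is not whitelisted, in one pass over the text."""

    def fix(match):
        if match.group(2).lower() in ALLOWED_TAGS:
            # Allowed tag: pass it through untouched.
            return match.group(0)
        # Anything else is shown literally instead of being interpreted.
        return match.group(0).replace("<", "&lt;").replace(">", "&gt;")

    return TAG_RE.sub(fix, text)
```

Whether this is actually faster than the current code would have to be
measured the same way as above.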

...
  So, if these HTML tags are *never* used anyway, why can't we replace them
 with &lt; and &gt; just prior to saving an edited article?

I just have two objections:

First, it violates the principle of least surprise; the user doesn't get
the same thing upon a re-edit that they left during the last edit. This
is particularly annoying for people who are putting complicated tables
into articles (cf. [[Beryllium]] and [[Periodic Table]]) -- if they do
one thing wrong, POOF! Half their table <tags> suddenly turn into
&lt;tags&gt; and instead of fixing one tiny mistake, they fix one tiny
mistake AND change a lot of &lt;&gt;s back into <>s.
  Conclusion: bad for users.

Second, enforcing the limited HTML subset is just a part of the wiki
parsing. Doing that on save is fine, but is basically doing half the
parsing job and caching that, then doing the other half when we display
the page. Why stop there, when we could just parse the wiki-specific
code while we're at it and save the final result?
  Conclusion: what exactly is our goal here? To save processing time on
page load? This is most effectively done by caching the completely
parsed version, both HTML and wiki -> HTML.
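
A minimal sketch of that idea, again in Python and purely illustrative: the
source is stored exactly as the user typed it, and only the fully rendered
output is cached and thrown away on the next save. The RenderCache class and
the render callback are hypothetical names, not anything in the current code.

```python
class RenderCache:
    """Keep the raw wikitext untouched; cache only the final rendered page."""

    def __init__(self, render):
        self._render = render  # full parse: HTML filtering plus wiki -> HTML
        self._source = {}      # title -> wikitext exactly as saved
        self._html = {}        # title -> cached rendered output

    def save(self, title, wikitext):
        self._source[title] = wikitext  # nothing is rewritten on save
        self._html.pop(title, None)     # drop the stale rendering, if any

    def edit_text(self, title):
        # A re-edit gets back what was typed, so no surprise for the user.
        return self._source[title]

    def view(self, title):
        if title not in self._html:
            # Parse once per saved revision, not once per page load.
            self._html[title] = self._render(self._source[title])
        return self._html[title]
```

With this shape the parser's cost is paid once after each edit, however slow
it is, and every later page view is a lookup.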

...
  I'll be gone tomorrow until Saturday, and I doubt I can hack it today, so
 it's up to you...

-- brion vibber (brion @ pobox.com)
