On Mon, Nov 9, 2015 at 1:37 PM, Petr Bena benapetr@gmail.com wrote:
Do you really want to say that reading from disk is faster than processing the text using the CPU? I don't know how complex the syntax of MW actually is, but C++ compilers are probably much faster than Parsoid, if that's true. And those are very slow.
What takes so much CPU time in turning wikitext into HTML? Sounds like JS wasn't the best choice here.
More fundamentally, the parsing task involves recursive expansion of templates and image-information queries, and popular Wikipedia articles can involve hundreds of templates and image queries. Caching the result of parsing lets us avoid repeating these nested queries, which are a major contributor to parse time.
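To make the caching point concrete, here is a minimal sketch (in TypeScript, and emphatically not Parsoid's actual code) of why an uncached parse is expensive: every template expansion is a query plus a recursive expansion of whatever that template transcludes, and a cache lets the whole subtree be skipped. All names here (fetchTemplateSource, expandWikitext, findTemplateInvocations) are hypothetical stand-ins.

const templateCache = new Map<string, string>();

// Stand-in for the API/DB query that fetches a template's wikitext.
async function fetchTemplateSource(title: string): Promise<string> {
  return `source of ${title}`;
}

// Find {{...}} invocations in a chunk of wikitext (grossly simplified).
function findTemplateInvocations(wikitext: string): string[] {
  return [...wikitext.matchAll(/\{\{([^{}]+)\}\}/g)].map(m => m[1]);
}

async function expandTemplate(title: string): Promise<string> {
  const cached = templateCache.get(title);
  if (cached !== undefined) return cached; // cache hit: no nested queries at all

  const source = await fetchTemplateSource(title); // one query...
  const expanded = await expandWikitext(source);   // ...plus recursive expansion
  templateCache.set(title, expanded);
  return expanded;
}

async function expandWikitext(wikitext: string): Promise<string> {
  let out = wikitext;
  for (const title of findTemplateInvocations(wikitext)) {
    out = out.replace(`{{${title}}}`, await expandTemplate(title));
  }
  return out;
}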
One of the benefits of the Parsoid DOM representation[*] is that it will allow in-place update of templates and image information, so that updating pages after a change can be done by simple substitution, *without* repeating the actual "parse wikitext" step.
 --scott

[*] This actually requires some tweaks to the wikitext of some popular templates; https://phabricator.wikimedia.org/T114445 is a decent summary of the work (although be sure to read to the end of the comments; there's significant stuff there which I haven't yet edited into the top-level task description).
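To sketch what that in-place substitution could look like: Parsoid's HTML marks template-generated content with typeof="mw:Transclusion", groups the generated siblings with a shared about id, and records the originating template in data-mw, so a page can in principle be patched by swapping just those nodes. The following is a rough TypeScript illustration of the idea, not working Parsoid code; the renderTemplate callback is a hypothetical re-renderer for a single transclusion.

function updateTransclusions(
  doc: Document,
  changedTemplate: string,
  renderTemplate: (dataMw: string) => string  // hypothetical re-renderer
): void {
  const nodes = doc.querySelectorAll('[typeof~="mw:Transclusion"]');
  for (const first of Array.from(nodes)) {
    const dataMw = first.getAttribute("data-mw") ?? "";
    if (!dataMw.includes(changedTemplate)) continue; // generated by some other template

    // Drop the other sibling nodes produced by the same transclusion.
    const about = first.getAttribute("about");
    if (about) {
      doc.querySelectorAll(`[about="${about}"]`).forEach(n => {
        if (n !== first) n.remove();
      });
    }
    // Substitute the freshly rendered output for the stale one.
    first.outerHTML = renderTemplate(dataMw);
  }
}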