On Mon, Jan 31, 2011 at 4:55 PM, Trevor Parscal tparscal@wikimedia.org wrote:
Adding yet another discrete parsing step is the reverse of the direction a lot of people hoping to clean up wikitext are heading in.
What system do you propose that would retain the performance benefits of this suggestion, and be deployable in the near future? A simple postprocessor would be very useful -- you could cut down greatly on parser cache fragmentation, if not eliminate it entirely. E.g., as Daniel notes, you could leave a marker in the parser output where section links should go and have the postprocessor fill it in depending on user language, so we don't fragment the cache by language. More importantly, a postprocessor would allow us to add new features that are currently unacceptable due to cache fragmentation.
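To make that concrete, here's a rough sketch of the kind of thing I mean, in Python for brevity -- the marker format, function names, and message table are all made up for illustration, not anything MediaWiki actually has:

import re

# Hypothetical marker the parser would leave in its cached output wherever
# a language-dependent section link belongs, so the cached HTML itself is
# language-neutral and the parser cache holds one copy per page.
MARKER_RE = re.compile(r'<!--SECTIONLINK:(\d+)-->')

# Toy message table standing in for the real localisation system.
MESSAGES = {
    'en': {'editsection': 'edit'},
    'de': {'editsection': 'bearbeiten'},
}

def postprocess(cached_html, user_lang):
    """Fill language-dependent pieces into cached parser output at request time."""
    def fill(match):
        section = match.group(1)
        label = MESSAGES[user_lang]['editsection']
        return ('<span class="editsection">[<a href="?action=edit&section='
                + section + '">' + label + '</a>]</span>')
    return MARKER_RE.sub(fill, cached_html)

# The same cached text serves every user language:
cached = '<h2>History<!--SECTIONLINK:1--></h2>'
print(postprocess(cached, 'en'))
print(postprocess(cached, 'de'))

The point being that the expensive wikitext parse happens once and gets cached, while only the cheap per-request substitution varies by user.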
What some of us have been kicking around is migrating away from pre-processing the text at all. Instead, the text should be parsed in a single step into an intermediate structure that is neither wikitext nor HTML. Templates would be required to return whole structures when expanded (open what you close, close what you open) and would only be present in sanitary places (not in the middle of wiki or HTML syntax, for instance).
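Just to make sure I follow, here's roughly how I picture the structure you're describing -- a tree where templates are whole nodes, so an expansion always swaps a complete node for a complete subtree. All the names here are invented for illustration:

from dataclasses import dataclass, field
from typing import List

# Invented intermediate representation: a tree that is neither wikitext nor
# HTML. Templates appear only as whole nodes, so expanding one can't open
# something it doesn't close, and can't land in the middle of other syntax.

@dataclass
class Node:
    kind: str                              # 'document', 'heading', 'text', 'template', ...
    value: str = ''                        # text content, or the template's name
    children: List['Node'] = field(default_factory=list)

def expand_templates(node, templates):
    """Replace each template node with the complete subtree its template returns."""
    if node.kind == 'template':
        return templates[node.value]()     # must return a whole, balanced Node
    node.children = [expand_templates(child, templates) for child in node.children]
    return node

templates = {
    'Infobox': lambda: Node('section', children=[Node('text', 'infobox contents')]),
}

doc = Node('document', children=[
    Node('heading', children=[Node('text', 'Example article')]),
    Node('template', 'Infobox'),
])
doc = expand_templates(doc, templates)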
This is possibly a good long-term goal, but I don't see how it conflicts with a postprocessing step at all. As long as parsing large pages requires significant CPU time, we'll want to cache the parsed output as much as possible, and a postprocessor will always help to reduce cache fragmentation. If we ever do move to a storage format that's so fast to process that we don't care about cache misses, of course, we could scrap the postprocessor and incorporate its effects into the main pass, no harm done.