[Wikitech-l] Parsoid's progress

19 Jan 2015


(Combining pieces of Jay's thread and pieces of the shared hosting thread.)

Daniel Friesen wrote:
> ...
> Parsoid can do Parsoid DOM to WikiText conversions. So I believe the
> suggestion is that storage be switched entirely to the Parsoid DOM and
> WikiText in classic editing just becomes a method of editing the content
> that is stored as Parsoid DOM in the backend.

Tim Starling wrote:
> ...
> Parsoid depends on the MediaWiki parser, it calls it via api.php. It's
> not a complete, standalone implementation of wikitext to HTML
> transformation.
>
> HTML storage would be a pretty simple feature, and would allow
> third-party users to use VE without Parsoid. It's not so simple to use
> Parsoid without the MediaWiki parser, especially if you want to support
> all existing extensions.
>
> So, as currently proposed, HTML storage is actually a way to reduce the
> dependency on services for non-WMF wikis, not to increase it.
>
> Based on recent comments from Gabriel and Subbu, my understanding is
> that there are no plans to drop the MediaWiki parser at the moment.

Yeah... what is this all about? My understanding (and please correct me if
I'm wrong) is that Parsoid is/was intended to be a standalone service
capable of translating wikitext <--> HTML. You seem to be stating that
Parsoid is neither complete nor standalone. Why?

Currently Parsoid is the largest client of the MediaWiki PHP parser, I'm
told. If Parsoid is regularly calling and relying upon the MediaWiki PHP
parser, what exactly is the point of Parsoid?

How much parity is there between Parsoid without the use of the MediaWiki
parser and the MediaWiki parser? That is, if you selected a random sample
of pages from a Wikimedia wiki, how many of them could Parsoid correctly
parse on its own? And from this question flows another: why is Parsoid
calling MediaWiki's api.php so regularly?
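
To make the dependency concrete, here is a minimal sketch of the kind of
api.php request under discussion. It assumes that template expansion is
one of the things handed off to the PHP side, using the Action API's
expandtemplates module. The endpoint URL and the function name are made up
for illustration; this is not Parsoid's actual code, and it only builds
the request URL without sending it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, used only for illustration.
API_ENDPOINT = "https://wiki.example.org/w/api.php"


def build_expand_request(wikitext: str, title: str = "Main Page") -> str:
    """Build an api.php URL asking MediaWiki to expand templates in wikitext.

    Assumption: a service that does not implement template expansion
    itself would need a round trip like this for each transclusion.
    """
    params = {
        "action": "expandtemplates",  # Action API module for template expansion
        "format": "json",
        "title": title,               # page context for the expansion
        "text": wikitext,             # the wikitext fragment to expand
        "prop": "wikitext",           # ask for the expanded wikitext back
    }
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_expand_request("{{echo|hello}}"))
```

If each transclusion or extension tag on a page needs a request of this
shape, the volume of api.php traffic scales with page complexity, which is
what the question above is getting at.
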

I'm also interested in Parsoid's development as it relates to the broader
push for services. If Parsoid is going to be the model of future services
development, I'd like a clearer evaluation of what kind of model it is.

Again, please correct me if I'm wrong, mistaken, misinformed, etc., but
from my place of limited knowledge, it sounds very unappealing to create
large Node.js applications ("services") that closely tie in and require(!)
PHP counterparts. This seems like the opposite of moving toward a more
flexible, modular architecture. From my perspective, it would seem to only
saddle us with additional technical debt moving forward, as we double
complexity indefinitely.

MZMcBride
