Re: [Wikitech-l] GSoC project advice: port texvc to Python?

23 Mar 2010


2010/3/23 Aryeh Gregor <Simetrical+wikilist@gmail.com>:
...
...
>> I've never used PHP for real programming, but how difficult would it be
>> to write a really simple, stupid first pass at a DFA parser? I suspect
>> I'd need much more than three months to make it useful, but would it be
>> possible to implement some coherent subset of the features? E.g.,
>> building the LR0 automaton, at least?
>
> I don't think you'd need a "real" parser here.  Mostly we just use
> preg_split() for this sort of thing.  I'm not familiar with formal
> grammars and such, so I can't say what the concrete disadvantages of
> that approach are.

DFAs parse regular languages, which means those languages can also be
expressed as regexes. In fact, the regexes accepted by the preg_*()
functions allow certain extensions to the language theory definition
of regular expressions, allowing them to describe certain non-regular
languages as well. In short: preg_split() can do everything a DFA can
do, and more. The only reason to use a DFA parser would be
performance, but since the preg_*() functions are so heavily optimized
I don't think that'll be an issue.
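
To make that concrete, here is a rough sketch in Python, with re.split()
standing in for preg_split(). The token pattern and the names tokenize()
and is_square() are made up for illustration; none of this is taken from
texvc, and a real TeX tokenizer would need many more cases.

```python
import re

# Hypothetical token pattern for a tiny TeX-like input: a control word
# (backslash followed by letters), a single brace, or a run of whitespace.
TOKEN_RE = re.compile(r"(\\[A-Za-z]+|[{}]|\s+)")


def tokenize(source):
    """Split source into tokens, keeping delimiters and dropping blanks."""
    # The capturing group makes re.split() return the delimiters as well,
    # much like the PREG_SPLIT_DELIM_CAPTURE flag does for preg_split().
    parts = TOKEN_RE.split(source)
    return [p for p in parts if p and not p.isspace()]


# A backreference takes the pattern beyond regular languages: this one
# matches "ww" (some word over {a, b} repeated twice), which no DFA can
# recognize.
SQUARE_RE = re.compile(r"^([ab]+)\1$")


def is_square(s):
    """Return True if s is some non-empty word over {a, b} written twice."""
    return SQUARE_RE.match(s) is not None
```

The first function is the split-on-delimiters approach; the second shows
the sort of extension (backreferences) that lets these regexes describe
some non-regular languages.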
...
...
>> I suggested a Python port because
>>    http://www.mediawiki.org/wiki/Summer_of_Code_2010#MediaWiki_core
>> lists it as a potential project idea. I was under the impression that
>> people around here did not want to leave texvc in OCaml. Is this wrong?
>
> No, it's right.  Conrad is crazy.  :P

Having it in a language no one understands is a bad thing and leads to
maintenance not happening, so yeah, we definitely want it rewritten in
PHP. If the PHP implementation turns out to be too slow to run on WMF,
for instance, we could do a C++ port à la wikidiff2 (a C++ port of our
ludicrously slow PHP diff implementation).
Roan Kattouw (Catrope)
