Re: [Wikitech-l] GSoC project advice: port texvc to Python?

23 Mar 2010


      Conrad Irwin wrote:
...
On 03/23/2010 05:23 PM, Aryeh Gregor wrote:
...
On Tue, Mar 23, 2010 at 1:00 PM, Roan Kattouw roan.kattouw@gmail.com wrote:
...
DFAs parse regular languages, which means those languages can also be
expressed as regexes. In fact, the regexes accepted by the preg_*()
functions allow certain extensions to the language theory definition
of regular expressions, allowing them to describe certain non-regular
languages as well. In short: preg_split() can do everything a DFA can
do, and more. The only reason to use a DFA parser would be
performance, but since the preg_*() functions are so heavily optimized
I don't think that'll be an issue.
This much I know, but is LaTeX actually a regular language?
It's not even context free, luckily the subset we are interested in is
(as clearly shown by the texvc parser :p).
Just because a language is context-sensitive doesn't mean it will be
hard to write a parser for it. That's just a myth propagated by
computer scientists who, strangely enough given their profession, have
a disdain for the algorithm as a descriptive framework.
In the last few decades, pure mathematicians have been exploring the
power of algorithms as a general description of an axiomatic system.
And simultaneously, computer scientists have embraced the idea that
the best way to process text is by trying to shoehorn all computer
languages into some Chomsky-inspired representation, regardless of how
awkward that representation is, or how inefficient the resulting
algorithm becomes, when compared to an algorithm constructed a priori.
-- Tim Starling

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Re: [Wikitech-l] GSoC project advice: port texvc to Python?