[Wikitext-l] Parser for Perl or PHP?

13 Dec 2009


Hi,
I'm currently maintaining wikilint (cf.
URI:http://toolserver.org/~timl/cgi-bin/wikilint) that
reviews Wikipedia articles for common problems. At the
moment, it is a powerful but ugly mess of regular
expressions galore. Fixing bugs is a nightmare.
Ideally, a redesign would parse the source into a tree-like
structure and then work on that. So I went to CPAN and
[[mw:Alternative parsers]] and found out that:
a) there are lots of "release early, release once"
   "implementations" that do not do anything useful and do
   not seem to be in further development, and
b) for many people, "parser" seems to have the meaning
   "converter".
So I'll probably have to start yet another attempt. Since
wikilint does not have to be able to parse 100 % of all
conceivable wiki markup (if an article cannot be parsed, it
is probably broken anyway), I could go for a rather "lean"
approach. For the tree structure, I would opt for DOM to
maximize code reusability, with wiki markup in a separate
namespace. If there is no relevant foundation to build on,
I would prefer Perl, ideally enhancing an existing CPAN
module like WWW::Wikipedia::Entry.
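To make the idea concrete, here is a minimal sketch of what
such a "lean" DOM approach might look like. It is written in
Python purely for illustration (not Perl, and not an existing
module). The namespace URI, the element names (w:article,
w:link, w:bold) and the functions parse_wikitext and
link_targets are all hypothetical, and only two constructs
are recognised; everything else is kept as plain text.

```python
import re
from xml.dom.minidom import getDOMImplementation

# Hypothetical namespace URI for wiki markup nodes (an assumption,
# not an established standard).
WIKI_NS = "http://example.org/ns/wikitext"

# Lean on purpose: only [[target]] / [[target|label]] links and
# '''bold''' runs are recognised.
TOKEN = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]|'''(.+?)'''")


def parse_wikitext(source):
    """Parse a small subset of wiki markup into a DOM document."""
    impl = getDOMImplementation()
    doc = impl.createDocument(WIKI_NS, "w:article", None)
    root = doc.documentElement
    root.setAttribute("xmlns:w", WIKI_NS)

    pos = 0
    for match in TOKEN.finditer(source):
        # Text between two recognised constructs becomes a text node.
        if match.start() > pos:
            root.appendChild(doc.createTextNode(source[pos:match.start()]))
        if match.group(1) is not None:
            node = doc.createElementNS(WIKI_NS, "w:link")
            node.setAttribute("target", match.group(1))
            label = match.group(2)
            if label is None:
                label = match.group(1)
            node.appendChild(doc.createTextNode(label))
        else:
            node = doc.createElementNS(WIKI_NS, "w:bold")
            node.appendChild(doc.createTextNode(match.group(3)))
        root.appendChild(node)
        pos = match.end()

    # Trailing text after the last recognised construct.
    if pos < len(source):
        root.appendChild(doc.createTextNode(source[pos:]))
    return doc


def link_targets(doc):
    """Return the targets of all wiki links, in document order."""
    links = doc.getElementsByTagNameNS(WIKI_NS, "link")
    return [link.getAttribute("target") for link in links]
```

A lint check would then walk the tree instead of matching
regular expressions against the raw source, e.g.
link_targets(parse_wikitext("See [[Perl]] and [[PHP|this]]."))
to collect every link target for further inspection.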
Any pointers to things that I overlooked? Thoughts on
interfaces & Co.? Volunteers? :-)
Tim
