Cunningham's exploratory parsing - Wikitext-l

11 Jul 2011

Say, while everybody's trying to figure out a formal grammar, have you had a look at
Ward Cunningham's exploratory parsing kit? He gave me a demo at OSBridge, and it's
a really handy tool. Basically, it's a web app with an asynchronous C backend. You
paste a tentative PEG grammar into a textarea, and it runs through whatever corpus you
want, showing you representative instances of how it does or does not match. He was
running it against the full English Wikipedia on his laptop, and it took only about half
an hour, with results streaming in as they were generated, of course.
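
For a rough idea of that workflow, here is a minimal Python sketch. It is not Ward's code: plain regular expressions stand in for PEG rules, and the names `classify`, `explore`, and the toy rules are all made up for illustration. It tries each rule in order on every page, tallies the outcomes, and keeps a few representative pages per outcome, including the ones nothing matched.

```python
import re
from collections import Counter, defaultdict

def classify(page, rules):
    """Return the name of the first rule matching the whole page (ordered choice)."""
    for name, pattern in rules:
        if pattern.fullmatch(page):
            return name
    return "no match"

def explore(pages, rules, max_samples=2):
    """Tally outcomes over a corpus and keep a few sample pages per outcome."""
    counts = Counter()
    samples = defaultdict(list)
    for page in pages:
        outcome = classify(page, rules)
        counts[outcome] += 1
        if len(samples[outcome]) < max_samples:
            samples[outcome].append(page)
    return counts, samples

# Hypothetical toy rules and corpus; a real run would iterate over a wiki dump.
RULES = [
    ("heading", re.compile(r"==\s*.+?\s*==")),
    ("link", re.compile(r"\[\[[^\[\]]+\]\]")),
    ("text", re.compile(r"[^\[\]{}=]+")),
]
PAGES = ["== Heading ==", "[[Link]]", "plain text", "[[Other]]", "[[Third]]", "{{broken"]

if __name__ == "__main__":
    counts, samples = explore(PAGES, RULES)
    for outcome, n in counts.most_common():
        print(outcome, n, samples[outcome])
```

The "no match" bucket is the interesting one: those samples show which constructs the tentative grammar still misses.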

Using that, they made a PEG-and-then-some implementation of MediaWiki syntax that parses
darn near all of Wikipedia:

https://github.com/AboutUs/kiwi/blob/master/src/syntax.leg

(I call it "PEG-and-then-some" because it has a lot of callbacks that might
interlock with and affect the rule matching; I'm not sure.)
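
To show what I mean by callbacks affecting the matching, here is a small Python sketch. It is not taken from kiwi's grammar: the combinators (`run_of`, `none_of`, `sequence`, `with_callback`) and the toy heading rule are hypothetical, and the rule does not claim to reflect how MediaWiki really treats unbalanced headings. The point is only that a callback holding state can veto a match that the bare PEG rules would accept.

```python
def run_of(ch):
    """Match one or more copies of ch; return the end position or None."""
    def parse(text, pos):
        end = pos
        while end < len(text) and text[end] == ch:
            end += 1
        return end if end > pos else None
    return parse

def none_of(ch):
    """Match one or more characters other than ch."""
    def parse(text, pos):
        end = pos
        while end < len(text) and text[end] != ch:
            end += 1
        return end if end > pos else None
    return parse

def sequence(*parsers):
    """Match each parser in turn; fail as soon as one fails."""
    def parse(text, pos):
        for parser in parsers:
            pos = parser(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def with_callback(parser, callback):
    """Run parser, then let callback(text, start, end) accept or veto the match."""
    def parse(text, pos):
        end = parser(text, pos)
        if end is None or not callback(text, pos, end):
            return None
        return end
    return parse

def make_heading():
    """Toy heading rule: the closing run of '=' must be as long as the opening run."""
    state = {}

    def remember_level(text, start, end):
        state["level"] = end - start  # side effect that later matching depends on
        return True

    def same_level(text, start, end):
        return end - start == state["level"]

    return sequence(
        with_callback(run_of("="), remember_level),
        none_of("="),
        with_callback(run_of("="), same_level),
    )

heading = make_heading()
# The same three rules with no callbacks attached, for comparison.
plain = sequence(run_of("="), none_of("="), run_of("="))
```

With these definitions, `plain` accepts "== Title ===" while `heading` rejects it, so the grammar alone no longer tells you what the parser matches; you have to read the callbacks too.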

Cheers,
Erik