On 06/20/2012 01:02 PM, Niklas Laxström wrote:
No, this is not about a wikitext parser. Rather something much simpler.
Have a look at [1] and you will see rules like:
  n in 0..1
  n is 2
  n mod 10 in 3..4,9 and n mod 100 not in 10..19,70..79,90..99
Long ago, when I wanted to compare the plural rules of MediaWiki and CLDR, I wrote a parser for the CLDR rule format. Unfortunately my implementation uses regular expressions and eval, which makes it unsuitable for production. Writing parsers is not my area of expertise, so could you please point me to how to do this properly in PHP? Bonus points if it is also easily adaptable to JavaScript.
I like the ease of disambiguation in Parsing Expression Grammars (PEG). Most PEG parser generators use memoization (packrat parsing) to achieve a running time linear in the size of the input. I have no experience with PEG parser generators for PHP, but I am using PEG.js for the Parsoid tokenizer with good results.
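For a grammar this small, a hand-written recursive-descent parser may already be enough. The following is only a rough sketch in JavaScript, not PEG.js output: it tries alternatives in a fixed order, the way a PEG would, and compiles a rule into a predicate function without eval. The grammar in the comments is my reading of the rules quoted above (n, mod, is, in, within, not, and, or) and has not been checked against the CLDR specification. The name compilePluralRule and all helper names are made up for illustration.

```javascript
// Sketch: compile a CLDR-style plural rule into a predicate function, without eval.
// Assumed grammar (inferred from the example rules, not taken from the spec):
//   condition    := andCondition ('or' andCondition)*
//   andCondition := relation ('and' relation)*
//   relation     := expr 'is' 'not'? number
//                 | expr 'not'? ('in' | 'within') rangeList
//   expr         := 'n' ('mod' number)?
//   rangeList    := number ('..' number)? (',' number ('..' number)?)*
function compilePluralRule(rule) {
  const tokenPattern = /\d+|\.\.|,|[a-z]+/g;
  // Reject anything that is neither a known token shape nor whitespace.
  if (rule.replace(tokenPattern, '').trim() !== '') {
    throw new SyntaxError('Unexpected character in rule: ' + rule);
  }
  const tokens = rule.match(tokenPattern) || [];
  let pos = 0;

  // Consume the next token if it equals tok; report whether it did.
  const accept = (tok) => (tokens[pos] === tok ? (pos++, true) : false);
  const expect = (tok) => {
    if (!accept(tok)) {
      throw new SyntaxError(`Expected "${tok}" but found "${tokens[pos]}"`);
    }
  };
  const number = () => {
    const tok = tokens[pos++];
    if (!/^\d+$/.test(tok)) {
      throw new SyntaxError(`Expected a number but found "${tok}"`);
    }
    return parseInt(tok, 10);
  };

  const expr = () => {
    expect('n');
    if (accept('mod')) {
      const modulus = number();
      return (n) => n % modulus;
    }
    return (n) => n;
  };

  const rangeList = () => {
    const ranges = [];
    do {
      const lo = number();
      const hi = accept('..') ? number() : lo; // a single value is a one-element range
      ranges.push([lo, hi]);
    } while (accept(','));
    return ranges;
  };

  const relation = () => {
    const value = expr();
    if (accept('is')) {
      const negated = accept('not');
      const target = number();
      return (n) => (value(n) === target) !== negated;
    }
    const negated = accept('not');
    const integersOnly = accept('in'); // 'in' matches integers only, 'within' any number
    if (!integersOnly) expect('within');
    const ranges = rangeList();
    return (n) => {
      const v = value(n);
      const hit = (!integersOnly || Number.isInteger(v)) &&
        ranges.some(([lo, hi]) => v >= lo && v <= hi);
      return hit !== negated;
    };
  };

  const andCondition = () => {
    const parts = [relation()];
    while (accept('and')) parts.push(relation());
    return (n) => parts.every((part) => part(n));
  };

  const condition = () => {
    const parts = [andCondition()];
    while (accept('or')) parts.push(andCondition());
    return (n) => parts.some((part) => part(n));
  };

  const predicate = condition();
  if (pos < tokens.length) {
    throw new SyntaxError(`Unexpected token "${tokens[pos]}"`);
  }
  return predicate;
}

// Usage with the third rule quoted above:
const few = compilePluralRule(
  'n mod 10 in 3..4,9 and n mod 100 not in 10..19,70..79,90..99'
);
console.log([3, 13, 24, 79].map(few)); // [ true, false, true, false ]
```

Since this grammar never needs to backtrack over more than one token, memoization is not required here. The same structure should carry over to PHP closures more or less line by line, although I have not tried that.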
If you try a PHP PEG generator, then please let us know about your results!
Gabriel