I'm making a quick summary of all the steps the parser currently goes through, partly to get familiar with it. For each step I'll then attach some BNF, and then work out a way of merging the BNFs.
I'm not sure if there's a systematic approach to this. Basic problem:
1. Parser translates X into Y by applying rule A, expressible in BNF.
2. Parser translates Y into Z by applying rule B, expressible in BNF.
What rule captures both of these in one step? Is it always possible? Is there a general algorithm for the merge?
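To make the problem concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration: pass_a, pass_b, the MERGED pattern and the simplified quote markup are not the real parser's code or rules. It runs two rewrite layers in sequence, then tries the naive merge (a single scan with both rules as alternatives).

```python
import re

# Hypothetical layers for illustration only, not the real parser's steps.

def pass_a(text):
    # Rule A (X -> Y): '''...''' becomes <b>...</b>
    return re.sub(r"'''(.+?)'''", r"<b>\1</b>", text)

def pass_b(text):
    # Rule B (Y -> Z): ''...'' becomes <i>...</i>
    return re.sub(r"''(.+?)''", r"<i>\1</i>", text)

def two_pass(text):
    # The layered pipeline: apply A, then apply B to A's output.
    return pass_b(pass_a(text))

# Naive merge: one scan, trying the bold alternative before the italic one.
MERGED = re.compile(r"'''(?P<bold>.+?)'''|''(?P<italic>.+?)''")

def one_pass(text):
    def repl(match):
        if match.group("bold") is not None:
            return "<b>" + match.group("bold") + "</b>"
        return "<i>" + match.group("italic") + "</i>"
    return MERGED.sub(repl, text)

flat = "'''bold''' and ''italic''"
nested = "'''a ''b'' c'''"

print(two_pass(flat) == one_pass(flat))      # the two agree on flat input
print(two_pass(nested))                      # <b>a <i>b</i> c</b>
print(one_pass(nested))                      # <b>a ''b'' c</b>
```

On flat input the two versions agree, but on nested input the single scan never looks inside the text it has already matched, so the inner italics are lost. In this toy case, at least, a merged rule would have to be recursive (bold content may itself contain italics) rather than a flat list of alternatives. Whether that generalises to the real layers is exactly the open question.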
There are currently 13 distinct major steps, of which one ("Internal parse") has 14 distinct substeps (not counting hooks). That's also not counting the preprocessor, so no templates...
Is this a good way to progress towards a complete grammar? The existing approach seemed to be to simply start from scratch, using a combination of intuition, testing and examining the code. Any comments?
Perhaps at the least we could start compressing some of these layers. Does anyone know of two layers that would be impossible to merge? Presumably the preprocessor has to remain separate at the very least.
Steve
wikitech-l@lists.wikimedia.org