On 05/01/2012 09:15 AM, Tim Starling wrote:
> In summary: the Lua function is called with a single argument, which is an object representing the parser interface. The object is roughly equivalent to a PPFrame.
+1 for the abstract frame object.
> The object would have a property called "args", which is a table with its "index" metamethod overridden to provide lazy-initialised access to the parser function arguments with a brief syntax:
>
>     {{#invoke:module|func|name=value}}
>
>     function p.func(frame)
>         return frame.args.name -- returns "value"
>     end
>
> There would be two methods for recursive preprocessing:
> - preprocess() provides basic expansion of wikitext
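To make the lazily initialised args table concrete, here is a minimal sketch in plain Lua. It assumes a hypothetical expandArgument() helper standing in for the parser, and a hypothetical newFrame() constructor; neither name comes from the proposal, and the real frame object would be provided by the host.

```lua
-- Hypothetical stand-in for the parser: expands one raw argument on demand.
local expansions = 0
local function expandArgument(raw)
    expansions = expansions + 1
    return raw -- a real implementation would preprocess the wikitext here
end

-- Hypothetical constructor: builds a frame whose args table expands each
-- argument on first access only.
local function newFrame(rawArgs)
    local args = setmetatable({}, {
        __index = function(t, key)
            local raw = rawArgs[key]
            if raw == nil then
                return nil
            end
            local value = expandArgument(raw)
            rawset(t, key, value) -- cache, so __index is not hit again
            return value
        end,
    })
    return { args = args }
end

local p = {}
function p.func(frame)
    return frame.args.name
end

local frame = newFrame({ name = "value" })
print(p.func(frame)) --> value
print(p.func(frame)) --> value
print(expansions)    --> 1 (the second access was served from the cache)
```

Arguments that are never read are never expanded, which is the point of overriding the metamethod rather than filling the table up front.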
An alternative to a wikitext-specific preprocess() method and plain-text argument values could be a conversion / expansion method on an opaque 'parser value' object:
frame.args.name:expandTo( 'text/x-mediawiki' ) -- returns "value"
This would make it possible to work with other formats apart from wikitext.
I recently added an API like this in Parsoid (the method is called 'as' there), and liked the way that worked out for parser functions. I am currently using the 'text/plain' type to retrieve a text expansion with comments etc. stripped, and 'tokens/x-mediawiki' for expanded tokens (roughly a list of tags and strings). Additional formats can be supported without a proliferation of methods. Each value object has a reference to its frame, and can be passed around and eventually lazily expanded elsewhere. Expansion results can be cached inside the value object and shared between multiple use sites, since each value is associated with a single frame.
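A minimal Lua sketch of such a value object follows. It is not the Parsoid code: the converters table, the ParserValue.new constructor and the conversions counter are all hypothetical names chosen for illustration, and the 'text/plain' converter only strips HTML comments.

```lua
-- Hypothetical converters, keyed by target type. A real back end would
-- call into the parser; these only mimic the behaviour described above.
local conversions = 0
local converters = {
    ["text/x-mediawiki"] = function(source, frame)
        return source
    end,
    ["text/plain"] = function(source, frame)
        -- strip HTML comments; the parentheses drop gsub's match count
        return (source:gsub("<!%-%-.-%-%->", ""))
    end,
}

local ParserValue = {}
ParserValue.__index = ParserValue

-- The value keeps a reference to its frame and a per-format cache.
function ParserValue.new(contentType, source, frame)
    return setmetatable({
        contentType = contentType,
        source = source,
        frame = frame,
        cache = {},
    }, ParserValue)
end

-- Expand lazily, and remember the result for later use sites.
function ParserValue:expandTo(targetType)
    local cached = self.cache[targetType]
    if cached == nil then
        local convert = converters[targetType]
        if convert == nil then
            error("unsupported target type: " .. tostring(targetType))
        end
        conversions = conversions + 1
        cached = convert(self.source, self.frame)
        self.cache[targetType] = cached
    end
    return cached
end

local value = ParserValue.new("text/x-mediawiki", "value<!-- note -->", {})
print(value:expandTo("text/plain"))       --> value
print(value:expandTo("text/plain"))       --> value (from the cache)
print(value:expandTo("text/x-mediawiki")) --> value<!-- note -->
print(conversions)                        --> 2
```

Supporting another format in this sketch means adding one entry to the converters table; the value object's interface stays the same.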
The Parsoid .as method additionally takes a callback argument to support asynchronous expansions. This might be too complex for user-friendly Lua scripting, but could still be something worth considering in the longer term. It could be added as a separate 'expandToAsync' method.
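To give an idea of what a callback-based variant could look like from Lua, here is a small sketch. Everything in it is an assumption: the defer()/flush() pair is a toy stand-in for whatever scheduler the host would provide, newValue() is a hypothetical constructor, and the (err, result) callback signature is borrowed from common Node.js practice rather than from Parsoid.

```lua
-- Toy stand-in for an event loop: deferred jobs run when flush() is called.
local pending = {}
local function defer(job)
    pending[#pending + 1] = job
end
local function flush()
    local i = 1
    while i <= #pending do
        pending[i]()
        i = i + 1
    end
    pending = {}
end

-- Hypothetical value object with an asynchronous expansion method.
local function newValue(source)
    local value = { source = source, cache = {} }
    function value:expandToAsync(targetType, callback)
        defer(function()
            if self.cache[targetType] == nil then
                self.cache[targetType] = self.source -- placeholder expansion
            end
            callback(nil, self.cache[targetType])
        end)
    end
    return value
end

local results = {}
local value = newValue("value")
value:expandToAsync("text/plain", function(err, text)
    results[#results + 1] = text
end)
print(#results)   --> 0 (nothing has run yet)
flush()
print(results[1]) --> value
```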
The conversion of wikitext or other formats to an opaque value object could be achieved using an object constructor:
-- 'value text' is parsed lazily
ParserValue( 'text/x-mediawiki', 'value text', frame )
The frame might be the passed-in parent frame, or a custom one constructed with args assembled from other ParserValues.
Calls to existing templates could be supported with a convenient TemplateParserValue constructor, which does not specify how a template call is represented internally.
TemplateParserValue( 'tpl', args ):expandTo( 'text/plain' )
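A rough Lua sketch of how such a constructor might behave is below. The templates table is a hypothetical stand-in for the page store, args are plain strings rather than ParserValues to keep it short, and only 'text/plain' is handled; the internal representation is deliberately hidden behind expandTo.

```lua
-- Hypothetical template store; a real back end would load the page text.
local templates = {
    tpl = "Hello, {{{name}}}!",
}

local TemplateParserValue = {}
TemplateParserValue.__index = TemplateParserValue

-- Make the table itself callable, so TemplateParserValue( 'tpl', args ) works.
setmetatable(TemplateParserValue, {
    __call = function(class, title, args)
        return setmetatable({ title = title, args = args or {}, cache = {} }, class)
    end,
})

function TemplateParserValue:expandTo(targetType)
    if targetType ~= "text/plain" then
        error("unsupported target type: " .. tostring(targetType))
    end
    if self.cache[targetType] == nil then
        local body = templates[self.title]
        if body == nil then
            error("unknown template: " .. tostring(self.title))
        end
        -- substitute {{{name}}} placeholders; missing arguments become ""
        self.cache[targetType] = body:gsub("{{{(.-)}}}", function(name)
            return self.args[name] or ""
        end)
    end
    return self.cache[targetType]
end

local args = { name = "world" }
print(TemplateParserValue("tpl", args):expandTo("text/plain")) --> Hello, world!
```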
Finally, a ParserValue (or a list of those) could be used for the return type of functions to support output formats other than plain text.
Overall, I would love to keep the access to values as opaque as possible to enable back-end optimizations and lazy expansions with sharing. Opening a path towards content representations other than plain (wiki-)text such as tokens, an AST or a DOM tree should be very useful for future parser development.
Gabriel