Thank you for bringing these issues up for public discussion, Tim; they are really worth it.
On Tue, May 1, 2012 at 11:15 AM, Tim Starling <tstarling@wikimedia.org> wrote:
I've written up a proposed interface between the MediaWiki parser and Lua:
https://www.mediawiki.org/wiki/Extension:Scribunto/Parser_interface_design
In summary: the Lua function is called with a single argument, which is an object representing the parser interface. The object is roughly equivalent to a PPFrame.
The object would have a property called "args", which is a table with its "index" metamethod overridden to provide lazy-initialised access to the parser function arguments with a brief syntax:
{{#invoke:module|func|name=value}}
function p.func(frame)
    return frame.args.name -- returns "value"
end
I like this part. Also, I really enjoy the idea of making a separate parser frame for script instead of running it in the parent template's frame.
I am a bit leery, though, about the part where you suggest that name-value arguments ({{#invoke:module|func|param=value}}) should be parsed by the engine, not the script. Don't you have to expand those arguments in order to parse them, hence making any form of lazy expansion impossible?
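To make the question concrete, here is a minimal sketch of a lazily expanded args table built on the "index" metamethod. It assumes the argument names are already known when the table is built, which is exactly the step in question; `expand`, `make_lazy_args` and the toy `{{1x|...}}` template are hypothetical stand-ins, not part of the proposed interface.

```lua
-- Hypothetical stand-in for the parser's argument expansion. It counts
-- calls so that the laziness is observable.
local expansions = 0
local function expand(raw)
    expansions = expansions + 1
    -- Pretend-expand a toy template: {{1x|foo}} becomes foo.
    return (raw:gsub("{{1x|(.-)}}", "%1"))
end

-- Build an args table whose values are expanded on first access only.
-- raw_args maps already-resolved names to unexpanded wikitext (assumption).
local function make_lazy_args(raw_args)
    local cache = {}
    return setmetatable({}, {
        __index = function(_, name)
            local raw = raw_args[name]
            if raw == nil then
                return nil
            end
            if cache[name] == nil then
                cache[name] = expand(raw)
            end
            return cache[name]
        end,
    })
end

local args = make_lazy_args({ name = "{{1x|value}}", other = "{{1x|unused}}" })
print(expansions)  -- 0: nothing has been expanded yet
print(args.name)   -- value
print(args.name)   -- value (served from the cache)
print(expansions)  -- 1: "other" was never expanded
```

The values stay lazy here only because the keys were handed over ready-made. If a name itself had to be expanded before it could be split from its value, that part of the work could not be deferred.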
There would be two methods for recursive preprocessing:
- preprocess() provides basic expansion of wikitext
- callTemplate() provides an API for template invocation, since I imagine that would otherwise be a common use case for preprocess(). Using preprocess() to expand a template with arbitrary arguments would be difficult.
Like a normal parser function, the Lua function returns text which is not modified any further by the preprocessor.
This is the part which I strongly oppose. Providing direct preprocessor access to Lua scripts is a bad idea. There are two key reasons for this: 1. The preprocessor is slow. 2. You would have to work out many very subtle issues with timeouts and nested Lua scripts. This includes timeout subtleties caused by the preprocessor's slowness: load a slow template and, given the small Lua time limit, PHP will show a fatal error due to the emergency timeout. Even if you fix that, the standalone version uses ulimit, where it may be more difficult to fix.
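As a rough illustration of how a script limit can be enforced from inside Lua, here is a sketch using an instruction-count hook. This is a different technique from the time-based limit and ulimit mentioned above, and `run_with_budget` is a hypothetical name. Stock Lua counts VM instructions only, so time spent inside a host-provided function, such as a preprocessor call, would not be charged against this kind of budget.

```lua
-- Run fn under a budget of VM instructions (hypothetical helper).
-- Returns true plus fn's first result, or false plus an error message.
local function run_with_budget(fn, budget)
    local function guarded()
        -- The count hook fires once every `budget` VM instructions.
        debug.sethook(function()
            error("instruction budget exceeded", 0)
        end, "", budget)
        local result = fn()
        debug.sethook()  -- clear the hook on normal completion
        return result
    end
    local function on_error(err)
        debug.sethook()  -- clear the hook on the error path as well
        return err
    end
    return xpcall(guarded, on_error)
end

-- A runaway script is stopped instead of hanging the caller.
local ok, err = run_with_budget(function()
    local i = 0
    while true do
        i = i + 1
    end
end, 1000)
print(ok, err)

-- A well-behaved script runs to completion.
print(run_with_budget(function() return 42 end, 100000))
```

Nested script invocations would have to share one budget for the limit to mean anything, which is part of what makes the nesting case subtle.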
Now, let me go through your suggested use cases and propose some alternatives:
1. As an alternative to a string literal, to include snippets of wikitext which are intended to be editable by people who don't know Lua. I think it would in fact be better if you provided an interface for getting unprocessed wikitext, or a preprocessor DOM. Preprocessed text makes it difficult to combine human-readable and machine-readable versions.
2. During migration, to call complex metatemplates which have not yet been ported to Lua, or to test migrated components independently instead of migrating all at once. That would eventually lead to such calls becoming permanent. Bugzilla quips, an authoritative reference on Wikimedia practices, say that "temporary solutions have a terrible habit of becoming permanent, around here". Hence I would suggest that we avoid the temptation in the first place.
3. To provide access to miscellaneous parser functions and variables. Now, this is a really bad idea. It is like making a scary hack an official way to do things. It actually defies the first design principle you state. preprocess( "{{FULLPAGENAME}}" ) is not only much uglier than using an appropriate API like mw.page.name(), it is also one of the slowest ways to do this. I have benchmarked it, and it is actually ~450 times slower than accessing the title object directly. Lua was (and is) meant to improve the readability of templates, not to clutter them with stuff like articlesNum = tonumber( preprocess( "{{NUMBEROFARTICLES:R}}" ) ). Solution: a proper API would do the job (actually I am currently working on it).
4. To allow Lua to construct tag invocations, such as <ref> and <gallery>. We could make a #tag-like function to do this, just as we do with parser functions.
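One possible reading of the #tag-like function from item 4 is a small helper that assembles a {{#tag:...}} invocation. The sketch below assumes that reading; the name `tag` and its signature are hypothetical, and it deliberately does no escaping of pipes or braces in its inputs.

```lua
-- Hypothetical helper: build a {{#tag:name|content|attr=value}} invocation
-- as wikitext. No escaping is attempted.
local function tag(name, content, attrs)
    local parts = { "#tag:" .. name, content or "" }
    -- Sort attribute names so that the output is deterministic.
    local keys = {}
    for key in pairs(attrs or {}) do
        keys[#keys + 1] = key
    end
    table.sort(keys)
    for _, key in ipairs(keys) do
        parts[#parts + 1] = key .. "=" .. attrs[key]
    end
    return "{{" .. table.concat(parts, "|") .. "}}"
end

print(tag("ref", "Some source", { name = "src1" }))
-- {{#tag:ref|Some source|name=src1}}
```

The result is still wikitext, so it would only take effect if the engine expands what the script returns.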
I feel much more comfortable with the original return {expand = true} idea, which causes the wikitext to be expanded in the new Scribunto call frame.
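A minimal sketch of how the engine side might treat such a return value follows. The exact shape of the returned table is an assumption (a `text` field next to the `expand` flag), and both `preprocess` and `finish_invoke` are hypothetical stand-ins written in Lua purely for illustration.

```lua
-- Hypothetical stand-in for the preprocessor: expands one toy variable.
local function preprocess(text)
    return (text:gsub("{{PAGENAME}}", "Main Page"))
end

-- Hypothetical engine-side handling of a script's return value.
-- Assumed shape: either a plain string, or { text = ..., expand = true }.
local function finish_invoke(result)
    if type(result) == "table" then
        local text = tostring(result.text or "")
        if result.expand then
            -- Expansion happens once, after the script has finished.
            return preprocess(text)
        end
        return text
    end
    return tostring(result)
end

print(finish_invoke({ text = "On {{PAGENAME}}", expand = true }))  -- On Main Page
print(finish_invoke("On {{PAGENAME}}"))                            -- On {{PAGENAME}}
```

With this arrangement the script never calls into the preprocessor while it is running, so the script's time limit and the preprocessor's cost stay separate.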
Please see the wiki page for a more detailed description, including rationale.
Thank you for writing such a detailed description.
I am a bit puzzled about the "always use named arguments scheme" part, because it is not how the standard Lua library works.
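For comparison, here is the same operation in both styles. `string.rep` is the real standard-library function; `repeat_text` is a hypothetical wrapper that shows the named-argument convention.

```lua
-- Positional style, as used throughout the standard Lua library.
print(string.rep("ab", 3))  -- ababab

-- Named style: a single table argument carries every parameter.
-- repeat_text is hypothetical and exists only for this comparison.
local function repeat_text(opts)
    return string.rep(opts.text, opts.count)
end

print(repeat_text{ text = "ab", count = 3 })  -- ababab
```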
I guess those are all my concerns for now.
Thanks, Victor.