Thank you for bringing these issues up for public discussion, Tim;
they are really worth it.
On Tue, May 1, 2012 at 11:15 AM, Tim Starling <tstarling(a)wikimedia.org> wrote:
I've written up a proposed interface between the MediaWiki parser and
Lua:
<https://www.mediawiki.org/wiki/Extension:Scribunto/Parser_interface_design>
In summary: the Lua function is called with a single argument, which
is an object representing the parser interface. The object is roughly
equivalent to a PPFrame.
The object would have a property called "args", which is a table with
its "index" metamethod overridden to provide lazy-initialised access
to the parser function arguments with a brief syntax:
{{#invoke:module|func|name=value}}
function p.func(frame)
    return frame.args.name -- returns "value"
end
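For illustration, the lazily initialised "args" table described above
could be sketched in plain Lua roughly as follows. This is only a sketch:
expandArgument and the raw table are hypothetical stand-ins for whatever
the parser would use to expand an argument on demand, and are not part
of the proposal.

```lua
-- Hypothetical stand-in for the parser expanding one argument on demand.
local raw = { name = "value" }
local expansions = 0
local function expandArgument(key)
  expansions = expansions + 1
  return raw[key]
end

-- Build an args table whose __index metamethod expands an argument on
-- first access and caches it with rawset, so later reads of the same
-- key never reach the metamethod again.
local function makeArgs()
  return setmetatable({}, {
    __index = function(t, key)
      local value = expandArgument(key)
      if value ~= nil then
        rawset(t, key, value)
      end
      return value
    end,
  })
end

local frame = { args = makeArgs() }

local p = {}
function p.func(frame)
  return frame.args.name
end

local result = p.func(frame) -- triggers one expansion
local again = p.func(frame)  -- served from the cached value
```

Under these assumptions, an argument that the script never reads is
never expanded at all.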
I like this part. Also, I really enjoy the idea of making a separate
parser frame for script instead of running it in the parent template's
frame.
I am a bit leery, though, about the part where you suggest that
name-value arguments ({{#invoke:module|func|param=value}}) should be
parsed by the engine, not the script. Don't you have to expand those
arguments in order to parse them, which would make any form of
lazy expansion impossible?
There would be two methods for recursive preprocessing:
* preprocess() provides basic expansion of wikitext
* callTemplate() provides an API for template invocation, since I
imagine that would otherwise be a common use case for preprocess().
Using preprocess() to expand a template with arbitrary arguments would
be difficult.
Like a normal parser function, the Lua function returns text which is
not modified any further by the preprocessor.
This is the part which I strongly oppose. Providing direct
preprocessor access to Lua scripts is a bad idea. There are two key
reasons for this:
1. The preprocessor is slow.
2. You would have to work out many very subtle issues with timeouts
and nested Lua scripts. This includes timeout subtleties caused by the
preprocessor's slowness: load a slow template and, given the small Lua
time limit, it will cause PHP to show a fatal error due to the
emergency timeout. Even if you fix that, the standalone version uses
ulimit, which may be more difficult to fix.
Now, let me go through your suggested use cases and propose some alternatives:
1. As an alternative to a string literal, to include snippets of
wikitext which are intended to be editable by people who don't know
Lua.
I think it would in fact be better if you provided an interface for
getting unprocessed wikitext, or a preprocessor DOM. Preprocessed text
makes it difficult to combine human-readable and machine-readable
versions.
2. During migration, to call complex metatemplates which have not yet
been ported to Lua, or to test migrated components independently
instead of migrating all at once.
That would eventually lead to them becoming permanent. Bugzilla quips,
an authoritative reference on Wikimedia practices, say that
"temporary solutions have a terrible habit of becoming permanent,
around here". Hence I would suggest that we avoid the temptation in
the first place.
3. To provide access to miscellaneous parser functions and variables.
Now, this is a really bad idea. It is like making a scary hack the
official way to do things. It actually defies the first design
principle you state. preprocess( "{{FULLPAGENAME}}" ) is not only much
uglier than using an appropriate API like mw.page.name(), it is also
one of the slowest ways to do this. I have benchmarked it, and it is
actually ~450 times slower than accessing the title object directly.
Lua was (and is) meant to improve the readability of templates, not to
clutter them with stuff like articlesNum = tonumber( preprocess(
"{{NUMBEROFARTICLES:R}}" ) ).
Solution: a proper API would do the job (actually, I am currently working on it).
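To make the readability contrast concrete, here is a small
self-contained mock-up. Everything in it is an assumption made for
illustration: the title table, the one-magic-word preprocess(), and the
body of mw.page.name() are stubs, not the real Scribunto interface, and
the mock says nothing about the actual speed difference.

```lua
-- Mock title object; in MediaWiki this would come from the parser.
local title = { fullText = "Help:Contents" }

-- Mock of the proposed preprocess(): it only knows one magic word.
-- The extra parentheses drop gsub's second return value (the count).
local function preprocess(wikitext)
  return (wikitext:gsub("{{FULLPAGENAME}}", title.fullText))
end

-- Hypothetical direct API in the spirit of mw.page.name().
local mw = { page = {} }
function mw.page.name()
  return title.fullText
end

-- The hack: build wikitext and hand it to the preprocessor.
local viaPreprocess = preprocess("{{FULLPAGENAME}}")

-- The API call: read the value straight from the title object.
local viaApi = mw.page.name()
```

Both calls yield the same string here; the second one states its
intent directly and never touches the preprocessor.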
4. To allow Lua to construct tag invocations, such as <ref> and <gallery>.
We could make a #tag-like function to do this, just as we do with
parser functions.
I feel much more comfortable with the original return {expand =
true} idea, which causes the wikitext to be expanded in the new
Scribunto call frame.
Please see the wiki page for a more detailed description, including
rationale.
Thank you for writing such a detailed description.
I am a bit puzzled about the "always use named arguments scheme" part,
because it is not how the standard Lua library works.
I guess those are all my concerns for now.
Thanks,
Victor.