On Wed, May 2, 2012 at 4:21 AM, Tim Starling <tstarling(a)wikimedia.org> wrote:
We can limit the input size, or temporarily reduce the general parser limits like post-expand include size and node count. We can also hook into PPFrame::expand() to periodically check for a Lua timeout, if that is necessary.
The preprocessor is slow now, it won't become slower by allowing Lua
to call it.
What I meant is that one of the goals of the Lua project is to improve the performance of the template system, and invoking the preprocessor slows it down because of parser overhead.
2. You would have to work out many very subtle issues with timeouts and nested Lua scripts. This includes timeout subtleties caused by the preprocessor's slowness (load a slow template and, given the small Lua time limit, it will cause PHP to show a fatal error due to the emergency timeout; even if you fix that, the standalone version uses ulimit, and it may be more difficult to fix).
The scenario you give in brackets will not happen. If a Lua timeout
occurs when the parser is executing, the Lua script will terminate
when the parser returns control to it. The timeout is not missed.
But would the parser's working time still be counted against the normal Lua time limit?
It doesn't matter if there are several levels of parser/Lua recursion when a timeout occurs. LuaSandbox is able to unwind the stack efficiently.
What I meant is that it should handle the time limit correctly and avoid things like a doubled time allowance for nested scripts.
[...]
1. As an alternative to a string literal, to include snippets of wikitext which are intended to be editable by people who don't know Lua.
I think it would in fact be better if you provided an interface for getting unprocessed wikitext, or a preprocessor DOM. Preprocessed text makes it difficult to combine human-readable and machine-readable versions.
Maybe you are thinking of some sort of virtual wikidata system
involving extracting little snippets of text from infobox invocations
or something. I am not. I would rather use the real wikidata for that.
I am talking about the usual situation where the same data (say, the list of TFAs) is displayed in a variety of ways across the wiki.
I am talking about including large, wikitext-formatted chunks of content language.
Well, then you can just dump its content into the output and tell the parser to expand it.
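As a rough sketch of what that could look like, assuming a Scribunto-style environment where a module receives a frame object with a preprocess() method (the method name and the template name are illustrative, not settled API at this point in the discussion):

```lua
-- Illustrative sketch only: assumes a frame object with a
-- preprocess() method, as discussed in this thread.
local p = {}

function p.intro( frame )
    -- Pull in a human-editable wikitext snippet (page name is made up)
    -- and hand it to the parser for expansion.
    return frame:preprocess( '{{Template:IntroText}}' )
end

return p
```

The module itself stays small; the wikitext stays editable by people who don't know Lua.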
2. During migration, to call complex metatemplates which have not yet been ported to Lua, or to test migrated components independently instead of migrating all at once.
That would eventually lead to them becoming permanent. The Bugzilla quips, an authoritative reference on Wikimedia practices, say that "temporary solutions have a terrible habit of becoming permanent, around here". Hence I would suggest that we avoid the temptation in the first place.
I don't think it's morally wrong to provide a migration tool.
Migration will be a huge task, and will continue for years. People who
migrate metatemplates to Lua will need lots of tools.
Agreed.
(though I am still skeptical about preprocess() and believe there might be pitfalls with it that we are not currently seeing)
3. To provide access to miscellaneous parser functions and variables.
Now, this is a really bad idea. It amounts to making a scary hack the official way to do things, and it actually defies the first design principle you state. preprocess( "{{FULLPAGENAME}}" ) is not only much uglier than using an appropriate API like mw.page.name(), it is also one of the slowest ways to do this. I have benchmarked it, and it is actually ~450 times slower than accessing the title object directly. Lua was (and is) meant to improve the readability of templates, not to clutter them with stuff like articlesNum = tonumber( preprocess( "{{NUMBEROFARTICLES:R}}" ) ).
Solution: a proper API would do the job (actually I am currently working on one).
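For comparison, the two routes might look roughly like this (frame:preprocess is an assumed method name for the preprocessor call, and mw.page.name() is the hypothetical proper-API interface mentioned above, not existing code):

```lua
-- The preprocessor route: round-trip a magic word through the parser.
-- Benchmarked in this thread as roughly 450x slower than direct access.
local name        = frame:preprocess( '{{FULLPAGENAME}}' )
local articlesNum = tonumber( frame:preprocess( '{{NUMBEROFARTICLES:R}}' ) )

-- The proposed proper-API route (illustrative name, not settled API):
local name2 = mw.page.name()
```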
We can provide an API for such things at some point in the future. I
am not very keen on just merging whatever interface you are privately
working on, without any public review.
Neither am I.
I am publishing my proposed interface before I write the code for it, so that I can respond to the comments on it without appearing to be too invested in any given solution. I wish that you would occasionally do the same.
By "working" I meant prototyping the API with some demo functions and
writing a proposed API description for public review.
Rewriting code that you've spent many hours on can be emotionally difficult. Perhaps that's why you've made no more changes to ustring.c despite the problems with its interface.
Work on ustring.c is on hold because of design issues with the pure-Lua implementation. I will probably include it in an API proposal and discuss it together with the other API issues.
4. To allow Lua to construct tag invocations, such as <ref> and <gallery>.
We could make a #tag-like function to do this, just as we do with
parser functions.
I feel much more comfortable with the original return {expand = true} idea, which causes the wikitext to be expanded in the new Scribunto call frame.
That would lead to double expansion in cases where text derived from input arguments needs to be concatenated with wikitext to be expanded. Consider:

return {
    expand = true,
    text = formatHeader( frame.args.gallery_header ) .. '\n' ..
        '<gallery>' .. images .. '</gallery>'
}
formatHeader( "{{{gallery_header}}}" )?
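To make the two alternatives concrete, they might compare like this (frame:extensionTag is an assumed name for the #tag-like function; the expand = true form follows the proposal above, with formatHeader applied to the raw parameter as Victor suggests):

```lua
-- Alternative 1: a #tag-like function, analogous to {{#tag:gallery|...}}.
-- The name 'extensionTag' is an assumption, not settled API.
local wikitext = formatHeader( frame.args.gallery_header ) .. '\n' ..
    frame:extensionTag( 'gallery', images )

-- Alternative 2: return raw wikitext with expand = true and let it be
-- expanded in a new call frame, avoiding the double expansion of the
-- already-expanded argument text.
return {
    expand = true,
    text = formatHeader( '{{{gallery_header}}}' ) .. '\n' ..
        '<gallery>' .. images .. '</gallery>'
}
```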
I am a bit puzzled about the "always use named arguments scheme" part, because it is not how the standard Lua library works.
It gives flexibility for future development. That was not a core
principle driving the design of the standard Lua library.
Agreed.
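In plain Lua the two calling conventions compare like this (the rep wrapper is a made-up example, not anything proposed in this thread):

```lua
-- Positional arguments, standard-library style:
local s1 = string.rep( 'ab', 3 )

-- Named arguments: pass a single table. Slightly more verbose, but new
-- options can be added later without breaking existing callers.
local function rep( args )
    return string.rep( args.text, args.count )
end
local s2 = rep{ text = 'ab', count = 3 }
```

Both calls produce "ababab"; the difference is purely in how future extensions to the signature would be handled.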
Thanks for the detailed response,
Victor.