On Wed, May 2, 2012 at 4:21 AM, Tim Starling <tstarling@wikimedia.org> wrote:
We can limit the input size, or temporarily reduce the general parser limits like post-expand include size and node count. We can also hook into PPFrame::expand() to periodically check for a Lua timeout, if that is necessary.
The preprocessor is slow now; it won't become slower by allowing Lua to call it.
What I meant is that one of the goals of the Lua project is to improve the performance of the template system, and by invoking the preprocessor you slow it down because of the parser overhead.
- You would have to work out many very subtle issues with timeouts
and nested Lua scripts. This includes timeout subtleties caused by the preprocessor's slowness (load a slow template and, given the small Lua time limit, PHP will show a fatal error due to the emergency timeout; even if you fix that, the standalone version uses ulimit, where it may be more difficult to fix).
The scenario you give in brackets will not happen. If a Lua timeout occurs when the parser is executing, the Lua script will terminate when the parser returns control to it. The timeout is not missed.
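For illustration, a minimal sketch of that behaviour (I write the call as frame:preprocess() here; the exact interface is not settled):

    local p = {}
    function p.demo( frame )
        -- If the Lua time limit expires while the parser is busy inside
        -- this call, the script is terminated as soon as control returns
        -- to Lua, not in the middle of the parse.
        local html = frame:preprocess( '{{SomeSlowTemplate}}' )
        return html -- never reached if the timeout fired during the call
    end
    return p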
But the parser's working time would still be counted against the normal Lua time limit?
It doesn't matter if there are several levels of parser/Lua recursion when a timeout occurs. LuaSandbox is able to unwind the stack efficiently.
What I meant is that it should handle the time limit correctly and avoid things like doubling the allowed time because of nested scripts.
[...]
- As an alternative to a string literal, to include snippets of
wikitext which are intended to be editable by people who don't know Lua. I think it would in fact be better if you provided an interface for getting unprocessed wikitext, or a preprocessor DOM. Preprocessed text makes it difficult to combine human-readable and machine-readable versions.
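For example, something like this (both names are hypothetical, just to show the distinction):

    -- hypothetical accessor: the page source exactly as editors see it
    local raw = mw.getRawWikitext( 'Template:TFA list' )
    -- versus the preprocessed form, with templates and parameters
    -- already expanded away:
    local expanded = preprocess( '{{TFA list}}' )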
Maybe you are thinking of some sort of virtual wikidata system that involves extracting little snippets of text from infobox invocations or something. I am not. I would rather use the real wikidata for that.
I am talking about the usual situation where the same data (say, the list of TFAs) is displayed in a variety of ways across the wiki.
I am talking about including large, wikitext-formatted chunks of content language.
Well, then you can just dump its content into the output and tell the parser to expand it.
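Roughly (reusing the expand-flag return convention that comes up below; snippetWikitext stands for whatever chunk was fetched):

    -- the parser expands the human-editable chunk once, on output:
    return { expand = true, text = snippetWikitext }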
- During migration, to call complex metatemplates which have not yet
been ported to Lua, or to test migrated components independently instead of migrating all at once. That would eventually lead to them becoming permanent. The Bugzilla quips, an authoritative reference on Wikimedia practices, say that "temporary solutions have a terrible habit of becoming permanent, around here". Hence I would suggest that we avoid the temptation in the first place.
I don't think it's morally wrong to provide a migration tool. Migration will be a huge task, and will continue for years. People who migrate metatemplates to Lua will need lots of tools.
Agreed.
(though I am still skeptical about preprocess() and believe there might be pitfalls here that we are not currently seeing)
- To provide access to miscellaneous parser functions and variables.
Now, this is a really bad idea. It is like making a scary hack an official way to do things. It actually defies the first design principle you state. preprocess( "{{FULLPAGENAME}}" ) is not only much uglier than using an appropriate API like mw.page.name(), it is also one of the slowest ways to do this. I have benchmarked it, and it is actually ~450 times slower than accessing the title object directly. Lua was (and is) meant to improve the readability of templates, not to clutter them with stuff like articlesNum = tonumber( preprocess( "{{NUMBEROFARTICLES:R}}" ) ). Solution: a proper API would do the job (actually, I am currently working on one).
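For comparison, the two styles side by side (mw.page.name() is the kind of accessor I have in mind, not an existing function):

    -- round-trips through the whole preprocessor:
    local slow = preprocess( '{{FULLPAGENAME}}' )
    -- reads the title object directly; ~450 times faster in my benchmark:
    local fast = mw.page.name()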
We can provide an API for such things at some point in the future. I am not very keen on just merging whatever interface you are privately working on, without any public review.
Neither am I.
I am publishing my proposed interface before I write the code for it, so that I can respond to the comments on it without appearing to be too invested in any given solution. I wish that you would occasionally do the same.
By "working" I meant prototyping the API with some demo functions and writing a proposed API description for public review.
Rewriting code that you've spent many hours on can be emotionally difficult. Perhaps that's why you've made no more changes to ustring.c despite the problems with its interface.
The ustring.c work is on hold because of design issues with the pure-Lua implementation. I will probably include it in an API proposal and discuss it together with the other API issues.
- To allow Lua to construct tag invocations, such as <ref> and <gallery>.
We could make a #tag-like function to do this, just as we do with parser functions.
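Something along these lines (the name and signature are only a sketch):

    -- hypothetical #tag-like helper; equivalent to
    -- {{#tag:gallery|...image list...|widths=120}}:
    local html = tag( 'gallery', images, { widths = 120 } )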
I feel much more comfortable with the original return {expand = true} idea, which causes the wikitext to be expanded in the new Scribunto call frame.
That would lead to double expansion in cases where text derived from input arguments needs to be concatenated with wikitext to be expanded. Consider:
return { expand = true, text = formatHeader( frame.args.gallery_header ) .. '\n' .. '<gallery>' .. images .. '</gallery>' }
formatHeader( "{{{gallery_header}}}" )?
I am a bit puzzled about the "always use named arguments" scheme, because it is not how the standard Lua library works.
It gives flexibility for future development. That was not a core principle driving the design of the standard Lua library.
Agreed.
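For illustration of the flexibility point (formatDate and its options are made up for the example):

    -- a new option can be added later without breaking existing callers:
    formatDate{ date = d, format = 'dmy' }
    formatDate{ date = d, format = 'dmy', lang = 'de' } -- added later
    -- with positional arguments, each addition reshuffles the signature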
Thanks for the detailed response,
Victor