Aryeh Gregor wrote:
On Tue, Jun 30, 2009 at 12:16 PM, Brion Vibber <brion@wikimedia.org> wrote:
- PHP
Advantage: Lots of webbish people have some experience with PHP or can easily find references.
Advantage: we're pretty much guaranteed to have a PHP interpreter available. :)
Disadvantage: PHP is difficult to lock down for secure execution.
I think it would be easy to provide a very simple locked-down version, with most of the features gone. You could, for instance, only permit variable assignment, use of built-in operators, a small whitelist of functions, and conditionals. You could omit loops, function definitions, and abusable functions like str_repeat() (let alone exec(), eval(), etc.) from a first pass. This would still be vastly more powerful, more readable, and faster than ParserFunctions.
IMO by the time you've implemented your whitelisting parser you might as well just interpret it rather than eval()ing. (And of course, eval() might be disabled on the server. :)
Looping constructs are also extremely valuable -- at a minimum in a foreach() kind of way.
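To make the whitelisting idea concrete, here is a minimal sketch of "interpret it rather than eval() it". It is written in Python rather than PHP purely for brevity, and it is not a proposed implementation. It walks a parsed syntax tree and allows only assignment, a few operators, conditionals, and a short function whitelist; everything else is rejected. The names run_template, Rejected, and the contents of the whitelists are all hypothetical.

```python
import ast
import operator

# Hypothetical whitelists, for illustration only. Multiplication and
# string formatting are deliberately left out, since "x" * 10**9 is
# just str_repeat() under another name.
BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub}
CMP_OPS = {ast.Eq: operator.eq, ast.NotEq: operator.ne,
           ast.Lt: operator.lt, ast.Gt: operator.gt}
FUNCS = {"len": len, "upper": str.upper, "lower": str.lower}


class Rejected(Exception):
    """Raised for any construct outside the whitelist."""


def evaluate(node, env):
    """Evaluate one expression node; anything not listed is refused."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        if node.id not in env:
            raise Rejected("undefined variable: " + node.id)
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
        return BIN_OPS[type(node.op)](evaluate(node.left, env),
                                      evaluate(node.right, env))
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in CMP_OPS):
        return CMP_OPS[type(node.ops[0])](evaluate(node.left, env),
                                          evaluate(node.comparators[0], env))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in FUNCS and not node.keywords):
        return FUNCS[node.func.id](*[evaluate(a, env) for a in node.args])
    raise Rejected(type(node).__name__)


def execute(statements, env):
    """Run statements: only plain assignment and if/else are allowed."""
    for stmt in statements:
        if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            env[stmt.targets[0].id] = evaluate(stmt.value, env)
        elif isinstance(stmt, ast.If):
            branch = stmt.body if evaluate(stmt.test, env) else stmt.orelse
            execute(branch, env)
        else:
            # Loops, function definitions, imports, bare calls, etc.
            raise Rejected(type(stmt).__name__)


def run_template(source, env=None):
    """Parse and interpret source, returning the resulting variables."""
    env = dict(env or {})
    execute(ast.parse(source).body, env)
    return env
```

With this sketch, run_template("x = 2 + 3\nif x > 4:\n    label = upper('big')\n") yields x = 5 and label = 'BIG', while a while loop or a call to open() raises Rejected. A foreach-style loop would need its own branch in execute(), presumably with an iteration cap.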
I'd encourage you to consider requiring exec() support for full use of Wikipedia templates, though. Many really big shared hosts allow it, like 1and1.com. Anyone big enough to include much Wikipedia content will likely be on at least a VPS anyway.
It's not about "Wikipedia content", but about being able to grab things you see on another wiki and use or adapt them to your own needs. We get lots of questions from people trying to grab some particular template off Wikipedia to use on their own site for their own needs.
- Python
Advantage: A Python interpreter will be present on most web servers, though not necessarily all. (Windows-based servers especially.)
Wash: Python is probably better known than Lua, but not as well as PHP or JS.
Disadvantage: Like PHP, Python is difficult to lock down securely.
It doesn't matter whether it's present, does it? If the user has exec() support, they could download a binary interpreter for *any* language to their webspace and run it from there regardless of whether the language is supported on the host.
Considering the amount of trouble people have getting texvc working, I wouldn't want to force that on people just to use templates.
Much though I love Python, Lua looks like the better option. First of all, it's *very* small. sudo apt-get install lua50 on my machine uses up only 180 KB of disk space, and the package is 30 KB gzipped.
Python "comes with batteries included", which is to say it's got a huge standard library (most of which of course wouldn't be available in a restricted environment). Lua's bare interpreter of course wins in an embedded-shipping contest. :D
Our current tarballs are 10 MB; we could easily just chuck in Lua binaries for Linux x86-32 and Windows without even noticing the size increase, and allow users to enable it with one line in LocalSettings.php.
Hmm... it might be interesting to experiment with something like this, if it can _really_ be compiled standalone. (Linux binary distribution is a hellhole of incompatible linked library versions!)
It looks to me like Lua would be a lot easier to sandbox. It seems pretty simple to deny all I/O within the language itself, so you'd (hopefully) just need memory and CPU limits.
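For the "memory and CPU limits" part, a rough sketch of what running an external interpreter under limits could look like on a Unix host, shown in Python for illustration. The function name run_limited, the default limits, and the idea of passing the script on stdin (for example with a command like ["lua", "-"]) are assumptions, not anything MediaWiki provides. Denying I/O inside the language is a separate step and is not shown here.

```python
import resource
import subprocess
import sys


def _cap(kind, value):
    """Lower one resource limit, never asking for more than the hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def run_limited(cmd, source, cpu_seconds=1, memory_bytes=256 * 1024 * 1024):
    """Run cmd with source on stdin, under CPU and address-space limits.

    Unix only: relies on the resource module and on preexec_fn.
    """
    def apply_limits():
        # Runs in the child process just before the interpreter starts.
        _cap(resource.RLIMIT_CPU, cpu_seconds)
        _cap(resource.RLIMIT_AS, memory_bytes)

    return subprocess.run(
        cmd,
        input=source,
        capture_output=True,
        text=True,
        preexec_fn=apply_limits,
        # Wall-clock guard: the CPU limit does not stop a sleeping process.
        timeout=cpu_seconds + 5,
    )


if __name__ == "__main__":
    # Demonstrated with the Python interpreter itself so the sketch runs
    # without Lua installed; a bundled Lua binary would take its place.
    result = run_limited([sys.executable, "-c", "print(6 * 7)"], "")
    print(result.stdout.strip())
```

A script that spins forever gets killed by the kernel once it exceeds the CPU limit, which shows up as a negative returncode on the result.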
*nod* being designed as an embedded language is a win. :D
-- brion