Aryeh Gregor wrote:
On Tue, Jun 30, 2009 at 12:16 PM, Brion Vibber <brion(a)wikimedia.org> wrote:
* PHP
Advantage: Lots of webbish people have some experience with PHP or can
easily find references.
Advantage: we're pretty much guaranteed to have a PHP interpreter
available. :)
Disadvantage: PHP is difficult to lock down for secure execution.
I think it would be easy to provide a very simple locked-down version,
with most of the features gone. You could, for instance, only permit
variable assignment, use of built-in operators, a small whitelist of
functions, and conditionals. You could omit loops, function
definitions, and abusable functions like str_repeat() (let alone
exec(), eval(), etc.) from a first pass. This would still be vastly
more powerful, more readable, and faster than ParserFunctions.
IMO by the time you've implemented your whitelisting parser you might as
well just interpret it rather than eval()ing. (And of course, eval()
might be disabled on the server. :)
Looping constructs are also extremely valuable -- at a minimum in a
foreach() kind of way.
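For illustration only, here is a minimal sketch of the "whitelist, then interpret instead of eval()" idea. It is written in Python using the standard ast module rather than PHP, and it is not anything proposed in this thread: the names run, evaluate, Rejected, ALLOWED_FUNCS and MAX_STR are all hypothetical, and the choice of permitted operators and functions is an assumption. It permits assignment, a few operators, a small function whitelist and conditionals, and rejects everything else (loops, function definitions, imports).

```python
import ast
import operator

# Hypothetical whitelist; the thread does not say which functions to allow.
ALLOWED_FUNCS = {"len": len, "upper": str.upper, "lower": str.lower}
# No multiplication: "x" * n would be a str_repeat()-style abuse vector.
BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub}
CMP_OPS = {ast.Eq: operator.eq, ast.NotEq: operator.ne,
           ast.Lt: operator.lt, ast.Gt: operator.gt}
MAX_STR = 10_000  # assumed cap on string results


class Rejected(Exception):
    """Raised for any construct outside the whitelist."""


def evaluate(node, env):
    """Interpret one whitelisted expression node; never calls eval()."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, str)):
        return node.value
    if isinstance(node, ast.Name) and node.id in env:
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
        result = BIN_OPS[type(node.op)](evaluate(node.left, env),
                                        evaluate(node.right, env))
        if isinstance(result, str) and len(result) > MAX_STR:
            raise Rejected("string too long")
        return result
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in CMP_OPS):
        return CMP_OPS[type(node.ops[0])](evaluate(node.left, env),
                                          evaluate(node.comparators[0], env))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in ALLOWED_FUNCS and not node.keywords):
        return ALLOWED_FUNCS[node.func.id](*[evaluate(a, env) for a in node.args])
    raise Rejected(f"{type(node).__name__} is not allowed")


def execute(statements, env):
    """Run assignments and if/else blocks; anything else is rejected."""
    for stmt in statements:
        if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            env[stmt.targets[0].id] = evaluate(stmt.value, env)
        elif isinstance(stmt, ast.If):
            execute(stmt.body if evaluate(stmt.test, env) else stmt.orelse, env)
        else:
            raise Rejected(f"{type(stmt).__name__} is not allowed")


def run(source, env=None):
    """Parse source, interpret it, and return the resulting variables."""
    env = dict(env or {})
    execute(ast.parse(source).body, env)
    return env
```

Calling run('x = 3\nif x > 2:\n    y = upper("ok")\n') returns a dict of the assigned variables, while a while loop or a call to anything outside ALLOWED_FUNCS raises Rejected. A foreach()-style loop could be added as one more whitelisted statement, but it would need an iteration cap.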
I'd encourage you to consider requiring exec() support for full use of
Wikipedia templates, though. Many really big shared hosts allow it, like
1and1.com. Anyone big enough to include much Wikipedia content will
likely be on at least a VPS anyway.
It's not about "Wikipedia content", but about being able to grab things
you see on another wiki and use or adapt them to your own needs. We get
lots of questions from people trying to grab some particular template
off Wikipedia to use on their own site for their own needs.
* Python
Advantage: A Python interpreter will be present on most web servers,
though not necessarily all. (Windows-based servers especially.)
Wash: Python is probably better known than Lua, but not as well as PHP
or JS.
Disadvantage: Like PHP, Python is difficult to lock down securely.
It doesn't matter whether it's present, does it? If the user has
exec() support, they could download a binary interpreter for *any*
language to their webspace and run it from there regardless of whether
the language is supported on the host.
Considering the amount of trouble people have getting texvc working, I
wouldn't want to force that on people just to use templates.
Much though I love Python, Lua looks like the better option. First of
all, it's *very* small. sudo apt-get install lua50 on my machine uses
up only 180 KB of disk space, and the package is 30 KB gzipped.
Python "comes with batteries included", which is to say it's got a huge
standard library (most of which of course wouldn't be available in a
restricted environment). Lua's bare interpreter of course wins in an
embedded-shipping contest. :D
Our current tarballs are 10 MB; we could easily just chuck in Lua binaries
for Linux x86-32 and Windows without even noticing the size increase,
and allow users to enable it with one line in LocalSettings.php.
Hmm... it might be interesting to experiment with something like this,
if it can _really_ be compiled standalone. (Linux binary distribution is
a hellhole of incompatible linked library versions!)
It looks to me like Lua would be a lot easier to sandbox. It seems
pretty simple to deny all I/O within the language itself, so you'd
(hopefully) just need memory and CPU limits.
*nod* being designed as an embedded language is a win. :D
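As a rough sketch of what "just memory and CPU limits" could look like when shelling out to a standalone interpreter binary: the Python snippet below runs a child process under POSIX resource limits. It is Unix-only and purely illustrative; run_limited, its parameters and the default limits are hypothetical, and it does nothing to deny I/O inside the language, which would still have to be handled in the interpreter itself.

```python
import resource
import subprocess
import sys


def _cap(which, value):
    """Set soft and hard limits to value, never above the current hard limit."""
    _soft, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(which, (value, value))


def run_limited(argv, script, cpu_seconds=1,
                memory_bytes=512 * 1024 * 1024, wall_seconds=30):
    """Feed script to argv on stdin with CPU, memory and wall-clock limits.

    argv is the interpreter command line, e.g. ["lua", "-"] assuming a lua
    binary on PATH that reads the program from stdin.
    """
    def apply_limits():
        # Runs in the child just before exec, so only the child is limited.
        _cap(resource.RLIMIT_CPU, cpu_seconds)   # CPU seconds
        _cap(resource.RLIMIT_AS, memory_bytes)   # address space in bytes

    return subprocess.run(argv, input=script, capture_output=True, text=True,
                          timeout=wall_seconds, preexec_fn=apply_limits)


if __name__ == "__main__":
    # Stand-in for a bundled interpreter: use the running Python binary.
    result = run_limited([sys.executable, "-"], "print(1 + 1)\n")
    print(result.returncode, result.stdout.strip())
```

A script that spins forever is killed by the kernel once it exceeds cpu_seconds, so the caller sees a non-zero return code rather than a hung request. The wall-clock timeout covers a child that sleeps instead of burning CPU.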
-- brion