On Tue, Jun 30, 2009 at 6:08 PM, Robert Rohde<rarohde(a)gmail.com> wrote:
In addition to resource limits, any scheme better make sure what's
passed into the programming language and what's passed out makes
sense. For example, you shouldn't have it generating raw HTML and
probably shouldn't let it mess with strip markers. Some of this may
be automatic depending on how it's integrated into the parser. One would
probably also want to limit the size of an allowed output (e.g. don't
let it send 5 MB to the user). Depending on the integration there may
be other control sequences that one needs to catch when it returns as
well.
I was assuming it would just return wikitext, and that would be
integrated into the page and parsed, following all limits on wikitext
(including size) -- just as with current parser functions.
On a separate point, one of the limitations of stand-alone type
sandboxes is that it would make it harder for the code to call other
template pages. One of the few virtues of the current template code
is that it is relatively modular, with more complex templates being
built out of less complex ones. If this programming language is meant
to replace that then it would also need to be able to reference the
results of other template pages. One solution is to pre-expand those
sections (similar to what is done now, I believe), but that can get
rather delicate once one has programming constructs like variable
assignments, looping, and recursion since the template parameters
won't necessarily be fixed at the Preprocessor stage.
I'd assume we'd support some kind of includes. One rudimentary way to
do it would be to run Lua stuff after or during preprocessing, so you
could just include Lua code macro-style using templates. A better way
would probably be to support the include features of the language
itself (I don't know how they work offhand, for Lua).
On Tue, Jun 30, 2009 at 6:12 PM, Jared Williams<jared.williams1(a)ntlworld.com> wrote:
Yeah, would also need time & mem use restrictions.
Which is impossible for in-process use. You'd have to shell out if
you do that, which defeats the entire point of using PHP instead of
something else to begin with.
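
To make "shell out" concrete, here is a minimal sketch of running a
script in a child process under kernel-enforced limits. It is written in
Python purely for illustration (not PHP) and assumes a Linux-like host.
The names run_limited and cap_limit and the default limits are made up
for the example, and the child interpreter stands in for whatever
sandboxed language would actually be used.

```python
import resource
import subprocess
import sys


def cap_limit(kind, value):
    """Lower the soft limit for `kind`, never exceeding the existing hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, hard))


def run_limited(source, cpu_seconds=1, memory_bytes=1 << 30, wall_seconds=5):
    """Run `source` in a separate interpreter with CPU, memory and wall-clock caps.

    Hypothetical helper: the limits are applied in the child only, so the
    parent process (the web server, in the thread's scenario) is unaffected.
    """
    def apply_limits():
        # Executed in the child between fork() and exec().
        cap_limit(resource.RLIMIT_CPU, cpu_seconds)   # CPU time, in seconds
        cap_limit(resource.RLIMIT_AS, memory_bytes)   # address space, in bytes

    return subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=wall_seconds,      # wall-clock cap, enforced by the parent
        preexec_fn=apply_limits,
    )
```

A child that exceeds its CPU limit is killed by a signal and comes back
with a nonzero return code. The price is one process spawn per
evaluation, which is exactly the overhead being complained about here.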
On Tue, Jun 30, 2009 at 7:16 PM, Andrew Garrett<agarrett(a)wikimedia.org> wrote:
That's just scary. We'd definitely want to do the validation as close
as possible to the actual eval()ing, to minimise backdoors like
Special:Import et al.
You'd be saving the code to a file on disk somewhere, probably named
using a hash of the input. The only thing saving the code would be
the code that sanitizes it. There's no way anything could go wrong
unless an attacker gains filesystem write access, in which case you're
hosed anyway. Parsing PHP on every page view when you could cache it
in APC is crazy.
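
A rough sketch of the "file named using a hash of the input" idea, in
Python for illustration only. Everything named here is an assumption:
sanitize is a trivial stand-in (a real validator would be far more
involved), and store, cache_path and the ".php" suffix are invented for
the example.

```python
import hashlib
import os
import tempfile


def sanitize(source):
    """Placeholder validator (assumption): reject anything that calls eval()."""
    if "eval(" in source:
        raise ValueError("disallowed construct in template code")
    return source


def cache_path(cache_dir, source):
    """Derive the cache file name from a SHA-256 hash of the raw input."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".php")


def store(cache_dir, source):
    """Sanitize `source` and write it to its hash-named file, once."""
    path = cache_path(cache_dir, source)
    if not os.path.exists(path):
        clean = sanitize(source)  # the only code path that ever writes
        fd, tmp = tempfile.mkstemp(dir=cache_dir)
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(clean)
        os.replace(tmp, path)     # atomic rename: no half-written files
    return path
```

Identical input always maps to the same file, so a second call is just
an existence check. Note that this does nothing about other processes
that can write to the same directory.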
On Tue, Jun 30, 2009 at 7:24 PM, Hay (Husky)<huskyr(a)gmail.com> wrote:
That leaves us with Lua and JavaScript, which are both small and
efficient languages meant to solve tasks like this. Remember, I'm
talking about 'core' JavaScript here, not with all DOM methods and
stuff. If you strip that all out (take a look at the 1.5 core
reference at Mozilla.com:
https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference) you
get a pretty nice and simple language that isn't very large. Both
would require a new parser and/or installed compilers on the
server-side. Compared to the disadvantages of other options, that
seems like a pretty small loss for a great win.
Reasonable enough, yeah. Sandboxing might be easier too. What are some
standalone JavaScript interpreters we could use? Ideally we'd use a
heavily-optimized JIT compiler, like V8 or TraceMonkey, but I don't
know if those work standalone.
On Tue, Jun 30, 2009 at 8:33 PM, Brion Vibber<brion(a)wikimedia.org> wrote:
That's why we want to fix it! :)
It *should* be fairly trivial to fetch a template/plugin sort of thing
off of one wiki and put it on another. Consider this as one of our goals
for next-gen templating.
Eh. Then that really ties our hands. If we have to have support for
shared hosts without exec() support, then I don't see any viable
option except sanitized PHP.
On Tue, Jun 30, 2009 at 8:37 PM, Brion Vibber<brion(a)wikimedia.org> wrote:
Executing PHP from apache-writable files saved on disk is also a
security danger.
The original implementation of the MonoBook skin used the TAL templating
language, which was compiled into executable PHP at runtime and stored
in /tmp so it could be cached for the next view.
In addition to difficulties with hosts which had misconfigured /tmp
directories, we found that people sharing their hosts with
poorly-secured WordPress installations would end up finding their wikis
hacked -- worms exploiting vulnerabilities in other PHP apps would hop
around the system modifying any .php files they could write to...
including the cached PHPTAL templates.
It could be eval()ed by default, but the performance wins from using
APC would surely be huge. If you set it up carefully it should be
safe enough.
On Tue, Jun 30, 2009 at 8:41 PM, Brian<Brian.Mingus(a)colorado.edu> wrote:
There is nothing in the OP that indicates that we are keeping the
current template code or even that it would be desirable. Whatever
facilities the language we choose has for including other files and
passing arguments to functions is 100% sufficient.
We're talking about changing how templates are written, not how
they're called. Changing the template call syntax is an entirely
different discussion that's orthogonal to this one.
On Tue, Jun 30, 2009 at 9:02 PM, Trevor Parscal<tparscal(a)wikimedia.org> wrote:
Seems like JSON syntax is pretty simple and could be a big improvement
to how templates are currently invoked.
I'm not sure where you'd use JSON here?