On Wed, Mar 26, 2014 at 3:21 AM, Nuria Ruiz nuria@wikimedia.org wrote:
So for string-based systems to be as safe as dom ones, we also need a layer of policy and code review that we might not need with a dom-based system.
String-based template engines (like Handlebars) escape by default; you have to use "special" markup for them not to escape. Can you explain in more detail what the security concern is with those?
Correct. The issue is that they apply the same escaping regardless of the HTML context. So, in Twig and Mustache, <div class={{something}}></div> is vulnerable if something is set to "1234 onClick=doSomething()". So policy/code review is needed to say that attributes with user-supplied data must be quoted in a way compatible with the templating engine (' or " for Twig, " for Mustache, since Mustache doesn't escape single quotes).
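To make the unquoted-attribute case concrete, here is a minimal Python sketch. It does not use Twig or Mustache: the render_div_unquoted and render_div_quoted helpers are hypothetical stand-ins for a string-based engine, and Python's html.escape stands in for the engine's single, context-blind escaping function.

```python
from html import escape


def render_div_unquoted(value: str) -> str:
    # Hypothetical stand-in for the template <div class={{something}}></div>:
    # the value is escaped, then dropped into an unquoted attribute.
    return "<div class={}></div>".format(escape(value))


def render_div_quoted(value: str) -> str:
    # Same escaping, but the template author quoted the attribute.
    return '<div class="{}"></div>'.format(escape(value))


payload = "1234 onClick=doSomething()"

# The payload has no <, >, & or quote characters, so escaping leaves it
# untouched and the space ends the class value: onClick becomes an attribute.
unquoted = render_div_unquoted(payload)

# Inside quotes the same payload stays part of the class value.
quoted = render_div_quoted(payload)

# An attempt to close the quote is neutralised, because " is escaped.
breakout = render_div_quoted('1234" onClick="doSomething()')

print(unquoted)
print(quoted)
print(breakout)
```

The escaping function is identical in all three calls; only the template's quoting decides whether the output is safe, which is why the rule has to live in policy and code review.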
On Wed, Mar 19, 2014 at 7:51 PM, Chris Steipp csteipp@wikimedia.org wrote:
On Tue, Mar 18, 2014 at 8:27 PM, Sumana Harihareswara <sumanah@wikimedia.org> wrote:
I'm trying to understand what our current situation is and what our choices are around HTML templating systems and MediaWiki, so I'm gonna note what I think I understand so far in this mail and then would love for people to correct me. TL;DR - did we already consense on a templating system and I just missed it?
Description: An HTML templating system (also known as a templating engine) lets you (the programmer) write something that looks more like a document than it looks like code, then has hooks/entry points/macro substitution points (for user input and whatnot) that then invoke code, then emits finished HTML for the browser to render.
Examples: PHP itself is kinda a templating language. In the PHP world, Smarty is a somewhat more mature/old-school choice. Mustache.js is a popular modern choice. And in other languages, you'd pick a lot of the MVC frameworks that are popular, e.g. Django or Jinja in Python.
Spectrum of approaches: One approach treats HTML as a string ("here's a bunch of bytes to interpolate"). From a security perspective, this is dangerously easy to have vulnerabilities in, because you just naively insert strings. Then on the other end of the spectrum, you have code that always keeps the document object model (DOM) in memory, so the programmer is abstractly manipulating that data model and passing around an object. Sure, it spits out HTML in the end, but inherent in the method for turning those objects into HTML is a sanitization step, so that's inherently more secure. There's some discussion at https://www.mediawiki.org/wiki/Parsoid/Round-trip_testing/Templates . I presume we want the latter, but that the former model is more performant?
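As a rough illustration of the DOM end of that spectrum, here is a minimal Python sketch. It uses the standard library's xml.etree.ElementTree purely as a stand-in for a DOM-based templating system (it is not Knockoff, OOjs UI, or any system discussed in this thread), and the render_div helper is hypothetical. The point is only that the serializer, not the programmer, decides how attribute values and text are quoted and escaped.

```python
import xml.etree.ElementTree as ET


def render_div(css_class: str, text: str) -> str:
    # Build the element as an object; no HTML string is assembled by hand.
    div = ET.Element("div")
    div.set("class", css_class)
    div.text = text
    # Serialization quotes every attribute value and escapes text content.
    return ET.tostring(div, encoding="unicode", method="html")


# The attribute value is always emitted inside quotes, so a space cannot
# start a new attribute.
attr_attack = render_div("1234 onClick=doSomething()", "hello")

# Text content is escaped for the text context, so markup stays inert.
text_attack = render_div("note", "<script>alert(1)</script>")

print(attr_attack)
print(text_attack)
```

Because each value is attached to a known place in the tree (attribute versus text), the serializer can apply the escaping that fits that context, which a single string-level escape function cannot do.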
I don't want to build too much of a straw man against string-based systems, so it's probably more appropriate to say that the same escaping is applied to all strings regardless of the html context, or the developer is responsible for applying custom escaping. So for string-based systems to be as safe as dom ones, we also need a layer of policy and code review that we might not need with a dom-based system.
Performance of the dom-based systems has turned out to be not that bad, but performance is a major factor in any engine we go with.
We talked about this stuff in https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2014-02-21 and https://www.mediawiki.org/wiki/Talk:Architecture_Summit_2014/HTML_templating... . Based on that plus https://www.mediawiki.org/wiki/Architecture_Summit_2014/RFC_clusters#HTML_te... it seems like we are supposed to get consensus on which system(s) to use, and we kind of have four things we could choose:
- oojs - https://www.mediawiki.org/wiki/OOjs_UI -- could use this toolkit with one of the template approaches below, or maybe this is enough by itself! Currently used inside VisualEditor and I am not sure whether any other MediaWiki extensions or teams are using it? This is a DOM-based templating system.
Template approaches which are competing?:
- MVC framework - Wikia has written their own templating library that Wikia uses (Nirvana). Owen Davis is talking about this tomorrow in the RFC review meeting. https://www.mediawiki.org/wiki/Requests_for_comment/MVC_framework
- mustache.js stuff - Ryan Kaldari and Chris Steipp mentioned this I think?
- Knockout-compatible implementation in Node.js & PHP https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library/... and https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library/... , being worked on by Gabriel Wicke, Matt Walker, and others. DOM-based.
I think https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_libraryc... most of the current thinking. While Knockoff is being developed, Handlebars (and the PHP port of it) seems to be the leader for a string-based solution.
There's also an OutputPage refactor suggested in https://www.mediawiki.org/wiki/Requests_for_comment/OutputPage_refactor that's part of the HTML Templating RFC Cluster https://www.mediawiki.org/wiki/Architecture_Summit_2014/RFC_clusters#HTML_te... .
I guess my biggest question right now is whether I have all the big moving parts right in my summary above. Thanks.
-- Sumana Harihareswara Senior Technical Writer Wikimedia Foundation
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l