HTML templating systems & MediaWiki - is this summary right? - Wikitech-l


Sumana Harihareswara

18 Mar 2014

8:27 p.m.

I'm trying to understand what our current situation is and what our choices are around HTML templating systems and MediaWiki, so I'm gonna note what I think I understand so far in this mail and then would love for people to correct me. TL;DR - did we already consense on a templating system and I just missed it?

Description: An HTML templating system (also known as a templating engine) lets you (the programmer) write something that looks more like a document than it looks like code, then has hooks/entry points/macro substitution points (for user input and whatnot) that then invoke code, then emits finished HTML for the browser to render.
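A minimal sketch of that idea in JavaScript may help. The `render` function and the `{{name}}` placeholder syntax here are invented for illustration (loosely modelled on Mustache-style markers); this is not any particular library's API.

```javascript
// Hypothetical minimal renderer: it only illustrates "document with substitution points".
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Replace each {{key}} substitution point with the escaped value from `data`.
// Keys that are missing from `data` render as an empty string.
function render(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? escapeHtml(data[key]) : ''
  );
}

// The template reads like a document; the data comes from code (or from user input).
const template = '<p>Hello, {{user}}!</p>';
const html = render(template, { user: '<b>Sumana</b>' });
console.log(html);
```

The markup in the value comes out escaped, so the browser shows it as text rather than interpreting it.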

Examples: PHP itself is kinda a templating language. In the PHP world, Smarty is a somewhat more mature/old-school choice. Mustache.js is a popular modern choice. And in other languages, you'd pick a lot of the MVC frameworks that are popular, e.g. Django or Jinja in Python.

Spectrum of approaches: One approach treats HTML as a string ("here's a bunch of bytes to interpolate"). From a security perspective, this is dangerously easy to have vulnerabilities in, because you just naively insert strings. Then on the other end of the spectrum, you have code that always keeps the document object model (DOM) in memory, so the programmer is abstractly manipulating that data model and passing around an object. Sure, it spits out HTML in the end, but inherent in the method for turning those objects into HTML is a sanitization step, so that's inherently more secure. There's some discussion at https://www.mediawiki.org/wiki/Parsoid/Round-trip_testing/Templates . I presume we want the latter, but that the former model is more performant?
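The two ends of that spectrum can be sketched roughly as follows. Node.js has no built-in DOM, so the `el`, `text` and `serialize` helpers below are hypothetical stand-ins for a DOM-style API, assumed purely for illustration.

```javascript
// String end of the spectrum: the value is pasted into the markup unchanged.
function renderAsString(comment) {
  return '<p class="comment">' + comment + '</p>';
}

// Object end of the spectrum: build a node tree, turn it into HTML only at the end.
function el(tag, attrs, children) {
  return { tag, attrs: attrs || {}, children: children || [] };
}
function text(value) {
  return { text: String(value) };
}
function escapeText(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}
function escapeAttr(s) {
  return escapeText(s).replace(/"/g, '&quot;');
}
// Serialization knows whether it is emitting text or an attribute value,
// so it can escape for that context and always quote attributes.
function serialize(node) {
  if ('text' in node) return escapeText(node.text);
  const attrs = Object.keys(node.attrs)
    .map((name) => ' ' + name + '="' + escapeAttr(String(node.attrs[name])) + '"')
    .join('');
  return '<' + node.tag + attrs + '>' + node.children.map(serialize).join('') + '</' + node.tag + '>';
}

const hostile = '<script>alert(1)</script>';
const viaString = renderAsString(hostile);
const viaTree = serialize(el('p', { class: 'comment' }, [text(hostile)]));
console.log(viaString);
console.log(viaTree);
```

In the string version the script tag reaches the output intact; in the tree version the escaping happens in one place, inside `serialize`, rather than at every call site.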

We talked about this stuff in https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2014-02-21 and https://www.mediawiki.org/wiki/Talk:Architecture_Summit_2014/HTML_templating... . Based on that plus https://www.mediawiki.org/wiki/Architecture_Summit_2014/RFC_clusters#HTML_te... it seems like we are supposed to get consensus on which system(s) to use, and we kind of have four things we could choose:

* oojs - https://www.mediawiki.org/wiki/OOjs_UI -- could use this toolkit with one of the template approaches below, or maybe this is enough by itself! Currently used inside VisualEditor and I am not sure whether any other MediaWiki extensions or teams are using it? This is a DOM-based templating system.

Template approaches which are competing?:

* MVC framework - Wikia has written their own templating library that Wikia uses (Nirvana). Owen Davis is talking about this tomorrow in the RFC review meeting. https://www.mediawiki.org/wiki/Requests_for_comment/MVC_framework

* mustache.js stuff - Ryan Kaldari and Chris Steipp mentioned this I think?

* Knockout-compatible implementation in Node.js & PHP https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library/... and https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library/... , being worked on by Gabriel Wicke, Matt Walker, and others. DOM-based.

There's also an OutputPage refactor suggested in https://www.mediawiki.org/wiki/Requests_for_comment/OutputPage_refactor that's part of the HTML Templating RFC Cluster https://www.mediawiki.org/wiki/Architecture_Summit_2014/RFC_clusters#HTML_te... .

I guess my biggest question right now is whether I have all the big moving parts right in my summary above. Thanks.

--
Sumana Harihareswara
Senior Technical Writer
Wikimedia Foundation


Peter Kaminski

19 Mar

1:10 a.m.


Hi Sumana,

I think a key concept you might want to capture is "separation of concerns" -- a templating engine allows separation of presentation logic from business logic. Often, the two are handled by different people with different skills, in service of separate goals. So having the templating engine specialized for presentation logic is important.

The point isn't so much that the templates look like a document, as much as they can be written in a simplified language that's specialized for outputting documents.

Also, I don't know if these are useful in this context, but I wanted to point to two of the cutting-edge template engines from the PHP frameworks world, as representatives of modern PHP template thinking:

Fabien Potencier's Twig http://twig.sensiolabs.org/

Laravel's Blade http://laravel.com/docs/templates#blade-templating http://culttt.com/2013/09/02/using-blade-laravel-4/

Neither of these, though, is oriented to dual JavaScript/PHP support, which I think is an interesting path to consider.

And last, two Wikipedia pages that might be relevant:

https://en.wikipedia.org/wiki/Web_template_system https://en.wikipedia.org/wiki/Comparison_of_web_template_engines

Pete


Dmitriy Sintsov

3:12 a.m.


On 19.03.2014 12:10, Peter Kaminski wrote:

...


The MediaWiki Parser itself could be used for skinning as well, with some features disabled. But actually plain nested associative arrays are good enough as a template engine; they are used, for example, in Drupal: https://drupal.org/node/930760

    // New method of generating the render array and returning that
    function mymodule_ra_page() {
      $output = array(
        'first_para' => array(
          '#type' => 'markup',
          '#markup' => '<p>A paragraph about some stuff...</p>',
        ),
        'second_para' => array(
          '#items' => array('first item', 'second item', 'third item'),
          '#theme' => 'item_list',
        ),
      );
      return $output;
    }
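One way to picture how such an array becomes HTML is sketched below in JavaScript. This is not Drupal's actual rendering pipeline: the `views` table and `renderArray` function are hypothetical, and only the element structure is borrowed from the example above.

```javascript
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical view methods, looked up by an element's '#theme' or '#type' value.
const views = {
  // Markup elements are treated as trusted and emitted unchanged.
  markup: (element) => element['#markup'],
  // List items are plain text, so they are escaped on output.
  item_list: (element) =>
    '<ul>' + element['#items'].map((item) => '<li>' + escapeHtml(item) + '</li>').join('') + '</ul>',
};

// Render each child element in order; keys starting with '#' are properties, not children.
function renderArray(output) {
  return Object.keys(output)
    .filter((key) => key.charAt(0) !== '#')
    .map((key) => {
      const element = output[key];
      const view = views[element['#theme'] || element['#type']];
      if (!view) throw new Error('No view for element "' + key + '"');
      return view(element);
    })
    .join('');
}

const output = {
  first_para: { '#type': 'markup', '#markup': '<p>A paragraph about some stuff...</p>' },
  second_para: { '#items': ['first item', 'second item', 'third item'], '#theme': 'item_list' },
};

// Late manipulation: the structure can still be changed before anything is rendered.
output.second_para['#items'].push('fourth item');

const html = renderArray(output);
console.log(html);
```

Because the page stays a data structure until the last moment, other code can add, remove or reorder elements without parsing HTML.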

Each key of such an associative array can be mapped to a view method, while its values are the view parameters (properties). It is quite a flexible, fast and powerful approach, which also allows late manipulation of the output.

Dmitriy

Chris Steipp

11:51 a.m.


On Tue, Mar 18, 2014 at 8:27 PM, Sumana Harihareswara <sumanah@wikimedia.org> wrote:


> Spectrum of approaches: One approach treats HTML as a string ("here's a bunch of bytes to interpolate"). From a security perspective, this is dangerously easy to have vulnerabilities in, because you just naively insert strings. Then on the other end of the spectrum, you have code that always keeps the document object model (DOM) in memory, so the programmer is abstractly manipulating that data model and passing around an object. Sure, it spits out HTML in the end, but inherent in the method for turning those objects into HTML is a sanitization step, so that's inherently more secure. There's some discussion at https://www.mediawiki.org/wiki/Parsoid/Round-trip_testing/Templates . I presume we want the latter, but that the former model is more performant?

I don't want to build too much of a straw man against string-based systems, so it's probably more appropriate to say that the same escaping is applied to all strings regardless of the HTML context, or the developer is responsible for applying custom escaping. So for string-based systems to be as safe as DOM ones, we also need a layer of policy and code review that we might not need with a DOM-based system.

Performance of the DOM-based systems has turned out to be not that bad, but performance is a major factor in any engine we go with.


I think https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_libraryc... most of the current thinking. While Knockoff is being developed, Handlebars (and the PHP port of it) seems to be the leader for a string-based solution.


Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Nuria Ruiz

26 Mar

3:21 a.m.


> So for string-based systems to be as safe as dom ones, we also need a layer of policy and code review that we might not need with a dom-based system.

String-based template engines (like Handlebars) do escape by default; you have to use "special" markup for them not to escape. Can you explain in more detail what the security concern with those is?


Chris Steipp

6:43 a.m.


On Wed, Mar 26, 2014 at 3:21 AM, Nuria Ruiz <nuria@wikimedia.org> wrote:

> > So for string-based systems to be as safe as dom ones, we also need a layer of policy and code review that we might not need with a dom-based system.
>
> String based template engines (like handlebars) do escape as a default, you have to use "special" markup for it not to escape. Can you explain in more detail what is the security concern with those?

Correct. The issue is that they apply the same escaping, regardless of the HTML context. So, in Twig and Mustache, <div class={{something}}></div> is vulnerable if something is set to "1234 onClick=doSomething()". So policy/code review is needed to say that attributes with user-supplied data must be quoted in a way compatible with the templating engine (' or " for Twig, " for Mustache since Mustache doesn't escape single quotes).
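The unquoted-attribute problem can be reproduced without any template engine. The sketch below assumes a simplified `escapeHtml` that handles only `&`, `<`, `>` and `"`; the two render functions are hypothetical stand-ins for templates with an unquoted and a quoted `class` attribute, not Twig or Mustache themselves.

```javascript
// HTML-only escaping, similar in spirit to a string-based engine's default.
// Note that space, "=", "(" and ")" pass through untouched.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Stand-in for the template <div class={{something}}></div>
function renderUnquoted(something) {
  return '<div class=' + escapeHtml(something) + '></div>';
}

// Stand-in for the template <div class="{{something}}"></div>
function renderQuoted(something) {
  return '<div class="' + escapeHtml(something) + '"></div>';
}

const something = '1234 onClick=doSomething()';

// Unquoted: the space ends the class value, and onClick becomes a real attribute.
const unquoted = renderUnquoted(something);

// Quoted: the whole payload stays inside the class value.
const quoted = renderQuoted(something);

// Quoted, with a payload that tries to close the attribute: the quote is escaped.
const breakout = renderQuoted('1234" onClick="doSomething()');

console.log(unquoted);
console.log(quoted);
console.log(breakout);
```

The escaping function is the same in all three cases; only the context it is used in changes, which is why a quoting policy or review step is needed.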


Nuria Ruiz

9:32 a.m.


> The issue is that they apply the same escaping, regardless of the html context. So, in Twig and mustache, <div class={{something}}></div> is vulnerable, if something is set to "1234 onClick=doSomething()".

Right, the engine would render:

<div class=1234 onClick=doSomething()></div>

because it only escapes HTML by default. Now, note that the problem can be fixed with <div class={{makeStringSafe something}}>

where "makeStringSafe" is a function defined by us and executed there that escapes to our liking.

This is equivalent to what we would need to do in PHP server side if we used PHP native templating. We would need to implement an escaping function like "makeStringSafe" that we would execute like:

<div class="<?php echo View::makeStringSafe($this->something); ?>"></div>

What I wanted to clarify (regarding Sumana's first e-mail) is that when it comes to security we would need to take the same precautions with a string-based JavaScript engine as we would with any PHP engine we choose. Namely, as you said, spot the lack of "makeStringSafe" via code reviews (CRs).

To be completely fair, the "makeStringSafe" could be inserted with a build/template compilation process and thus we would not need CRs but rather we could rely on automation.
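A rough sketch of both ideas follows. The thread never specifies what `makeStringSafe` does, so the version here is an assumption: it encodes every character other than ASCII letters and digits as a numeric character reference. `addAttributeEscaping` is a hypothetical build step, not part of Handlebars or any other engine, and it only handles the simple `attr={{name}}` pattern.

```javascript
// Assumed behaviour for the helper: anything that is not an ASCII letter or
// digit becomes a numeric character reference, so the value cannot end the
// attribute even when the attribute is unquoted.
function makeStringSafe(value) {
  return String(value).replace(/[^A-Za-z0-9]/gu, (ch) => '&#' + ch.codePointAt(0) + ';');
}

// Hypothetical build step: rewrite bare {{name}} attribute values so that they
// go through the helper. Values that already call a helper are left alone.
function addAttributeEscaping(templateSource) {
  return templateSource.replace(
    /=(["']?)\{\{\s*(\w+)\s*\}\}\1/g,
    '=$1{{makeStringSafe $2}}$1'
  );
}

const safeValue = makeStringSafe('1234 onClick=doSomething()');
const rewritten = addAttributeEscaping('<div class={{something}}></div>');

console.log(safeValue);
console.log(rewritten);
```

Running the rewrite at template compilation time moves the check from reviewers to tooling, at the cost of having to recognise every place where a value lands in an attribute.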

On Wed, Mar 26, 2014 at 2:43 PM, Chris Steipp csteipp@wikimedia.org wrote:

...

On Wed, Mar 26, 2014 at 3:21 AM, Nuria Ruiz nuria@wikimedia.org wrote:

...
...
So for string-based systems to be as safe as dom ones, we also need a layer of policy and code review that

we

...
might not need with a dom-based system.

String based template engines (like handlebars) do escape as a default,

you

...
have to use "special" markup for it not to escape. Can you explain in

more

...
detail what is the security concern with those?

Correct. The issue is that they apply the same escaping, regardless of the html context. So, in Twig and mustache, <div class={{something}}></div> is vulnerable, if something is set to "1234 onClick=doSomething()". So policy/code review is needed to say that attributes with user-supplied data must be quoted in a way compatible with the templating engine (' or " for Twig, " for Mustache since Mustache doesn't escape single quotes).

...
On Wed, Mar 19, 2014 at 7:51 PM, Chris Steipp csteipp@wikimedia.org wrote:

...
On Tue, Mar 18, 2014 at 8:27 PM, Sumana Harihareswara < sumanah@wikimedia.org

...
wrote:

...
I'm trying to understand what our current situation is and what our choices are around HTML templating systems and MediaWiki, so I'm

gonna

...
...
...
note what I think I understand so far in this mail and then would

love

...
...
...
for people to correct me. TL;DR - did we already consense on a templating system and I just missed it?

Description: An HTML templates system (also known as a templating engine) lets you (the programmer) write something that looks more

like

...
a

...
...
document than it looks like code, then has hooks/entry points/macro substitution points (for user input and whatnot) that then invoke

code,

...
...
...
then emits finished HTML for the browser to render.

Examples: PHP itself is kinda a templating language. In the PHP

world,

...
...
...
Smarty is a somewhat more mature/old-school choice. Mustache.js is a popular modern choice. And in other languages, you'd pick a lot of

the

...
...
...
MVC frameworks that are popular, e.g. Django or Jinja in Python.

Spectrum of approaches: One approach treats HTML as a string

("here's a

...
...
...
bunch of bytes to interpolate"). From a security perspective, this is dangerously easy to have vulnerabilities in, because you just naively insert strings. Then on the other end of the spectrum, you have code that always keeps the document object model (DOM) in memory, so the programmer is abstractly manipulating that data model and passing

around

...
...
an object. Sure, it spits out HTML in the end, but inherent in the method for turning those objects into HTML is a sanitization step, so that's inherently more secure. There's some discussion at https://www.mediawiki.org/wiki/Parsoid/Round-trip_testing/Templates.

I

...
...
presume we want the latter, but that the former model is more

performant?

...
...
I don't want to build too much of a straw man against string-based

systems,

...
so it's probably more appropriate to say that the same escaping is

applied

...
to all strings regardless of the html context, or the developer is responsible for applying custom escaping. So for string-based systems

to

...
be

...
as safe as dom ones, we also need a layer of policy and code review

that

...
we

...
might not need with a dom-based system.

Performance of the dom-based systems has turned out to be not that bad,

but

...
performance is a major factor in any engine we go with.

...
We talked about this stuff in

https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2014-02-21

...
...
...
and

https://www.mediawiki.org/wiki/Talk:Architecture_Summit_2014/HTML_templating...

...
...
...
. Based on that plus

https://www.mediawiki.org/wiki/Architecture_Summit_2014/RFC_clusters#HTML_te...

...
...
...
it seems like we are supposed to get consensus on which system(s) to use, and we kind of have four things we could choose:

oojs - https://www.mediawiki.org/wiki/OOjs_UI -- could use this toolkit with one of the template approaches below, or maybe this is enough by itself! Currently used inside VisualEditor and I am not sure whether any other MediaWiki extensions or teams are using it? This is a DOM-based templating system.

Template approaches which are competing?:

MVC framework - Wikia has written their own templating library that Wikia uses (Nirvana). Owen Davis is talking about this tomorrow in the RFC review meeting. https://www.mediawiki.org/wiki/Requests_for_comment/MVC_framework

mustache.js stuff - Ryan Kaldari and Chris Steipp mentioned this I think?

Knockout-compatible implementation in Node.js & PHP https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library/... and https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library/... , being worked on by Gabriel Wicke, Matt Walker, and others. DOM-based.

I think https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_libraryc... most of the current thinking. While Knockoff is being developed, Handlebars (and the PHP port of it) seems to be the leader for a string-based solution.

There's also an OutputPage refactor suggested in https://www.mediawiki.org/wiki/Requests_for_comment/OutputPage_refactor that's part of the HTML Templating RFC Cluster https://www.mediawiki.org/wiki/Architecture_Summit_2014/RFC_clusters#HTML_te... .

I guess my biggest question right now is whether I have all the big moving parts right in my summary above. Thanks.

-- Sumana Harihareswara Senior Technical Writer Wikimedia Foundation

Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Daniel Friesen

9:44 a.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

On 2014-03-26, 9:32 AM, Nuria Ruiz wrote:

>> The issue is that they apply the same escaping, regardless of the html context. So, in Twig and mustache, <div class={{something}}></div> is vulnerable, if something is set to "1234 onClick=doSomething()".
>
> Right, the engine would render:
>
> <div class=1234 onClick=doSomething()> </div>
>
> because it only escapes HTML by default. Now, note that the problem can be fixed with <div class={{makeStringSafe something}}>
>
> Where "makeStringSafe" is a function defined by us and executed there that escapes to our liking.

How does a custom function jammed into the middle of a Mustache template fix the issue, when the issue is not that foo={{something}} doesn't escape but that quoting is needed instead of escaping, and Mustache isn't context sensitive, so neither Mustache nor a custom function knows that foo={{something}} is an attribute value in need of quoting?
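To make the point concrete, here is a minimal sketch of the failure. It assumes a hypothetical `escapeHtml` that stands in for an engine's default escaping (it is not the actual implementation of Mustache, Twig, or Handlebars), and two hand-written render functions that mimic an unquoted and a quoted attribute slot.

```javascript
// Hypothetical stand-in for an engine's default escaping:
// it only rewrites HTML metacharacters.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Mimics the template <div class={{something}}></div> (no quotes).
function renderUnquoted(something) {
  return "<div class=" + escapeHtml(something) + "></div>";
}

// Mimics the template <div class="{{something}}"></div>.
function renderQuoted(something) {
  return '<div class="' + escapeHtml(something) + '"></div>';
}

const payload = "1234 onClick=doSomething()";

// The payload contains no HTML metacharacters, so escaping changes nothing
// and the space ends the unquoted attribute value.
console.log(renderUnquoted(payload));
// prints: <div class=1234 onClick=doSomething()></div>

// With quotes in the template the same payload stays inside the class value.
console.log(renderQuoted(payload));
// prints: <div class="1234 onClick=doSomething()"></div>
```

The escaping function is identical in both cases; only the quoting written by the template author differs, and that is exactly the part a context-insensitive engine cannot check.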

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]

C. Scott Ananian

10:08 a.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

Note that my favorite "handlebars-like" template engine is currently "spacebars", developed as part of the meteor.js project. This does safe structured interpolation, so it's not really a "string-based" template engine any more -- but it still contains the same minimalist markup (it basically looks identical to handlebars, it's just a new implementation).

I've worked some with gwicke on spacebars support for his templating project.

The other important aspect which I haven't seen mentioned yet is editor support. Where does using the template system fall, on a scale with "writing an article for humans" on one side and "writing executable code" on the other?

Even this is not necessarily straightforward to assess. As I understand it, one of the advantages of KnockOff is that, although at a raw HTML level it looks cumbersome for a human to author, it is structured in a way that would make it easier to integrate with something like Visual Editor, with simple properties that can be added to sample text to turn it into a template.

I personally lean toward "handlebars"-style engines, because the extremely minimalist syntax makes it easy for non-coders to author directly. A user-friendly editor for such a template language would probably expose separate "content" and "code" views of a template. Basic templates wouldn't have any code, but advanced templates would use something like Scribunto for easy editing of the code associated with a template. --scott

-- (http://cscott.net)

Chris Steipp

10:15 a.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

On Wed, Mar 26, 2014 at 9:44 AM, Daniel Friesen daniel@nadir-seen-fire.com wrote:

> How does a custom function jammed into the middle of a Mustache template fix the issue, when the issue is not that foo={{something}} doesn't escape but that quoting is needed instead of escaping, and Mustache isn't context sensitive, so neither Mustache nor a custom function knows that foo={{something}} is an attribute value in need of quoting?

Exactly. Additionally, a plain parameter like class, an href, a parameter that is inserted into a URL, and an id attribute all need different escaping strategies. So there would be many different "makeXXXXStringSafe" and probably "quoteAndMakeXXXXStringSafe" functions, and code review would have to make sure the right one was being used in the right place. Which means someone who is familiar with all of the XSS techniques would need to code review almost all the templates.

For comparison, using our current html templating (as much as it sucks):

$html = Html::element( 'div', array( 'class' => $anything ), $anythingElse );

The developer doesn't need to have any knowledge of what escaping needs to apply to the class attribute vs the text.
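For readers who don't know the PHP helper, the idea can be sketched in JavaScript roughly like this. The `element` function below is a hypothetical illustration loosely modelled on the call above, not MediaWiki's actual Html class: the caller passes a tag name, an attribute map, and text, and the function decides on quoting and escaping itself.

```javascript
// Same illustrative escaping as any string-based engine would use.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical builder: structure comes from arguments, never from
// concatenated markup, so quoting is always applied by the builder.
function element(tag, attrs, text) {
  if (!/^[a-zA-Z][a-zA-Z0-9]*$/.test(tag)) {
    throw new Error("invalid tag name: " + tag);
  }
  let html = "<" + tag;
  for (const [name, value] of Object.entries(attrs)) {
    // Attribute names are validated, attribute values are escaped and quoted.
    if (!/^[a-zA-Z][a-zA-Z0-9-]*$/.test(name)) {
      throw new Error("invalid attribute name: " + name);
    }
    html += " " + name + '="' + escapeHtml(value) + '"';
  }
  return html + ">" + escapeHtml(text) + "</" + tag + ">";
}

console.log(element("div", { class: "1234 onClick=doSomething()" }, "<b>hi</b>"));
// prints: <div class="1234 onClick=doSomething()">&lt;b&gt;hi&lt;/b&gt;</div>
```

This sketch ignores void elements and does nothing special for href, src, or style values, which need their own sanitization on top of quoting.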


Nuria Ruiz

10:30 a.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

> Additionally, how you escape a plain parameter like class vs. an href vs. a parameter that is inserted into a url vs. an id attribute are all different escaping strategies.

Urls in the template engine need to be handled on their own, sure. But what template engine does not work in this fashion? There are three separate "entities" you normally deal with when doing replacement: translations, urls and plain attributes.

> $html = Html::element( 'div', array( 'class' => $anything ), $anythingElse );

I see. Sorry, but where I disagree is that "quote me this replacement" is a lawful case for the template engine. The line above is doing a lot more than purely templating, and in my opinion it does little to separate data and markup, which is the very point of having a template engine.

But if you consider that one a lawful use case, you are right. The example I provided does not help you.


Chris Steipp

10:41 a.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

On Wed, Mar 26, 2014 at 10:30 AM, Nuria Ruiz nuria@wikimedia.org wrote:

> I see. Sorry but where I disagree is that the "quote me this replacement" is a lawful case for the template engine.

I'm not sure I understand what you're saying here. Do you mean makeStringSafe in your example shouldn't quote the text, but should instead remove space characters?


Nuria Ruiz

30 Mar

2:23 a.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

> <div class={{something}}></div> is vulnerable, if something is set to "1234 onClick=doSomething()"

> $html = Html::element( 'div', array( 'class' => $anything ), $anythingElse );

> I see. Sorry but where I disagree is that the "quote me this replacement" is a lawful case for the template engine.

> I'm not sure I understand what you're saying here. Do you mean makeStringSafe in your example shouldn't quote the text, but should instead remove space characters?

What I am saying is that the parsing and escaping scheme we need is much simpler if you disallow the use case of passing the template engine something that is not data.

Let me explain this, as it has more to do with correctness than with security per se. A template engine's objective is to separate data from markup. In your example you are passing the template 'class="anything"' or 'onclick="something"'; neither "class" nor "onclick" is data. Thus what I am arguing is that the template engine should not support "add any random attribute or javascript to my html element with the right set of quotes" as a "lawful" use case. The template engine should not be expected to parse and insert code, and "onclick" is code.

With a new template engine our main objective should be to separate data from markup, not to support a style of coding that mixes "onClick" attributes with HTML (which was prevalent years ago) or css classes with controller code.

In my experience, reducing the use cases for the template engine to just data handling, while having specific functions that deal with links and translations, simplifies the escaping problem greatly, as you do not need context-aware escaping. You can "js-escape" any piece of data sent to the engine because you know you do not support the use case of sending javascript to be inserted.


Tyler Romeo

4:03 a.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

Allow me to just put this out there: using handlebars or mustache or anything similar to it is a *terrible* idea for MediaWiki. {{this is escaped}} and {{{this is not escaped}}}. The only difference is an extra brace on each side, and considering how many developers here are also familiar with writing wikitext, the probability of an accidental security vulnerability would increase significantly.
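A toy renderer shows how small the difference is. This is a sketch written for this thread, not the mustache.js or Handlebars implementation; `render` and `escapeHtml` are hypothetical names, and only simple `{{name}}` and `{{{name}}}` lookups are handled.

```javascript
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Toy renderer: {{{key}}} inserts the value raw, {{key}} inserts it escaped.
// A single pass with one regex avoids re-scanning text that was just inserted.
function render(template, data) {
  const slot = /\{\{\{\s*(\w+)\s*\}\}\}|\{\{\s*(\w+)\s*\}\}/g;
  return template.replace(slot, (match, rawKey, escapedKey) =>
    rawKey !== undefined ? String(data[rawKey]) : escapeHtml(data[escapedKey])
  );
}

const data = { name: "<img src=x onerror=alert(1)>" };

console.log(render("<p>{{name}}</p>", data));
// prints: <p>&lt;img src=x onerror=alert(1)&gt;</p>

console.log(render("<p>{{{name}}}</p>", data));
// prints: <p><img src=x onerror=alert(1)></p>
```

One extra pair of braces in the template is the whole difference between the safe and the unsafe output.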

If we were to use a string-based templating engine (and looking at the progress of gwicke's work, it's more likely we'll go DOM-based), we'd want something that, at the very least, does not give the opportunity for a screwup like this.

-- Tyler Romeo, Stevens Institute of Technology, Class of 2016, Major in Computer Science


Gabriel Wicke

1 Apr

1:23 p.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

On 03/30/2014 02:23 AM, Nuria Ruiz wrote:

> What I am saying is that the parsing and escaping scheme we need is much simpler if you disallow the use case of passing the template engine something that is not data.
>
> Let me explain this, as it has more to do with correctness than with security per se: a template engine's objective is to separate data from markup. In your example you are passing the template 'class="anything"' or 'onclick="something"'; neither "class" nor "onclick" is data.

The example might not have been the most helpful one. Consider a handlebars template like this:

<a href="{{url}}">{{title}}</a>

Even with double-stashes you'll be in trouble if your url data happens to be 'javascript:alert(cookie)'. For this you need special and ideally automatic sanitization for href attributes (and src & style), which is what KnockOff/TAssembly provides.
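As a rough illustration of what href sanitization could look like (a sketch only, and not a description of how KnockOff/TAssembly actually does it), the code below checks the URL scheme against an allow-list before the value is escaped. The names `SAFE_SCHEMES`, `sanitizeHref` and `renderLink`, and the choice of allowed schemes, are assumptions made for this example.

```javascript
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Assumed allow-list; a real engine may permit other schemes.
const SAFE_SCHEMES = ["http:", "https:", "mailto:"];

function sanitizeHref(url) {
  const value = String(url).trim();
  // Browsers ignore whitespace and control characters inside a scheme,
  // so strip them before looking for one.
  const compact = value.replace(/[\u0000-\u0020]/g, "");
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*:)/.exec(compact);
  if (match === null) {
    return value; // no scheme: treat as a relative URL
  }
  return SAFE_SCHEMES.includes(match[1].toLowerCase()) ? value : "#";
}

// Mimics <a href="{{url}}">{{title}}</a> with the href slot sanitized.
function renderLink(url, title) {
  return '<a href="' + escapeHtml(sanitizeHref(url)) + '">' + escapeHtml(title) + "</a>";
}

console.log(renderLink("javascript:alert(cookie)", "click me"));
// prints: <a href="#">click me</a>
console.log(renderLink("/wiki/Main_Page", "Main Page"));
// prints: <a href="/wiki/Main_Page">Main Page</a>
```

HTML escaping alone leaves 'javascript:alert(cookie)' untouched, because it contains no HTML metacharacters; the scheme check is what stops it.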

Gabriel

Nuria Ruiz

2 Apr

4:33 a.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

> The example might not have been the most helpful one. Consider a handlebars template like this: <a href="{{url}}">{{title}}</a>

True, much better example to state the point. Now, as I think I mentioned earlier there are two cases that need to be treated differently than anything else: links and translations/localizations.

In this case I wouldn't want the url (or translation) to be plainly parsed. Rather I would do:

<a href="{{urlBuilder p1=param1 p2=param2}}">{{title}}</a>

Where urlBuilder is a user defined function that decides on "lawful" input and output scheme.

This would work just the same for translations {{translateGender maleTranslation femaleTranslation name=param}} where translateGender is also defined by us.

But these are basically the only two schemes you need to treat differently; the context in these two cases is very precise and thus much more manageable.
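A urlBuilder of that kind might look roughly like the sketch below. Everything here is an assumption for illustration: the `BASE_PATH` value, the decision to accept only query parameters, and the function body itself. The point is only that the helper, not the template, decides what a "lawful" URL is.

```javascript
// Hypothetical fixed base path; the template can never change the scheme
// or host because they do not come from data at all.
const BASE_PATH = "/w/index.php";

// Builds a URL from named parameters only. Every key and value is
// percent-encoded, so data cannot break out of the query string.
function urlBuilder(params) {
  const query = Object.keys(params)
    .sort()
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(String(params[key])))
    .join("&");
  return query === "" ? BASE_PATH : BASE_PATH + "?" + query;
}

console.log(urlBuilder({ p1: "javascript:alert(cookie)", p2: "a b" }));
// prints: /w/index.php?p1=javascript%3Aalert(cookie)&p2=a%20b
```

The result would still need to go through the engine's normal HTML escaping when it is placed in the href attribute, since it can contain characters such as & and '.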

> For this you need special and ideally automatic sanitization for href attributes (and src & style), which is what KnockOff/TAssembly provides.

Sure, that works just as well. But overall it is a pretty similar solution to having a url builder function executed from the template engine, with the drawback that it is less performant. I know you guys are set on the DOM-based engine, but maybe it is worth thinking about how to fit client-side translations into that scheme, as translations bring their own escaping problems (if you have done so, please disregard).

My bigger point was to highlight that with a string concatenation engine you can satisfy security concerns and also have a template engine that performs really well, if you respect the data and markup separation.


C. Scott Ananian

6:28 a.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

Are you calling handlebars a "string concatenation engine"? In the spacebars implementation (and my/gwicke's prototype) it is a structured DOM engine. Don't confuse surface syntax with implementation. --scott On Apr 2, 2014 7:33 AM, "Nuria Ruiz" nuria@wikimedia.org wrote:

...

...
...
The example might not have been the most helpful one. Consider a

handlebars

...
...
template like this: <a href="{{url}}">{{title}}</a>

True, much better example to state the point. Now, as I think I mentioned earlier there are two cases that need to be treated differently than anything else: links and translations/localizations.

In this case I wouldn't want the url (or translation) to be plainly parsed. Rather I would do:

<a href="{{urlBuilder p1=param1 p2=param2}}">{{title}}</a>

Where urlBuilder is a user defined function that decides on "lawful" input and output scheme.

This would work just the same for translations {{translateGender maleTranslation femaleTranslation name=param}} where translateGender is also defined by us.

But these are basically the two only schemes you need to treat differently, the context in these two cases is very precise and thus much more manageable.

...
...
For this you need special and ideally automatic sanitization for href

attributes (and src & style), which is >>what KnockOff/TAssembly provides. Sure, that works just as well. But overall is a pretty similar solution to having a url builder function executed from the template engine with the drawback that is less performant. I know you guys are set on the DOM based engine but maybe it is worth thinking how to fit client side translations on that scheme as translations bring their own escaping problems (if you have done so please disregard).

My bigger point was to highlite that with a string concatenation engine you can satisfy security concerns plus have a template engine that performs really well if you respect the data and markup separation.

On Tue, Apr 1, 2014 at 10:23 PM, Gabriel Wicke gwicke@wikimedia.org wrote:

...
On 03/30/2014 02:23 AM, Nuria Ruiz wrote:

...
What I am saying is that the parsing and escaping scheme we need is much simpler if you disallow the use case of passing the template engine something that is not data.

Let me explain this, as it has to do more with correctness than with security per se: A template engine's objective is to separate data from markup. In your example you are passing the template 'class="anything"' or 'onclick="something"'; neither "class" nor "onclick" are data.

The example might not have been the most helpful one. Consider a handlebars template like this:

<a href="{{url}}">{{title}}</a>

Even with double-stashes you'll be in trouble if your url data happens to be 'javascript:alert(cookie)'. For this you need special and ideally automatic sanitization for href attributes (and src & style), which is what KnockOff/TAssembly provides.

Gabriel

Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Nuria Ruiz

7:16 a.m.

New subject: HTML templating systems & MediaWiki - is this summary right?

...

...
...
...
Are you calling handlebars a "string concatenation engine"?

I think I just invented this "string concatenation engine" expression but ...yes. Handlebars uses a lexical parser but once templates are compiled parameter substitution is done concatenating strings. I was referring to this second step.
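
To make that "second step" concrete, here is a hand-written sketch of what substitution by string concatenation looks like. This is not the code Handlebars actually generates; compile, escapeHtml and the placeholder syntax handling are simplified assumptions for illustration.

```javascript
// Escape the characters that are significant in HTML text and quoted attributes.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// "Compile" once: split the template into literal chunks and placeholder names.
function compile(template) {
  // With a capturing group, split() yields [literal, name, literal, name, ..., literal].
  const parts = template.split(/\{\{\s*(\w+)\s*\}\}/);
  return function render(data) {
    let out = '';
    for (let i = 0; i < parts.length; i++) {
      if (i % 2 === 0) {
        out += parts[i]; // literal markup, copied as-is
      } else {
        const value = data[parts[i]];
        out += escapeHtml(value == null ? '' : value); // data, escaped then concatenated
      }
    }
    return out;
  };
}

const render = compile('<div> Name {{name}} </div>');
console.log(render({ name: '<script>' }));
// → <div> Name &lt;script&gt; </div>
```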

Example Template:

<h1>Handlebars JS Example</h1>
<script id="some-template" type="text/x-handlebars-template">
<table> <div> Name {{name}} </div>
</script>

Template once compiled:

On Wed, Apr 2, 2014 at 3:28 PM, C. Scott Ananian cananian@wikimedia.org wrote:

...

Are you calling handlebars a "string concatenation engine"? In the spacebars implementation (and my/gwicke's prototype) it is a structured DOM engine. Don't confuse surface syntax with implementation.
--scott

On Apr 2, 2014 7:33 AM, "Nuria Ruiz" nuria@wikimedia.org wrote:

...

Gabriel Wicke

9:19 a.m.


On 04/02/2014 04:33 AM, Nuria Ruiz wrote:

...

My bigger point was to highlight that with a string concatenation engine you can satisfy security concerns plus have a template engine that performs really well if you respect the data and markup separation.

The runtime is string-based for performance in both cases. That's what makes TAssembly so fast [2]. The difference is that the DOM-based KnockOff compiler systematically enforces DOM balancing and attribute sanitization, while without such a compiler you have to do so manually.

Gabriel

[1]: https://www.mediawiki.org/wiki/Talk:Requests_for_comment/HTML_templating_lib...
[2]: https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library#...

Chris Steipp

26 Mar

10:55 a.m.


On Wed, Mar 26, 2014 at 10:30 AM, Nuria Ruiz nuria@wikimedia.org wrote:

...

...
Additionally, how you escape a plain parameter like class vs. an href vs. a parameter that is inserted into a url vs. an id attribute are all different escaping strategies.

Urls in the template engine need to be handled on their own, sure. But what template engine does not work in this fashion? There are three separate "entities" you normally deal with when doing replacement: translations, urls and plain attributes.

When looking at a typical web page, you need several escaping strategies. OWASP roughly groups them into html body, plain attributes, URL context, Javascript context, CSS context. My point was that you need several MakeWhateverSafe functions, and have to use them in the right context. So that is a long way of saying I disagree with you when you said that this could be automated without some process having knowledge of the html context and verifying the right escaping is being applied.
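
A small sketch of why one escaping function is not enough: the same piece of data needs a different treatment in each context. The three function names below are illustrative assumptions, and only three of the contexts listed above are covered (JavaScript and CSS contexts are left out).

```javascript
// HTML body context: neutralise markup characters.
function escapeHtmlBody(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Plain attribute context: also neutralise quotes. The result is only safe
// when the template places it inside a double-quoted attribute value.
function escapeAttribute(value) {
  return escapeHtmlBody(value)
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// URL parameter context: percent-encode so the value cannot add parameters.
function escapeUrlComponent(value) {
  return encodeURIComponent(String(value));
}

console.log(escapeHtmlBody('<b> & "x"'));       // → &lt;b&gt; &amp; "x"
console.log(escapeAttribute('" onclick="x'));   // → &quot; onclick=&quot;x
console.log(escapeUrlComponent('a b&c=d'));     // → a%20b%26c%3Dd
```

Nothing in these functions knows where the output will land, which is the point being made: something (a reviewer or a context-aware compiler) still has to pick the right one for each placeholder.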

...

...
$html = Html::element( 'div', array( 'class' => $anything ), $anythingElse );

I see. Sorry, but where I disagree is that the "quote me this replacement" is a lawful case for the template engine. The line above is doing a lot more than purely templating and in my opinion it does little to separate data and markup, which is the very point of having a template engine.

But if you consider that one a lawful use case, you are right. The example I provided does not help you.

On Wed, Mar 26, 2014 at 6:15 PM, Chris Steipp csteipp@wikimedia.org wrote:

...
On Wed, Mar 26, 2014 at 9:44 AM, Daniel Friesen daniel@nadir-seen-fire.com wrote:

...
On 2014-03-26, 9:32 AM, Nuria Ruiz wrote:

...
The issue is that they apply the same escaping, regardless of the html context. So, in Twig and mustache, <div class={{something}}></div> is vulnerable, if something is set to "1234 onClick=doSomething()".

Right, the engine would render:

<div class=1234 onClick=doSomething()> </div>

because it only escapes HTML by default. Now, note that the problem can be fixed with <div class={{makeStringSafe something}}>

Where "makeStringSafe" is a function defined by us and executed there that escapes to our liking.

How does a custom function jammed into the middle of a Mustache template fix the issue when the issue is not that foo={{something}} doesn't escape, but is that quoting is needed instead of escaping, and Mustache isn't context sensitive so neither Mustache nor a custom function know that foo={{something}} is an attribute value in need of quoting?

Exactly. Additionally, how you escape a plain parameter like class vs. an href vs. a parameter that is inserted into a url vs. an id attribute are all different escaping strategies. So there would be many different "makeXXXXStringSafe" and probably "quoteAndMakeXXXXStringSafe" functions, and code review would have to make sure the right one was being used in the right place. Which means someone who is familiar with all of the xss techniques would need to code review almost all the templates.

For comparison, using our current html templating (as much as it sucks):

$html = Html::element( 'div', array( 'class' => $anything ), $anythingElse );

The developer doesn't need to have any knowledge of what escaping needs to apply to the class attribute vs the text.

...
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]

Gabriel Wicke

11:56 a.m.


On 03/26/2014 10:55 AM, Chris Steipp wrote:

...

On Wed, Mar 26, 2014 at 10:30 AM, Nuria Ruiz nuria@wikimedia.org wrote:

...
...
Additionally, how you escape a plain parameter like class vs. an href vs. a parameter that is inserted into a url vs. an id attribute are all different escaping strategies.

Urls in the template engine need to be handled on their own, sure. But what template engine does not work in this fashion? There are three separate "entities" you normally deal with when doing replacement: translations, urls and plain attributes.

When looking at a typical web page, you need several escaping strategies. OWASP roughly groups them into html body, plain attributes, URL context, Javascript context, CSS context. My point was that you need several MakeWhateverSafe functions, and have to use them in the right context. So that is a long way of saying I disagree with you when you said that this could be automated without some process having knowledge of the html context and verifying the right escaping is being applied.

When compiling from DOM to the TAssembly JSON IR we encode the attribute context in the 'attr' binding. While executing this binding TAssembly automatically escapes href / src and style attributes using the same sanitization logic as used in Parsoid, which in turn is a direct port of MediaWiki's Sanitizer.php. Despite offering this level of security support it is the fastest library in our benchmarks.
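
The general idea of carrying the attribute context into the runtime can be sketched as follows. This is a simplified, hypothetical IR and renderer written for this example; it is not TAssembly's actual JSON format or API, the SAFE_URL allowlist is an assumption, and style sanitization is omitted.

```javascript
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Attributes whose values are URLs and therefore need scheme checking.
const URL_ATTRIBUTES = new Set(['href', 'src']);
// Assumed allowlist: http(s), mailto, site-relative paths and fragments.
const SAFE_URL = /^(?:https?:|mailto:|\/|#)/i;

// Returns the cleaned value, or null if the attribute should be dropped.
function sanitizeAttribute(name, value) {
  const text = String(value).trim();
  if (URL_ATTRIBUTES.has(name) && !SAFE_URL.test(text)) {
    return null;
  }
  return text;
}

// The IR is a flat list: strings are literal markup, objects are bindings.
// Because the compiler emitted an 'attr' binding, the runtime knows the
// attribute name and can choose the sanitizer without help from the author.
function render(ir, model) {
  let out = '';
  for (const node of ir) {
    if (typeof node === 'string') {
      out += node;
    } else if (node.text) {
      out += escapeHtml(model[node.text]);
    } else if (node.attr) {
      for (const [name, key] of Object.entries(node.attr)) {
        const value = sanitizeAttribute(name, model[key]);
        if (value !== null) {
          out += ' ' + name + '="' + escapeHtml(value) + '"';
        }
      }
    }
  }
  return out;
}

// Hypothetical IR for: <a href="{{url}}">{{title}}</a>
const ir = ['<a', { attr: { href: 'url' } }, '>', { text: 'title' }, '</a>'];

console.log(render(ir, { url: 'https://example.org/?a=1&b=2', title: 'Home' }));
// → <a href="https://example.org/?a=1&amp;b=2">Home</a>
console.log(render(ir, { url: 'javascript:alert(cookie)', title: 'x<y' }));
// → <a>x&lt;y</a>
```

Note that the runtime is still plain string concatenation; the context knowledge comes from the compile step that produced the IR.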

KnockOff is compiling to TAssembly from KnockoutJS syntax, but other front-end syntaxes are possible like cscott's Spacebars to TAssembly compiler. We chose the KnockoutJS syntax primarily for its ease of implementation (the expression grammar is 70 lines, and DOM parsing is readily available). It also supports powerful and general parameter passing which is useful for things like i18n, and has good potential support for server-side pre-expansion followed by client-side updates by virtue of its attribute syntax.

Last night Matt was back working on the PHP port of TAssembly. We'll probably have an update on this in the next few days.

Gabriel

Nuria Ruiz

10:15 a.m.


...

How does a custom function jammed into the middle of a Mustache template fix the issue when the issue is not that foo={{something}} doesn't escape, but is that quoting is needed instead of escaping, and Mustache isn't context sensitive so neither Mustache or a custom function know that foo={{something}} is an attribute value in need of quoting?

Sorry, but I think you might have misunderstood Chris' example. Attributes should not need any quoting; that is not a real use case. Placeholders are replaced by attributes that might be extra-escaped, but in any case the template engine should not infer anything as to the content being replaced.

The expected outcome after substitution should be: <div class=some-escaped-text> </div>


Daniel Friesen

10:28 a.m.


On 2014-03-26, 10:15 AM, Nuria Ruiz wrote:

...

...
How does a custom function jammed into the middle of a Mustache template fix the issue when the issue is not that foo={{something}} doesn't escape, but is that quoting is needed instead of escaping, and Mustache isn't context sensitive so neither Mustache or a custom function know that foo={{something}} is an attribute value in need of quoting?

Sorry, but I think you might have misunderstood Chris' example. Attributes should not need any quoting; that is not a real use case. Placeholders are replaced by attributes that might be extra-escaped, but in any case the template engine should not infer anything as to the content being replaced.

The expected outcome after substitution should be: <div class=some-escaped-text> </div>

And Chris explained that if 'something' was 'some-text onclick=doSomething()' instead of 'some-text' then instead of:

The template engine would output:

Creating an XSS vector.
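
The point about quoting versus escaping can be checked with a few lines of code. This is a sketch under the assumption that the engine applies ordinary HTML entity escaping; escapeHtml here is a stand-in, not any particular engine's implementation.

```javascript
// Stand-in for the default HTML escaping a string-based engine applies.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

const something = 'some-text onclick=doSomething()';

// Unquoted attribute: the value contains none of & < > " ' so escaping
// changes nothing, and the space starts a second attribute.
const unquoted = '<div class=' + escapeHtml(something) + '></div>';
console.log(unquoted);
// → <div class=some-text onclick=doSomething()></div>

// Quoted attribute: the same escaped value stays inside the class attribute.
const quoted = '<div class="' + escapeHtml(something) + '"></div>';
console.log(quoted);
// → <div class="some-text onclick=doSomething()"></div>
```

The escaping function is identical in both cases; only the template author's choice to quote the attribute makes the difference, which a context-unaware engine cannot enforce.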

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]

Gabriel Wicke

19 Mar

6:52 p.m.


We made some good progress on KnockOff [1,2] recently. It is currently the fastest library in our micro benchmarks [3] despite having a DOM-based compiler with the associated security advantages. Matt has started work on the PHP port before going on vacation, but I expect that we'll have a PHP runtime next week as well. The runtime code is still small at 337 lines.

Cheers,

Gabriel

[1]: https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library/...
[2]: https://github.com/gwicke/knockoff
[3]: https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library#...

Sumana Harihareswara

1 Apr

10:55 a.m.


On 03/19/2014 09:52 PM, Gabriel Wicke wrote:

...

We made some good progress on KnockOff [1,2] recently. It is currently the fastest library in our micro benchmarks [3] despite having a DOM-based compiler with the associated security advantages. Matt has started work on the PHP port before going on vacation, but I expect that we'll have a PHP runtime next week as well. The runtime code is still small at 337 lines.

Cheers,

Gabriel

Gabriel, Matt - is the PHP runtime ready? Want to talk about it in this week's RfC meeting?

-- Sumana Harihareswara Senior Technical Writer Wikimedia Foundation

Gabriel Wicke

1:33 p.m.


On 04/01/2014 10:55 AM, Sumana Harihareswara wrote:

...

Gabriel, Matt - is the PHP runtime ready?

At this point it supports only a part of the TAssembly spec: https://github.com/mattofak/knockoff

Blame other stuff getting in the way. Matt or I should find some time to knock out (ha!) the remaining bits this week.

...

Want to talk about it in this week's RfC meeting?

I don't see a huge case for discussing this in the RFC meeting. It's mostly about the implementation at this point, and IMO code review and pull requests are a better place to discuss that. We'll post benchmark results when we have them.

Do you see anything that you feel would be better discussed in an RFC review?

Gabriel

Rob Lanphier

3:49 p.m.


On Tue, Apr 1, 2014 at 1:33 PM, Gabriel Wicke gwicke@wikimedia.org wrote:

...

I don't see a huge case for discussing this in the RFC meeting. It's mostly about the implementation at this point, and IMO code review and pull requests are a better place to discuss that. We'll post benchmark results when we have them.

Do you see anything that you feel would be better discussed in an RFC review?

I'm eager to get some closure on the overall RFC about HTML templating myself. Am I right to assume that the process is:

1. Get Knockoff complete enough that we can fairly evaluate it against the other proposals
2. Reopen the conversation about various alternatives
3. Pick something

Correct?

Rob

Gabriel Wicke

4:10 p.m.


On 04/01/2014 03:49 PM, Rob Lanphier wrote:

...

I'm eager to get some closure on the overall RFC about HTML templating myself. Am I right to assume that the process is:

1. Get Knockoff complete enough that we can fairly evaluate it against the other proposals
2. Reopen the conversation about various alternatives
3. Pick something

Yup, that's pretty much it.

Gabriel

Ryan Kaldari

5:09 p.m.


The mobile web team will be evaluating Gabriel's KnockoutJS template implementation sometime between April 14 and April 28. The things we will be looking at include: how well it will work for mobile's current templating needs, how appropriate it is for mobile delivery, and how much effort would be involved in migrating our existing templates to it. We'll update the list with our findings then.

Ryan Kaldari


C. Scott Ananian

4:11 p.m.


I think there's some room to experiment with various options as extensions, in beta, etc. The "pick something" stage is going to have to weigh a lot of competing objectives -- for UI building, for content templates, for scripting, visual editor support, etc. I don't see us getting to the "pick something" stage until a number of different implementations have been prototyped in different contexts. --scott

ps. Tyler: please play with the 'spacebars' implementation in https://www.meteor.com/ ; escaping can really be transparently correct if your implementation works with structured HTML, not just banging together raw strings.

Sumana Harihareswara

2 Apr

11:09 a.m.


TL;DR: who's testing out the non-Knockout approaches?

Much thanks to everyone who corrected & enlightened me! I was especially grateful to Chris Steipp's explanation:

...

When looking at a typical web page, you need several escaping strategies. OWASP roughly groups them into html body, plain attributes, URL context, Javascript context, CSS context. My point was that you need several MakeWhateverSafe functions, and have to use them in the right context.

and Peter Kaminski's reminder of the "separation of concerns" concept, and C. Scott's explanation of how we could balance usability for human editors against easy integration with VE and executability http://www.gossamer-threads.com/lists/wiki/wikitech/445152#445152 . Thanks also to Nuria Ruiz and others for the corrections on security stuff!

Now that we know the mobile team is checking out the KnockoutJS idea in late April and reporting back to the list in late April/early May, I'm wondering about something C. Scott said:

On 04/01/2014 07:11 PM, C. Scott Ananian wrote:

...

I think there's some room to experiment with various options as extensions, in beta, etc. The "pick something" stage is going to have to weigh a lot of competing objectives -- for UI building, for content templates, for scripting, visual editor support, etc. I don't see us getting to the "pick something" stage until a number of different implementations have been prototyped in different contexts. --scott

That makes sense. So who is testing out/reporting about the other approaches or implementations? To repeat my question from March 18th:

oojs - https://www.mediawiki.org/wiki/OOjs_UI -- could use this toolkit with one of the other template approaches, or maybe this is enough by itself! Currently used inside VisualEditor and I am not sure whether any other MediaWiki extensions or teams are using it? Should we try using this in another team?

-- Sumana Harihareswara Senior Technical Writer Wikimedia Foundation

C. Scott Ananian

12:10 p.m.


@Nuria: you seem to have missed my point, which is that http://handlebarsjs.com/ is just one implementation of the design. Other implementations exist -- 28 of them are listed at http://mustache.github.io/ alone -- and some of these offer a more principled approach to escaping.

In particular, I keep mentioning the [spacebars] implementation, which is based on [htmljs] and is a structured template system, without the need for any explicit escaping. --scott

[spacebars]: http://meteorhacks.com/meteor-weekly-spacebars-tired-meteorite-autoupdate.ht...
[htmljs]: https://github.com/meteor/meteor/tree/devel/packages/htmljs

C. Scott Ananian

12:17 p.m.


On Wed, Apr 2, 2014 at 2:09 PM, Sumana Harihareswara sumanah@wikimedia.org wrote:

...

TL;DR: who's testing out the non-Knockout approaches?

I have a design in my head that separates code and presentation in a user-facing template system and is based on the Scribunto codebase. The idea would be to combine Handlebars' very simple user-facing syntax with Scribunto's code editor for the non-presentational aspects. Every template would have a (possibly-empty) piece of code associated with it. Code is edited with a code editor; the presentation is edited with the visual editor.

That said, this is strictly a weekend/airplane trip project for me at the moment, and I keep getting distracted by hacking on Scribunto/JS and v8js and node, all of which are (strictly speaking) not necessary for the basic idea, but are fun to hack on.

The last time I spent a chunk of time on this, I added basic handlebars support to gwicke's TAssembly prototype, which really is very very fast. --scott

-- (http://cscott.net)

Gabriel Wicke

1:05 p.m.


On 04/02/2014 11:09 AM, Sumana Harihareswara wrote:

...

TL;DR: who's testing out the non-Knockout approaches?

Lots of teams have used a variety of template libraries, see the RFC [1]. I know that for example handlebars is used in a few teams right now.

Gabriel

[1]: https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library#...

S Page

7:19 p.m.

On Wed, Apr 2, 2014 at 11:09 AM, Sumana Harihareswara <sumanah@wikimedia.org> wrote:

...

TL;DR: who's testing out the non-Knockout approaches?

Besides those listed at [1], the Flow discussion system needs to render templates on both the client and server [2]. The Flow team is going to use handlebars.js and its lightncandy PHP implementation; we wanted to try KnockOff/TAssembly but the timing isn't right. We will be ripping off :) MobileFrontend's integration of Hogan.js client-side templates.

(Gabriel Wicke wrote "I know that for example handlebars is used in a few teams right now." -- who else?)

...

oojs - https://www.mediawiki.org/wiki/OOjs_UI -- could use this toolkit with one of the other template approaches, or maybe this is enough by itself!

As I understand it, OOjs UI is more a rich widget library rather than a templating system. You would compose a page out of widgets that render what you want, and yes you could use OOjs UI with a templating engine (it operates on jQuery elements).

...

Currently used inside VisualEditor and I am not sure whether any other MediaWiki extensions or teams are using it?

The Multimedia team is using OOjs UI for the "About this file" dialog in the Media Viewer[3] (currently a beta feature). They haven't styled it to use Agora controls.

Mobile is using VisualEditor with the beginnings of an Agora theme.

Hope this helps, corrections welcome.

[1] https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library#...
[2] https://www.mediawiki.org/wiki/Flow/Epic_Front-End#Templating
[3] https://www.mediawiki.org/wiki/Multimedia/About_Media_Viewer

-- =S Page Features engineer on the Flow team

Sumana Harihareswara

13 May

1:42 p.m.


On 04/02/2014 10:19 PM, S Page wrote:

...

On Wed, Apr 2, 2014 at 11:09 AM, Sumana Harihareswara <sumanah@wikimedia.org> wrote:

...
TL;DR: who's testing out the non-Knockout approaches?

Besides those listed at [1] The Flow discussion system needs to render templates on both the client and server[2]. The Flow team is going to use handlebars.js and its lightncandy PHP implementation; we wanted to try KnockOff/TAssembly but the timing isn't right. We will be ripping off :) MobileFrontend's integration of Hogan.js client-side templates.

(Gabriel Wicke wrote "I know that for example handlebars is used in a few teams right now." -- who else?)

...
oojs - https://www.mediawiki.org/wiki/OOjs_UI -- could use this toolkit with one of the other template approaches, or maybe this is enough by itself!

As I understand it, OOjs UI is more a rich widget library rather than a templating system. You would compose a page out of widgets that render what you want, and yes you could use OOjs UI with a templating engine (it operates on jQuery elements).

...
Currently used inside VisualEditor and I am not sure whether any other MediaWiki extensions or teams are using it?

The Multimedia team is using OOjs UI for the "About this file" dialog in the Media Viewer[3] (currently a beta feature). They haven't styled it to use Agora controls.

Mobile is using VisualEditor with the beginnings of an Agora theme.

Hope this helps, corrections welcome.

[1] https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library#...
[2] https://www.mediawiki.org/wiki/Flow/Epic_Front-End#Templating
[3] https://www.mediawiki.org/wiki/Multimedia/About_Media_Viewer

And Ryan Kaldari wrote on April 1:

...

The mobile web team will be evaluating Gabriel's KnockoutJS template implementation sometime between April 14 and April 28. The things we will be looking at include: how well it will work for mobile's current templating needs, how appropriate it is for mobile delivery, and how much effort would be involved in migrating our existing templates to it. We'll update the list with our findings then.

Do the Flow or Mobile teams have any updates on how well their experiments worked? Thanks!

-- Sumana Harihareswara Senior Technical Writer Wikimedia Foundation

Ryan Kaldari

2:23 p.m.

On Tue, May 13, 2014 at 1:42 PM, Sumana Harihareswara <sumanah@wikimedia.org> wrote:

...

And Ryan Kaldari wrote on April 1:

...
The mobile web team will be evaluating Gabriel's KnockoutJS template implementation sometime between April 14 and April 28. The things we will be looking at include: how well it will work for mobile's current templating needs, how appropriate it is for mobile delivery, and how much effort would be involved in migrating our existing templates to it. We'll update the list with our findings then.

Do the Flow or Mobile teams have any updates on how well their experiments worked? Thanks!

Unfortunately, that card got moved to the backlog due to time constraints and higher priorities, so we have not yet evaluated the KnockoutJS template implementation. We are continuing to use hogan/handlebars in the meantime.

Ryan Kaldari

Nuria Ruiz

26 Mar

3:36 a.m.


Sumana,

Sorry for my late reply but since you asked for corrections, here are a couple.

...

Mustache.js is a popular modern choice.

Not really, mustache has many shortcomings that prevent it from being a popular choice, among them the lack of a server side compiler and of if/else constructs. Handlebars is a lot more popular. There is also Twitter's flavor of a string based engine: http://twitter.github.io/hogan.js/

...

One approach treats HTML as a string ("here's a bunch of bytes to interpolate"). From a security perspective, this is dangerously easy to have vulnerabilities in, because you just naively insert strings.

This is not correct. String based systems escape the strings they are interpolating by default. Take a look at the escaping in what is the most popular string-based engine, handlebars: https://github.com/wycats/handlebars.js/

On Wed, Mar 19, 2014 at 4:27 AM, Sumana Harihareswara <sumanah@wikimedia.org

...

wrote:

...

I'm trying to understand what our current situation is and what our choices are around HTML templating systems and MediaWiki, so I'm gonna note what I think I understand so far in this mail and then would love for people to correct me. TL;DR - did we already consense on a templating system and I just missed it?

Description: An HTML templates system (also known as a templating engine) lets you (the programmer) write something that looks more like a document than it looks like code, then has hooks/entry points/macro substitution points (for user input and whatnot) that then invoke code, then emits finished HTML for the browser to render.

Examples: PHP itself is kinda a templating language. In the PHP world, Smarty is a somewhat more mature/old-school choice. Mustache.js is a popular modern choice. And in other languages, you'd pick a lot of the MVC frameworks that are popular, e.g. Django or Jinja in Python.

Spectrum of approaches: One approach treats HTML as a string ("here's a bunch of bytes to interpolate"). From a security perspective, this is dangerously easy to have vulnerabilities in, because you just naively insert strings. Then on the other end of the spectrum, you have code that always keeps the document object model (DOM) in memory, so the programmer is abstractly manipulating that data model and passing around an object. Sure, it spits out HTML in the end, but inherent in the method for turning those objects into HTML is a sanitization step, so that's inherently more secure. There's some discussion at https://www.mediawiki.org/wiki/Parsoid/Round-trip_testing/Templates . I presume we want the latter, but that the former model is more performant?

We talked about this stuff in https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2014-02-21 and https://www.mediawiki.org/wiki/Talk:Architecture_Summit_2014/HTML_templating... . Based on that plus https://www.mediawiki.org/wiki/Architecture_Summit_2014/RFC_clusters#HTML_te... it seems like we are supposed to get consensus on which system(s) to use, and we kind of have four things we could choose:

oojs - https://www.mediawiki.org/wiki/OOjs_UI -- could use this toolkit with one of the template approaches below, or maybe this is enough by itself! Currently used inside VisualEditor and I am not sure whether any other MediaWiki extensions or teams are using it? This is a DOM-based templating system.

Template approaches which are competing?:

MVC framework - Wikia has written their own templating library that Wikia uses (Nirvana). Owen Davis is talking about this tomorrow in the RFC review meeting. https://www.mediawiki.org/wiki/Requests_for_comment/MVC_framework

mustache.js stuff - Ryan Kaldari and Chris Steipp mentioned this I think?

Knockout-compatible implementation in Node.js & PHP https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library/... and https://www.mediawiki.org/wiki/Requests_for_comment/HTML_templating_library/... , being worked on by Gabriel Wicke, Matt Walker, and others. DOM-based.

There's also an OutputPage refactor suggested in https://www.mediawiki.org/wiki/Requests_for_comment/OutputPage_refactor that's part of the HTML Templating RFC Cluster https://www.mediawiki.org/wiki/Architecture_Summit_2014/RFC_clusters#HTML_te... .

I guess my biggest question right now is whether I have all the big moving parts right in my summary above. Thanks.

-- Sumana Harihareswara Senior Technical Writer Wikimedia Foundation

Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l

