On Wed, Nov 11, 2015 at 8:15 PM, Gergo Tisza <gtisza(a)wikimedia.org> wrote:
On Tue, Nov 10, 2015 at 1:40 PM, C. Scott Ananian <cananian(a)wikimedia.org> wrote:
2. `{{#balance:inline}}` would only allow inline (i.e. phrasing) content
and generate an error if a
`<p>`/`<a>`/`<h*>`/`<table>`/`<tr>`/`<td>`/`<th>`/`<li>` tag is seen in the
content.
Why is <a> in that list? It's not flow/block content model and filtering it
out would severely restrict the usefulness of inline templates.
That's a good point. The problem is nested <a> tags. So we can either ban
open <a> tags from the context, or ban <a> tags from the content. Or split
things and have {{#balance:link}} vs {{#balance:inline}} or something like
that. Feedback welcome! The details of HTML parsing are hairy, and I
wouldn't be surprised if we need slight tweaks to things when we actually
get to the point of implementation.
To be extra specific: if the context is:
<p>hello, there <a>friend <!-- template goes here --></p>
and the template is "foo <a>bar</a>", then HTML5 parsing will
produce:
<p>hello, there <a>friend foo </a><a>bar</a></p>
where the added closing </a> tag inside the template body prevents "simple
substitution" of the template contents.
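To illustrate the "ban <a> tags from the content" option, here is a minimal sketch in Python using only the standard library's html.parser. It scans a template body and reports any start tag from the forbidden list quoted above. The names FORBIDDEN_INLINE, InlineChecker and inline_violations are hypothetical, expanding `<h*>` to h1..h6 is my assumption, and html.parser is only a tokenizer-level scanner, not an HTML5 tree builder, so this is not how the Sanitizer would necessarily do it.

```python
from html.parser import HTMLParser

# Tags the proposal lists as errors inside {{#balance:inline}} content.
# Expanding "h*" to h1..h6 is an assumption made for this sketch.
FORBIDDEN_INLINE = {"p", "a", "table", "tr", "td", "th", "li"} | {
    f"h{i}" for i in range(1, 7)
}


class InlineChecker(HTMLParser):
    """Collects forbidden start tags seen in a template body (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this hook.
        if tag in FORBIDDEN_INLINE:
            self.violations.append(tag)


def inline_violations(template_body):
    """Return the forbidden tags found in template_body, in document order."""
    checker = InlineChecker()
    checker.feed(template_body)
    checker.close()
    return checker.violations
```

With the template from the example, inline_violations("foo <a>bar</a>") reports the `<a>` tag, which is exactly the case that breaks simple substitution when an `<a>` is already open in the context.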
- We just need to emit synthetic `</p></table></...>` tokens, the tree
builder will take care of closing a tag if necessary or else discarding
the token.
What if there are multiple levels of unclosed tags?
We basically emit enough close tags to close anything which might be
open, and let the tidy phase discard any which are not applicable.
Off-hand, I think the only tag where nesting would be an issue would be
<table>. So I guess we'll need the Sanitizer to count the open <table>
tags so we can be sure to emit enough close tags. Tricky!
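A rough sketch of that counting idea, in Python with the standard library's html.parser: track how many `<table>` tags are still open at the end of the content and emit that many synthetic `</table>` tags. The names OpenTableCounter and table_closers are hypothetical, and the counting is a deliberate simplification. It only matches start and end tags by name and ignores the HTML5 tree builder's implicit closing rules, so the real Sanitizer logic would likely need to be more careful.

```python
from html.parser import HTMLParser


class OpenTableCounter(HTMLParser):
    """Tracks the number of currently unclosed <table> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.open_tables = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.open_tables += 1

    def handle_endtag(self, tag):
        # Ignore stray </table> tags that have nothing to close.
        if tag == "table" and self.open_tables > 0:
            self.open_tables -= 1


def table_closers(content):
    """Return enough synthetic </table> tags to close every open table in content."""
    counter = OpenTableCounter()
    counter.feed(content)
    counter.close()
    return "</table>" * counter.open_tables
```

For content that leaves two nested tables open, such as "<table><tr><td><table><tr><td>x", this yields two `</table>` tags. Over-emitting would also be acceptable under the scheme described above, since the tidy phase discards close tags that do not apply.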
--scott
--
(http://cscott.net)