On Wed, Nov 11, 2015 at 8:15 PM, Gergo Tisza gtisza@wikimedia.org wrote:
On Tue, Nov 10, 2015 at 1:40 PM, C. Scott Ananian cananian@wikimedia.org wrote:
- `{{#balance:inline}}` would only allow inline (i.e. phrasing) content and generate an error if a `<p>`/`<a>`/`<h*>`/`<table>`/`<tr>`/`<td>`/`<th>`/`<li>` tag is seen in the content.
Why is <a> in that list? It's not flow/block content model and filtering it out would severely restrict the usefulness of inline templates.
That's a good point. The problem is nested <a> tags. So we can either ban open <a> tags from the context, or ban <a> tags from the content. Or split things and have {{#balance:link}} vs {{#balance:inline}} or something like that. Feedback welcome! The details of HTML parsing are hairy, and I wouldn't be surprised if we need slight tweaks to things when we actually get to the point of implementation.
To be extra specific: if the context is:
<p>hello, there <a>friend <!-- template goes here --></p>
and the template is "foo <a>bar</a>", then HTML5 parsing will produce:
<p>hello, there <a>friend foo </a><a>bar</a></p>
where the added closing </a> tag inside the template body prevents "simple substitution" of the template contents.
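A rough sketch of how the conflict could be detected, assuming we can see the markup preceding the insertion point. This is Python using only the standard-library `html.parser`, not MediaWiki code; `OpenTagTracker`, `scan` and `link_conflict` are hypothetical names, and the open-element stack is naive (no HTML5 error recovery).

```python
from html.parser import HTMLParser

# Void elements never get an end tag, so they must not stay on the stack.
VOID = {"br", "hr", "img", "input", "meta", "link", "wbr"}


class OpenTagTracker(HTMLParser):
    """Track a naive stack of open elements (no HTML5 error recovery)."""

    def __init__(self):
        super().__init__()
        self.stack = []    # tag names that are currently open
        self.seen = set()  # every start tag name encountered

    def handle_starttag(self, tag, attrs):
        self.seen.add(tag)
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def scan(html):
    tracker = OpenTagTracker()
    tracker.feed(html)
    tracker.close()
    return tracker


def link_conflict(context_before, template_body):
    """True if an <a> is open at the insertion point and the body opens another."""
    return "a" in scan(context_before).stack and "a" in scan(template_body).seen


print(link_conflict("<p>hello, there <a>friend ", "foo <a>bar</a>"))
```

Either option above (ban the open `<a>` in the context, or ban `<a>` in the content) amounts to rejecting the case where this check is true.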
- We just need to emit synthetic `</p></table></...>` tokens, the tree builder will take care of closing a tag if necessary or else discarding the token.
What if there are multiple levels of unclosed tags?
We basically emit enough close tags to close anything which might be open, and let the tidy phase discard any which are not applicable.
Off-hand, I think the only tag where nesting would be an issue would be <table>. So I guess we'll need the Sanitizer to count the open <table> tags so we can be sure to emit enough close tags. Tricky! --scott
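
To make the counting idea concrete, here is a small sketch in Python (standard-library `html.parser` only). It is not the MediaWiki Sanitizer; `TableDepthCounter` and `synthetic_close_tags` are hypothetical names, and the `other_tags` default of just `p` is an assumption, since the full list of tags to close is not spelled out above.

```python
from html.parser import HTMLParser


class TableDepthCounter(HTMLParser):
    """Count <table> elements that are still open at the end of the input."""

    def __init__(self):
        super().__init__()
        self.depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1

    def handle_endtag(self, tag):
        # Ignore stray </table> tags rather than going negative.
        if tag == "table" and self.depth > 0:
            self.depth -= 1


def synthetic_close_tags(template_output, other_tags=("p",)):
    """Build the string of close tags to append after the template output."""
    counter = TableDepthCounter()
    counter.feed(template_output)
    counter.close()
    # One close tag for each non-nesting tag (assumed list), then one
    # </table> per open table. At least one </table> is always emitted,
    # on the assumption that a later phase discards tokens that do not apply.
    closers = "".join(f"</{tag}>" for tag in other_tags)
    return closers + "</table>" * max(1, counter.depth)


print(synthetic_close_tags("<table><tr><td><table><tr><td>x"))
```

For the doubly nested input above, this emits `</p>` followed by two `</table>` tags.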