Occasionally it's useful to pass trusted data to JavaScript using data attributes on elements, where you know the data is not from the user. In the past, there have been security issues from using data attributes for information that was assumed to be trusted but in reality could be tampered with by the user. (T58699, T105413)
We already reserve data-ooui (by reserve, I mean blacklist in the sanitizer), but it feels wrong to use that for parts of MediaWiki that are not OOUI. I would like to propose that we also reserve the data-mw- prefix for general usage by MediaWiki and extensions. By that I mean any attribute starting with data-mw- would be blocked by the sanitizer: if a user writes <span data-mw-foo="bar"></span> on a wiki page, the data-mw-foo attribute would be stripped. Thus if JavaScript sees such an attribute, it can know for sure that the value is not direct untrusted user input.
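To make the proposed behaviour concrete, here is a rough sketch of the filtering step. It is not MediaWiki's actual sanitizer code (which is PHP); the function and constant names are made up for illustration, and treating data-ooui as an exact name rather than a prefix is an assumption.

```python
# Hypothetical sketch of the proposed rule, not MediaWiki's real Sanitizer.
# Assumption: data-ooui is blocked as an exact attribute name,
# while data-mw- is blocked as a prefix.
RESERVED_NAMES = {"data-ooui"}
RESERVED_PREFIXES = ("data-mw-",)


def is_reserved(name):
    """Return True if a user-supplied attribute name must be stripped."""
    name = name.lower()
    return name in RESERVED_NAMES or name.startswith(RESERVED_PREFIXES)


def strip_reserved_attributes(attrs):
    """Return a copy of attrs without any reserved data attributes."""
    return {name: value for name, value in attrs.items() if not is_reserved(name)}


# Attributes as a user might write them on a wiki page.
user_attrs = {"class": "note", "data-mw-foo": "bar", "data-ooui": "{}", "data-other": "ok"}
print(strip_reserved_attributes(user_attrs))
# prints {'class': 'note', 'data-other': 'ok'}
```

Note that with only the data-mw- prefix reserved, a bare data-mw attribute would not be caught by this rule.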
Anyone have any objections to doing this?
Bikeshed now about the choice of name for the prefix, or forever hold your peace ;)
-- -bawolff
On 11/02/2015 05:11 AM, Brian Wolff wrote:
We already reserve data-ooui (by reserve, I mean blacklist in the sanitizer), but it feels wrong to use that for parts of MediaWiki that are not OOUI. I would like to propose that we also reserve the data-mw- prefix for general usage by MediaWiki and extensions. By that I mean any attribute starting with data-mw- would be blocked by the sanitizer: if a user writes <span data-mw-foo="bar"></span> on a wiki page, the data-mw-foo attribute would be stripped. Thus if JavaScript sees such an attribute, it can know for sure that the value is not direct untrusted user input.
Parsoid currently generates data-parsoid and data-mw attributes. The data-mw attribute is used to convey semantic information in the HTML (images, templates, extensions) as per https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec
So "data-mw" itself as an attribute should not be used by extensions, but "data-mw-" as a prefix should be fine.
If Parsoid encounters data-mw or data-parsoid on HTML tags in wikitext, it renames them with a data-x- prefix (as below).
[subbu@earth tests] echo "<span data-mw='foo' data-parsoid='bar'>x</span>" | node parse --normalize
<p><span data-x-data-mw="foo" data-x-data-parsoid="bar">x</span></p>
Renaming is another possibility for dealing with conflicting attributes, but I am not sure what purpose it would serve here. For Parsoid, we have to do this so that we preserve the original wikitext unchanged. So blacklisting seems fine in this case. Whenever that is done, we should consider adding data-parsoid and data-mw to the list of blacklisted attributes so we can get rid of the attribute-renaming code in Parsoid.
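For comparison with blacklisting, the renaming approach can be sketched roughly as follows. This is only an illustration of the idea, not Parsoid's actual code (Parsoid is a Node.js service); the function names are hypothetical, and the reverse step is an assumption about how the original attribute names could be restored when converting back to wikitext.

```python
# Hypothetical sketch of attribute renaming, not Parsoid's real implementation.
PARSOID_RESERVED = {"data-mw", "data-parsoid"}
ESCAPE_PREFIX = "data-x-"


def escape_reserved_attributes(attrs):
    """Rename user-supplied reserved attributes so they cannot collide."""
    return {
        (ESCAPE_PREFIX + name if name in PARSOID_RESERVED else name): value
        for name, value in attrs.items()
    }


def unescape_reserved_attributes(attrs):
    """Assumed reverse step: restore the names the user originally wrote."""
    restored = {}
    for name, value in attrs.items():
        original = name[len(ESCAPE_PREFIX):]
        if name.startswith(ESCAPE_PREFIX) and original in PARSOID_RESERVED:
            restored[original] = value
        else:
            restored[name] = value
    return restored


user_attrs = {"data-mw": "foo", "data-parsoid": "bar"}
escaped = escape_reserved_attributes(user_attrs)
print(escaped)
# prints {'data-x-data-mw': 'foo', 'data-x-data-parsoid': 'bar'}
```

Unlike stripping, renaming keeps the user's values around, which is what lets the original wikitext be reproduced unchanged.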
Subbu.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l