Hi all! This is an announcement for a new developer feature in MediaWiki. If you don’t develop MediaWiki core, extensions or skins, you can stop reading :)
MediaWiki interface messages are generally “safe” to edit: when they contain markup, it is either parsed (as wikitext), sanitized, or fully HTML-escaped. For this reason, administrators are allowed to edit normal messages on-wiki in the MediaWiki: namespace, while editing JS code (which is more dangerous) is restricted to interface administrators. (A few exceptions, messages that are not escaped and which can only be edited by interface administrators, are configured in $wgRawHtmlMessages https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:$wgRawHtmlMessages.) Occasionally, a bug in the software means that a message isn’t properly escaped, which can in theory be abused by administrators to effectively gain interface administrator powers (by editing a MediaWiki: page for a message to contain <script> tags, or onclick="" attributes, or whatever). Such bugs are usually considered low-severity security issues; some of them are tracked in T2212 https://phabricator.wikimedia.org/T2212. (The general issue is known as cross-site scripting https://www.mediawiki.org/wiki/Special:MyLanguage/Cross-site_scripting and can be much more severe when it’s not limited to interface messages.)
Previously, checking for these issues as a developer was tedious: if you suspected that a message was vulnerable to HTML injection, you had to create a page for it in the MediaWiki: namespace, or edit the corresponding en.json file on disk (and potentially rebuild the localisation cache). The recently merged “xss language code” feature simplifies this process. If the developer setting $wgUseXssLanguage is set to true, then an “x-xss” language code becomes available and can be selected with ?uselang=x-xss in the URL. When using this language code, all messages become “malicious”: every message is replaced by a snippet of HTML that tries to run alert('*message-key*'). If everything is implemented correctly, all of those HTML snippets should be escaped, and no alerts should fire, although the wiki will look quite strange.
If you see any alert, then that means that a message has not been escaped correctly; use the message key shown in the alert to hunt down the buggy code (or add the message key to $wgRawHtmlMessages). This feature is intended to be especially useful during code review: check out the change, load a page with ?uselang=x-xss, and see if any alerts come up.
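To make the idea concrete, here is a minimal Python sketch of the principle. It is not MediaWiki code: the payload shape, the example message keys, and the helper names (xss_payload, render_escaped, render_raw, keys_that_would_fire) are all assumptions for illustration. The announcement only says that each message becomes an HTML snippet that tries to run alert() with the message key.

```python
import html
import re


def xss_payload(message_key: str) -> str:
    """Hypothetical stand-in for an x-xss message: HTML that would call alert() with the key."""
    return f"<script>alert('{message_key}')</script>"


def render_escaped(message_key: str) -> str:
    """Correct handling: the message text is HTML-escaped before it is output."""
    return html.escape(xss_payload(message_key))


def render_raw(message_key: str) -> str:
    """Buggy handling: the message text is output without any escaping."""
    return xss_payload(message_key)


def keys_that_would_fire(page_html: str) -> list[str]:
    """Return the message keys whose payload survived as live markup in the page."""
    return re.findall(r"<script>alert\('([^']+)'\)</script>", page_html)


# One message is escaped properly, the other one is not (both keys are made up).
page = (
    "<p>" + render_escaped("example-safe-message") + "</p>"
    + "<p>" + render_raw("example-buggy-message") + "</p>"
)
print(keys_that_would_fire(page))
```

Only the key of the unescaped message is reported, which mirrors how the alert text points you at the message that needs fixing.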
Miscellaneous notes:
- This is a developer-only feature. I strongly recommend against setting $wgUseXssLanguage = true; in any production setting. (It will be added to DevelopmentSettings.php soon.)
- Above, I focused on the possibility of abusing unescaped messages via the MediaWiki: namespace. You might also be thinking about the potential for translatewiki.net contributors to inject malicious HTML into message translations; however, the translation exports from translatewiki.net to the JSON files automatically check for any HTML in translations, and flag suspicious cases for human review. Therefore, it’s much harder to exploit an unescaped message via translatewiki.net than via the MediaWiki: namespace.
- Finally, I should mention that we already found several vulnerabilities using this feature, which will be fixed with the upcoming security release https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/thread/QTRFMDRQAL7QK4RN53URX5YBBV744AWI/. If you try out this feature now, and find a vulnerable message, I suggest you wait until then, and check whether it’s still vulnerable, before reporting it.
Cheers, Lucas
This is clearly yielding some interesting results.
One of the patterns I've noticed is that several of the examples seem to involve Mustache templates. I think there are two reasons for this:
* Mustache templates cannot currently be checked by phan-taint-check.
* Because they are a separate file, the escaping is now fairly far away from the context where the variable is used. It's easy to lose track of whether a specific variable is supposed to be escaped between the template file and the call into TemplateProcessor.
Anyway, I'd like to propose a naming convention. Any Mustache variable that is used as raw HTML should have some sort of easily identifiable prefix, so it is easy to keep track of which parameters are escaped and which are not. E.g. instead of naming the parameter foo, it would be named something like HTMLFoo.
Thoughts?
-- Brian
On Thu, Sep 28, 2023 at 9:01 AM Lucas Werkmeister <lucas.werkmeister@wikimedia.de> wrote:
-- Lucas Werkmeister (he/er)
Software Engineer
Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30-577 11 62-0
https://wikimedia.de
Imagine a world in which every single human being can freely share in the sum of all knowledge. Help us to achieve our vision! https://spenden.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Registered in the register of associations of the Amtsgericht Charlottenburg (local court), VR 23855 B. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin (tax office), tax number 27/029/42207.
_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-leave@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
On 2023-09-29 19:55, bawolff wrote:
> This is clearly yielding some interesting results.
> One of the patterns I've noticed is that several of the examples seem to involve Mustache templates. I think there are two reasons for this:
> - Mustache templates cannot currently be checked by phan-taint-check.
> - Because they are a separate file, the escaping is now fairly far away from the context where the variable is used. It's easy to lose track of whether a specific variable is supposed to be escaped between the template file and the call into TemplateProcessor.
Let's not go too easy on Mustache; there are several more reasons why these templates are full of security gaps:
* Escaping or failing to escape HTML is the difference between {{ }} and {{{ }}}, and unless you've spent your whole life writing Mustache templates, you won't remember which is which.
* Mustache has no concept of HTML structure, or any structure, or variable types; it just concatenates strings, so it's difficult to automatically detect any problems.
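For anyone who needs the reminder, the difference can be sketched with a toy renderer in Python. This is only an illustration that handles plain variable tags; it is not a full Mustache implementation and not the code MediaWiki uses, and the render function and the label variable are made up for this example.

```python
import html
import re

# Triple-brace alternative comes first so {{{name}}} is never read as {{name}}.
TAG = re.compile(r"\{\{\{\s*([\w.-]+)\s*\}\}\}|\{\{\s*([\w.-]+)\s*\}\}")


def render(template: str, context: dict[str, str]) -> str:
    """Toy subset of Mustache: {{{name}}} inserts raw text, {{name}} inserts HTML-escaped text."""
    def substitute(match: re.Match) -> str:
        raw_name, escaped_name = match.group(1), match.group(2)
        if raw_name is not None:
            # Triple braces: the value is inserted exactly as given.
            return context.get(raw_name, "")
        # Double braces: the value is HTML-escaped first.
        return html.escape(context.get(escaped_name, ""))
    return TAG.sub(substitute, template)


context = {"label": "<img src=x onerror=alert(1)>"}
print(render("<span>{{label}}</span>", context))    # escaped, harmless
print(render("<span>{{{label}}}</span>", context))  # raw, the markup goes through
```

The two templates differ by a single pair of braces, and since the renderer only substitutes strings, nothing in it can tell that the second one is dangerous.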
> Anyway, I'd like to propose a naming convention. Any Mustache variable that is used as raw HTML should have some sort of easily identifiable prefix, so it is easy to keep track of which parameters are escaped and which are not. E.g. instead of naming the parameter foo, it would be named something like HTMLFoo.
We already do this, at least! Most Mustache variables used as raw HTML are prefixed with 'html-'. Vector is pretty consistent about this [1], but even it has some exceptions. Other code is not always as consistent.
[1] https://codesearch.wmcloud.org/search/?q=%7B%7B%7B&files=%5C.mustache%24&excludeFiles=&repos=Skin%3AVector
I’d also like to discourage the Mustache “.” feature (“current context”, as in {{#html-items}}{{{.}}}{{/html-items}}), at least in unescaped HTML (i.e. {{{.}}}) but perhaps also in escaped HTML ({{.}}) – it made one of the related issues much harder to debug for me, because I couldn’t even find the template that was using the unescaped variable. (Admittedly, part of this was just because I didn’t know this feature existed.)
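Both the 'html-' prefix convention and the concern about "." could in principle be checked mechanically. Below is a rough Python sketch of such a check; it is a hypothetical helper, not an existing tool, and it relies on simple regular expressions rather than a real Mustache parser, so treat the function name and the message wording as assumptions.

```python
import re

# Unescaped tags such as {{{html-foo}}} or {{{.}}}; the group captures the name.
RAW_TAG = re.compile(r"\{\{\{\s*([^\s{}]+)\s*\}\}\}")
# Escaped current context {{.}}; the lookarounds keep it from matching inside {{{.}}}.
ESCAPED_DOT = re.compile(r"(?<!\{)\{\{\s*\.\s*\}\}(?!\})")


def check_template(source: str) -> list[str]:
    """Report raw variables without the 'html-' prefix and uses of the '.' context."""
    problems = []
    for match in RAW_TAG.finditer(source):
        name = match.group(1)
        if name == ".":
            problems.append("unescaped current context: {{{.}}}")
        elif not name.startswith("html-"):
            problems.append(f"unescaped variable without 'html-' prefix: {name}")
    for _ in ESCAPED_DOT.finditer(source):
        problems.append("escaped current context: {{.}}")
    return problems


# Made-up template text that exercises every case.
template = (
    "<ul>{{#html-items}}<li>{{{.}}}</li>{{/html-items}}</ul>"
    "<h1>{{{title}}}</h1>{{{html-body}}}<p>{{.}}</p>"
)
for problem in check_template(template):
    print(problem)
```

A check like this only looks at names, so it cannot tell whether a value passed as html-something was actually built safely; it just makes the unescaped spots easier to find.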
On Fri, 29 Sept 2023 at 21:55, Bartosz Dziewoński <matma.rex@gmail.com> wrote:
[1] https://codesearch.wmcloud.org/search/?q=%7B%7B%7B&files=%5C.mustache%24&excludeFiles=&repos=Skin%3AVector
-- Bartosz Dziewoński