The 1.22 release adds a new configuration variable, $wgRedactedFunctionArguments, that allows administrators of MediaWiki installations to designate function arguments that should be redacted from stack traces. For example:
$wgRedactedFunctionArguments[] = array('User::comparePasswords' => array( 0, 1 ) )
This value specifies that whenever the method User::comparePasswords appears in an exception stack trace, the value of the first and second arguments are to be redacted, meaning the string 'REDACTED' is inserted where the argument value would otherwise appear. The case for this feature was made in https://bugzilla.wikimedia.org/show_bug.cgi?id=30714.
I think that it is a design blunder, and that we should remove it from MediaWiki. The task of making $wgRedactedFunctionArguments comprehensive is hopelessly gargantuan. It would require something like a full trace of the flow of data throughout all of MediaWiki and its extensions.
The best case scenario is that the feature is not used. The worst case -- which seems to be materializing, judging by the default value assigned to it in DefaultSettings.php -- is that it is used with a short and chronically out-of-date list of obvious suspects, and that the resultant traces leak private data. This is actually worse than making no attempt at all to sanitize traces, because it is liable to mislead and inspire false feelings of confidence.
I think that the proper way to handle low-level operational data like stack traces is to make it clear that it is liable to contain sensitive information, and to make no pretense at all of sanitizing it. If there is a credible need for outputting redacted traces, the output should exclude *all* values, not just the ones that someone remembered to configure.
---
"Base access decisions on permission rather than exclusion. This principle...means that the default situation is lack of access, and the protection scheme identifies conditions under which access is permitted. The alternative, in which mechanisms attempt to identify conditions under which access should be refused, presents the wrong psychological base for secure system design. A conservative design must be based on arguments why objects should be accessible, rather than why they should not. In a large system some objects will be inadequately considered, so a default of lack of permission is safer. A design or implementation mistake in a mechanism that gives explicit permission tends to fail by refusing permission, a safe situation, since it will be quickly detected. On the other hand, a design or implementation mistake in a mechanism that explicitly excludes access tends to fail by allowing access, a failure which may go unnoticed in normal use. This principle applies both to the outward appearance of the protection mechanism and to its underlying implementation."
("The Protection of Information in Computer Systems", http://web.mit.edu/Saltzer/www/publications/protection/Basic.html) --- Ori Livneh ori@wikimedia.org
On 2013-10-28 1:43 AM, "Ori Livneh" ori@wikimedia.org wrote:
The 1.22 release adds a new configuration variable, $wgRedactedFunctionArguments, that allows administrators of MediaWiki installations to designate function arguments that should be redacted from stack traces. For example:
$wgRedactedFunctionArguments[] = array('User::comparePasswords' => array( 0, 1 ) )
This value specifies that whenever the method User::comparePasswords appears in an exception stack trace, the value of the first and second arguments are to be redacted, meaning the string 'REDACTED' is inserted where the argument value would otherwise appear. The case for this feature was made in https://bugzilla.wikimedia.org/show_bug.cgi?id=30714.
I think that it is a design blunder, and that we should remove it from MediaWiki. The task of making $wgRedactedFunctionArguments comprehensive is hopelessly gargantuan. It would require something like a full trace of the flow of data throughout all of MediaWiki and its extensions.
The best case scenario is that the feature is not used. The worst case -- which seems to be materializing, judging by the default value assigned to it in DefaultSettings.php -- is that it is used with a short and chronically out-of-date list of obvious suspects, and that the resultant traces leak private data. This is actually worse than making no attempt at all to sanitize traces, because it is liable to mislead and inspire false feelings of confidence.
I think that the proper way to handle low-level operational data like stack traces is to make it clear that it is liable to contain sensitive information, and to make no pretense at all of sanitizing it. If there is a credible need for outputting redacted traces, the output should exclude *all* values, not just the ones that someone remembered to configure.
"Base access decisions on permission rather than exclusion. This principle...means that the default situation is lack of access, and the protection scheme identifies conditions under which access is permitted. The alternative, in which mechanisms attempt to identify conditions under which access should be refused, presents the wrong psychological base for secure system design. A conservative design must be based on arguments why objects should be accessible, rather than why they should not. In a large system some objects will be inadequately considered, so a default of lack of permission is safer. A design or implementation mistake in a mechanism that gives explicit permission tends to fail by refusing permission, a safe situation, since it will be quickly detected. On the other hand, a design or implementation mistake in a mechanism that explicitly excludes access tends to fail by allowing access, a failure which may go unnoticed in normal use. This principle applies both to the outward appearance of the protection mechanism and to its underlying implementation."
("The Protection of Information in Computer Systems", http://web.mit.edu/Saltzer/www/publications/protection/Basic.html)
Ori Livneh ori@wikimedia.org
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
I would really like to see stack traces enabled on wmf wikis again. Really really like to see. Seriously, i'll give a unicorn to anyone who makes that happen.
However If that's the goal, it would probably be safer to just totally disable method arguments in the stack trace for the web output only leaving the method names. Imo that would still give enough useful info (a good percentage of the time the exception message in itself is enough info to figure out what's going on or at least triage bug reports into dupes/non-dupes)
That said, redacting might still make sense if we don't give any indication its fully safe. Lots of third party wikis have exceptions enabled in web output regardless. Might as well blacklist the bad things we know as they are going to have it enabled anyhow.
-bawolff
On Mon, 28 Oct 2013 12:00:23 +0100, Brian Wolff bawolff@gmail.com wrote:
I would really like to see stack traces enabled on wmf wikis again. Really really like to see. Seriously, i'll give a unicorn to anyone who makes that happen.
I'm with Brian here. The blacklist approach is not ideal, design-wise; but the unsafe arguments *should* be limited to a very few functions, and if we have to give up some design ideals for stacktraces, so be it.
On Mon, Oct 28, 2013 at 4:00 AM, Brian Wolff bawolff@gmail.com wrote: [snip]
it would probably be safer to just totally disable method arguments in the stack trace for the web output only leaving the method names. Imo that would still give enough useful info (a good percentage of the time the exception message in itself is enough info to figure out what's going on or at least triage bug reports into dupes/non-dupes)
I would love to see stack traces without parameters -- as far as I know, "just the function names" is all you get from Java etc, so it's hardly without precedent to do so, and it neatly solves the "omg private info" problem.
-- brion
On Mon, Oct 28, 2013 at 12:42 AM, Ori Livneh ori@wikimedia.org wrote:
I think that the proper way to handle low-level operational data like stack traces is to make it clear that it is liable to contain sensitive information, and to make no pretense at all of sanitizing it.
I don't think the idea here was to ever make the stack traces *safe*, just to redact the most obvious things to reduce the risk if someone carelessly posts a stack trace publicly.
Personally, I think the "Java model" as exemplified in https://gerrit.wikimedia.org/r/#/c/92334/ PS3 goes too far in the other direction. In this case, an option to log unredacted traces that I could enable on my local test wiki would be useful.
I don't think the idea here was to ever make the stack traces *safe*, just to redact the most obvious things to reduce the risk if someone carelessly posts a stack trace publicly.
Personally, I think the "Java model" as exemplified in https://gerrit.wikimedia.org/r/#/c/92334/ PS3 goes too far in the other direction. In this case, an option to log unredacted traces that I could enable on my local test wiki would be useful.
I think Ori's original point stands though. Configuration could be used to redact fully / not redact at all for local debugging purposes. But a black list for what to redact is bad for all the reasons black lists are bad security in general.
On Tue, Oct 29, 2013 at 9:55 AM, Dan Andreescu dandreescu@wikimedia.org wrote:
I think Ori's original point stands though. Configuration could be used to redact fully / not redact at all for local debugging purposes. But a black list for what to redact is bad for all the reasons black lists are bad security in general.
Theoretically speaking, the right way to do this would be to identify the (small, one hopes) number of *sources* of sensitive data and change them to return objects of a special class, which would then automatically print out as "[REDACTED]" (if so configured) in a stack trace. This would have other benefits; for instance, the special class could arrange to handle the data extra-carefully (scrubbing it from memory when no longer required, doing constant-time comparisons, that sort of thing) and code that needed to treat the datum as anything other than an opaque blob would have to explicitly unwrap it, which would then be a red flag for code review.
I don't have any idea how hard this would be; I'd guess somewhere between "conceptually easy but a huge number of tedious almost-but-not-quite-mechanical changes to implement" and "infeasible due to API breakage".
zw
On 10/29/2013 11:48 AM, Zack Weinberg wrote:
Theoretically speaking, the right way to do this would be to identify the (small, one hopes) number of *sources* of sensitive data and change them to return objects of a special class, which would then automatically print out as "[REDACTED]" (if so configured) in a stack trace. This would have other benefits; for instance, the special class could arrange to handle the data extra-carefully (scrubbing it from memory when no longer required, doing constant-time comparisons, that sort of thing) and code that needed to treat the datum as anything other than an opaque blob would have to explicitly unwrap it, which would then be a red flag for code review.
I don't agree with this. Whitelists are the preferred approach theoretically, and there have been many cases where blacklists have failed in practice. This is all the more true when the set of possibilities is big (all MW functions).
Even if we get all the sensitive functions or data now (difficult), it will probably not hold up for code in extensions, hooks, and just future core changes where no one things of $wgRedactedFunctionArguments
Ori is right that blacklists are not safe here. I think traces that show just method names (for public wikis) or (for private wikis, if a config is explicitly set true) everything (no redaction) makes sense.
Matt Flaschen
On Tue, Oct 29, 2013 at 6:55 AM, Dan Andreescu dandreescu@wikimedia.org wrote:
I don't think the idea here was to ever make the stack traces *safe*, just to redact the most obvious things to reduce the risk if someone carelessly posts a stack trace publicly.
Personally, I think the "Java model" as exemplified in https://gerrit.wikimedia.org/r/#/c/92334/ PS3 goes too far in the other direction. In this case, an option to log unredacted traces that I could enable on my local test wiki would be useful.
I think Ori's original point stands though. Configuration could be used to redact fully / not redact at all for local debugging purposes. But a black list for what to redact is bad for all the reasons black lists are bad security in general.
I think the approach we are converging on is this:
- Always redact all argument values for user-facing backtraces - Never redact any argument values for wfDebugLog()'d backtraces - Redact arguments by replacing each argument with the name of its class (if object) or type (if primitive).
The redacted traces look like this:
#0 /vagrant/mediawiki/extensions/Vector/Vector.hooks.php(82): functionThatFails(OutputPage) #1 [internal function]: VectorHooks::beforePageDisplay(string, string) #2 /vagrant/mediawiki/includes/Hooks.php(199): call_user_func_array(string, array) #3 /vagrant/mediawiki/includes/GlobalFunctions.php(3877): Hooks::run(string, array) #4 /vagrant/mediawiki/includes/OutputPage.php(2075): wfRunHooks(string, array) #5 /vagrant/mediawiki/includes/Wiki.php(610): OutputPage->output() #6 /vagrant/mediawiki/includes/Wiki.php(467): MediaWiki->main() #7 /vagrant/mediawiki/index.php(49): MediaWiki->run() #8 {main}
wikitech-l@lists.wikimedia.org