On 10/29/2013 11:48 AM, Zack Weinberg wrote:
Theoretically speaking, the right way to do this would be to identify the (small, one hopes) number of *sources* of sensitive data and change them to return objects of a special class, which would then automatically print out as "[REDACTED]" (if so configured) in a stack trace. This would have other benefits; for instance, the special class could arrange to handle the data extra-carefully (scrubbing it from memory when no longer required, doing constant-time comparisons, that sort of thing) and code that needed to treat the datum as anything other than an opaque blob would have to explicitly unwrap it, which would then be a red flag for code review.
I don't agree with this. Whitelists are the preferred approach theoretically, and there have been many cases where blacklists have failed in practice. This is all the more true when the set of possibilities is big (all MW functions).
Even if we catch all the sensitive functions or data now (difficult), that will probably not hold up for code in extensions, hooks, and future core changes where no one thinks of $wgRedactedFunctionArguments.
Ori is right that blacklists are not safe here. I think it makes sense for traces to show just method names on public wikis, and to show everything (no redaction) on private wikis only if a config setting is explicitly set to true.
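
To make the proposal concrete, here is a rough sketch of that policy. It is written in Python purely for illustration and is not MediaWiki code; the flag SHOW_FULL_TRACES and the functions format_trace, login, and handle_request are hypothetical names invented for this example. By default only function names are printed, so arguments and exception messages never reach the log; the full trace appears only when the flag is explicitly turned on.

```python
import traceback

# Hypothetical config flag. The default is the safe setting (names only).
SHOW_FULL_TRACES = False


def format_trace(exc, show_full=SHOW_FULL_TRACES):
    """Format a trace for exc.

    With show_full, return the complete, unredacted traceback.
    Otherwise return only the function names and the exception class,
    leaving out arguments, source lines, and the exception message.
    """
    if show_full:
        return "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    frames = traceback.extract_tb(exc.__traceback__)
    lines = ["#%d %s()" % (i, frame.name) for i, frame in enumerate(frames)]
    lines.append(type(exc).__name__)
    return "\n".join(lines)


def login(user, password):
    # The message deliberately leaks the password to show what redaction hides.
    raise RuntimeError("bad login for %s with %s" % (user, password))


def handle_request():
    login("alice", "hunter2")
```

The point of this shape is that nothing has to be listed as sensitive: a new extension or hook gets the safe behavior without anyone remembering to register it.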
Matt Flaschen