On 10/29/2013 11:48 AM, Zack Weinberg wrote:
Theoretically speaking, the right way to do this would be to identify
the (small, one hopes) number of *sources* of sensitive data and
change them to return objects of a special class, which would then
automatically print out as "[REDACTED]" (if so configured) in a stack
trace. This would have other benefits; for instance, the special class
could arrange to handle the data extra-carefully (scrubbing it from
memory when no longer required, doing constant-time comparisons, that
sort of thing) and code that needed to treat the datum as anything
other than an opaque blob would have to explicitly unwrap it, which
would then be a red flag for code review.
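
To make the wrapper idea above concrete, here is a minimal sketch in Python rather than MediaWiki's PHP. The class name `Sensitive` and the method `unwrap` are hypothetical names chosen for illustration, not anything that exists in MediaWiki. It shows only the redacted printing, the constant-time comparison, and the explicit unwrap; memory scrubbing is left out because Python cannot do it reliably.

```python
import hmac


class Sensitive:
    """Hypothetical opaque wrapper for a sensitive string (illustrative only)."""

    __slots__ = ("_value",)

    def __init__(self, value: str):
        self._value = value

    def __repr__(self) -> str:
        # Anything that prints the object (logs, debuggers, trace dumps)
        # sees the placeholder instead of the secret.
        return "[REDACTED]"

    __str__ = __repr__

    def __eq__(self, other) -> bool:
        if not isinstance(other, Sensitive):
            return NotImplemented
        # hmac.compare_digest is the standard library's constant-time compare.
        return hmac.compare_digest(self._value.encode(), other._value.encode())

    def unwrap(self) -> str:
        # The only way to get the raw value; easy to grep for in code review.
        return self._value
```

With this sketch, `print(Sensitive("hunter2"))` prints `[REDACTED]`, and any code that needs the real value has to call `.unwrap()` explicitly.
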
I don't agree with this. Whitelists are the preferred approach
theoretically, and there have been many cases where blacklists have
failed in practice. This is all the more true when the set of
possibilities is big (all MW functions).
Even if we catch all the sensitive functions and data now (difficult),
that will probably not hold for code in extensions, hooks, and future
core changes where no one thinks of $wgRedactedFunctionArguments.
Ori is right that blacklists are not safe here. I think it makes sense
for traces to show just method names on public wikis, or everything (no
redaction) on private wikis if a config setting is explicitly set to true.
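
A rough sketch of that policy, again in Python rather than PHP and with made-up names (`format_trace` and `show_full` are assumptions for illustration, not MediaWiki settings): by default only the exception type and the function names are emitted, and the full trace appears only when the flag is explicitly turned on.

```python
import traceback


def format_trace(exc: BaseException, show_full: bool = False) -> str:
    """Render a trace; show_full stands in for a hypothetical private-wiki flag."""
    if show_full:
        # Unredacted: standard formatting, including the exception message.
        return "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    # Allowlist approach: emit only the exception type and function names,
    # never arguments, messages, or source lines.
    frames = traceback.extract_tb(exc.__traceback__)
    lines = [type(exc).__name__]
    lines += [f"#{i} {frame.name}()" for i, frame in enumerate(frames)]
    return "\n".join(lines)
```

The point of the design is that the safe output is built only from fields known to be harmless, so a newly added function with sensitive arguments is covered without anyone having to remember to list it.
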
Matt Flaschen