On 8/16/13, Zack Weinberg zackw@cmu.edu wrote:
Hi, I'm a grad student at CMU studying network security in general and censorship / surveillance resistance in particular. I also used to work for Mozilla; some of you may remember me in that capacity. My friend Sumana Harihareswara asked me to comment on Wikimedia's plans for hardening the encyclopedia against state surveillance. I've read the discussion to date on this subject, but it was kinda all over the map, so I thought it would be better to start a new thread. Actually I'm going to start two threads: one for general site hardening and one specifically about traffic analysis. This is the one about site hardening, which should happen first. Please note that I am subscribed to wikitech-l but not wikimedia-l (but I have read the discussion over there).
The roadmap at https://blog.wikimedia.org/2013/08/01/future-https-wikimedia-projects/ looks to me to have the right shape, but there are some gaps and a few points of confusion.
The first step really must be to enable HTTPS unconditionally for everyone (whether or not logged in). I see on the roadmap that there is concern that this will lock out large groups of users, e.g. from China; a workaround simply *must* be found for this. Everything else that is worth doing is rendered ineffective if *any* application layer data is *ever* transmitted over an insecure channel. There is no point worrying about traffic analysis when an active man-in-the-middle can inject malicious JavaScript into unsecured pages, or a passive one can steal session cookies as they fly by in cleartext.
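To make "unconditionally" concrete: at minimum, every plain-HTTP request should get a permanent redirect to the HTTPS equivalent before any application logic runs. Here's a rough sketch of the idea as a generic Python/WSGI middleware (purely illustrative - in practice this belongs in the front-end caches/load balancers, and the names here are mine, not MediaWiki's):

    def force_https(app):
        """Wrap a WSGI app so every cleartext request is redirected to HTTPS."""
        def middleware(environ, start_response):
            if environ.get('wsgi.url_scheme') != 'https':
                host = environ.get('HTTP_HOST', 'en.wikipedia.org')
                location = 'https://' + host + environ.get('PATH_INFO', '/')
                if environ.get('QUERY_STRING'):
                    location += '?' + environ['QUERY_STRING']
                start_response('301 Moved Permanently', [('Location', location)])
                return [b'']
            return app(environ, start_response)
        return middleware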
As part of the engineering effort to turn on TLS for everyone, you should also provide SPDY, or whatever they're calling it these days. It's valuable not only for resisting traffic analysis, but also because it offers server-side efficiency gains that (in theory, anyway) should mitigate the TLS overhead somewhat.
After that's done, there's a grab bag of additional security refinements that are deployable immediately or with minimal-to-moderate engineering effort. The roadmap mentions HTTP Strict Transport Security; that should definitely happen. All cookies should be tagged both Secure and HttpOnly (which renders them invisible to accidental cleartext HTTP loads and to page JavaScript, respectively); now would also be a good time to prune your cookie requirements, ideally down to a single cookie that does not reveal on inspection whether or not someone is logged in. You should also do Content-Security-Policy, as strict as possible. I know this can be a huge amount of development effort, but the benefits are equally huge - we don't know exactly how it was done, but there's an excellent chance CSP on the hidden service would have prevented the exploit discussed here: https://blog.torproject.org/blog/hidden-services-current-events-and-freedom-...
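For concreteness, here's a sketch of the headers and cookie flags I'm talking about, written as a generic Python/WSGI fragment (illustrative only - the real CSP policy would need to be carefully tailored to MediaWiki, and the names here are hypothetical):

    # Headers to attach to every HTTPS response.
    SECURITY_HEADERS = [
        # HSTS: tell browsers to refuse to talk to us over cleartext HTTP
        # for the next year.
        ('Strict-Transport-Security', 'max-age=31536000; includeSubDomains'),
        # A deliberately strict example CSP; a production policy would need
        # to allow whatever MediaWiki actually loads.
        ('Content-Security-Policy', "default-src 'self'"),
    ]

    def set_session_cookie(headers, value):
        # Secure: never transmitted over plain HTTP.
        # HttpOnly: invisible to page JavaScript.
        headers.append(('Set-Cookie',
                        'session=%s; Secure; HttpOnly; Path=/' % value))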
Several people raised concerns about Wikimedia's certificate authority becoming compromised (whether by traditional "hacking", social engineering, or government coercion). The best available cure for this is called "certificate pinning", which is unfortunately only doable by talking to browser vendors right now; however, I imagine they would be happy to apply pins for Wikipedia. There's been some discussion of an HSTS extension that would apply a pin (http://tools.ietf.org/html/draft-evans-palmer-key-pinning-00) and it's also theoretically doable via DANE (http://tools.ietf.org/html/rfc6698); however, AFAIK no one implements either of these things yet, and I rate it moderately likely that DANE is broken-as-specified. DANE requires DNSSEC, which is worth implementing for its own sake (it appears that the wikipedia.org. and wikimedia.org. zones are not currently signed).
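For what it's worth, both the key-pinning draft and DANE's SPKI selector work from a hash of the certificate's SubjectPublicKeyInfo, so the pin value can be computed offline. A sketch, assuming the third-party Python "cryptography" library and a certificate file of your choosing (the path is made up):

    import base64, hashlib
    from cryptography import x509
    from cryptography.hazmat.primitives.serialization import (
        Encoding, PublicFormat)

    # 'cert.pem' is whatever certificate you intend to pin (hypothetical path).
    with open('cert.pem', 'rb') as f:
        cert = x509.load_pem_x509_certificate(f.read())

    # Hash the DER-encoded SubjectPublicKeyInfo and base64 it, which is the
    # form a pin value generally takes.
    spki = cert.public_key().public_bytes(
        Encoding.DER, PublicFormat.SubjectPublicKeyInfo)
    print(base64.b64encode(hashlib.sha256(spki).digest()).decode())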
Perfect forward secrecy should also be considered at this stage. Folks seem to be confused about what PFS is good for. It is *complementary* to traffic analysis resistance, but it's not useless in the absence of it. What it does is provide defense in depth against a server compromise by a well-heeled entity who has been logging traffic *contents*. If you don't have PFS and the server is compromised, *all* recorded traffic, going back potentially for years, is decryptable - including cleartext passwords and other equally valuable info. If you do have PFS, the exposure is limited to the session rollover interval. Browsers are fairly aggressively moving away from non-PFS ciphersuites (see https://briansmith.org/browser-ciphersuites-01.html; all of the non-"deprecated" suites are PFS).
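If you want to see where things stand, it's easy to check what a default client negotiates against the current servers. A quick sketch using nothing but the Python standard library (the host name is just an example):

    import socket, ssl

    def negotiated_cipher(host, port=443):
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port)) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return tls.cipher()  # (name, protocol, secret_bits)

    name, proto, bits = negotiated_cipher('en.wikipedia.org')
    # ECDHE/DHE key exchange is forward secret; TLS 1.3 suites always are.
    print(name, proto, bits,
          'PFS' if name.startswith(('ECDHE', 'DHE')) or proto == 'TLSv1.3'
          else 'no PFS')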
Finally, consider paring back the set of ciphersuites accepted by your servers. Hopefully we will soon be able to ditch TLS 1.0 entirely (all of its ciphersuites have at least one serious flaw). Again, see https://briansmith.org/browser-ciphersuites-01.html for the current thinking from the browser side.
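To illustrate the kind of restriction I mean, here's how it would look for a Python ssl server context; the production servers would express the same thing in whatever TLS terminator they actually use, and the certificate paths below are made up:

    import ssl

    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain('server-chain.pem', 'server.key')  # hypothetical paths
    # OpenSSL cipher-string syntax: forward-secret ECDHE key exchange with
    # AES-GCM only, and explicitly no RC4, anonymous, or MD5-based suites.
    ctx.set_ciphers('ECDHE+AESGCM:!RC4:!aNULL:!MD5')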
zw
Thanks for taking the time to write these two emails. You raise an interesting point about having everything on one domain. I really don't think that's practical for political reasons (not to mention technical disruption), but it would allow people to be more lost in the crowd, especially for small languages. Some of the discussion about this stuff has taken place on bugzilla. Have you read through https://bugzilla.wikimedia.org/show_bug.cgi?id=47832 ?
Personally I think we need to make a more formal list of all the potential threats we could face, and then expand that list to include what we would need to do to protect ourselves from the different types of threats (or which threats we choose not to care about). Some kid who downloads a firesheep-type program is a very different type of threat than a state agent, and a state agent that is just trying to do broad spying is different from a state agent targeting a specific user. Lots of these discussions seem to end up being: let's do everything to try to protect against everything. I don't think that's the right mindset - you can't protect against everything, and if you don't know what specifically you are trying to protect against, you end up missing things.
Tyler said:
With that said, I think the real problem with Wikimedia's security right now is a pretty big failure on the part of the operations team to inform anybody as to what the hell is going on. Why hasn't the TLS cipher list been updated? Why are we still using RC4 even though it's obviously a terrible option? Why isn't Wikimedia using DNSSEC, let alone DANE? I'm sure the operations team is doing quite a bit of work, but all of these things should be trivial configuration changes, and if they aren't, the community should be told why so we know why these changes haven't been applied yet.
While I would certainly love to have more details on the trials and tribulations of the ops team, I think it's a little unfair to consider deploying DNSSEC a "trivial config change". Changing things like our DNS infrastructure is not something (or at least I imagine it's not something) done on the spur of the moment. It needs testing and has to be done cautiously, since if something bad happens the entire site goes down. That said, it would be nice for ops to comment on their thoughts on DNSSEC (for reference, bug 24413).
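(If anyone wants to check the current state themselves, here's a quick sketch of testing whether a zone has a DS record in its parent - i.e. whether a signed delegation exists - assuming the third-party dnspython library:)

    import dns.resolver

    def has_signed_delegation(zone):
        # A DS record in the parent zone is the visible sign that the child
        # zone is DNSSEC-signed and the chain of trust is hooked up.
        try:
            dns.resolver.resolve(zone, 'DS')
            return True
        except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
            return False

    for zone in ('wikipedia.org', 'wikimedia.org'):
        print(zone, has_signed_delegation(zone))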
--bawolff