Hi, I'm a grad student at CMU studying network security in general and censorship / surveillance resistance in particular. I also used to work for Mozilla; some of you may remember me in that capacity. My friend Sumana Harihareswara asked me to comment on Wikimedia's plans for hardening the encyclopedia against state surveillance. I've read the discussion to date on this subject, but it was kinda all over the map, so I thought it would be better to start a new thread. Actually, I'm going to start two threads: one for general site hardening and one specifically about traffic analysis. This is the one about site hardening, which should happen first. Please note that I am subscribed to wikitech-l but not wikimedia-l (though I have read the discussion over there).
The roadmap at https://blog.wikimedia.org/2013/08/01/future-https-wikimedia-projects/ looks to me to have the right overall shape, but there are some missing pieces and points of confusion.
The first step really must be to enable HTTPS unconditionally for everyone (whether or not logged in). I see on the roadmap that there is concern that this will lock out large groups of users, e.g. from China; a workaround simply *must* be found for this. Everything else that is worth doing is rendered ineffective if *any* application layer data is *ever* transmitted over an insecure channel. There is no point worrying about traffic analysis when an active man-in-the-middle can inject malicious JavaScript into unsecured pages, or a passive one can steal session cookies as they fly by in cleartext.
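To make the "no cleartext, ever" rule concrete, here's a minimal sketch of what the front end has to guarantee, written as a Python/WSGI middleware. This is mine, not anything from the roadmap, the names are made up, and a real deployment would enforce this in the caching / TLS-termination layer rather than in application code:

    # Minimal sketch, assuming a Python/WSGI front end (not Wikimedia's
    # actual stack): every cleartext request gets a permanent redirect to
    # its HTTPS equivalent before any application code runs.
    def force_https(app):
        def middleware(environ, start_response):
            if environ.get("wsgi.url_scheme") != "https":
                location = ("https://" + environ.get("HTTP_HOST", "")
                            + environ.get("PATH_INFO", "/"))
                if environ.get("QUERY_STRING"):
                    location += "?" + environ["QUERY_STRING"]
                start_response("301 Moved Permanently",
                               [("Location", location),
                                ("Content-Length", "0")])
                return [b""]
            return app(environ, start_response)
        return middleware

The important property is that the redirect (and, later, HSTS) happens for every request without exception; any path that still serves application data over HTTP undoes the rest of the work.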
As part of the engineering effort to turn on TLS for everyone, you should also provide SPDY, or whatever they're calling it these days. It's valuable not only for traffic-analysis resistance, but because it offers server-side efficiency gains that (in theory, anyway) should mitigate the TLS overhead somewhat.
After that's done, there's a grab bag of additional security refinements that are deployable immediately or with minimal-to-moderate engineering effort. The roadmap mentions HTTP Strict Transport Security; that should definitely happen. All cookies should be tagged both Secure and HttpOnly (which render them inaccessible to accidental HTTP loads and to page JavaScript, respectively); now would also be a good time to prune your cookie requirements, ideally to just one cookie which does not reveal via inspection whether or not someone is logged in. You should also deploy Content-Security-Policy, as strict as possible. I know this can be a huge amount of development effort, but the benefits are equally huge - we don't know exactly how it was done, but there's an excellent chance CSP on the hidden service would have prevented the exploit discussed here: https://blog.torproject.org/blog/hidden-services-current-events-and-freedom-...
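For concreteness, here is roughly what those headers and cookie attributes look like. The values below are illustrative examples from me, not a proposed policy for Wikimedia; in particular, the CSP string would have to be worked out against how MediaWiki actually loads scripts and styles:

    # Example values only; tune max-age and the CSP source lists to the site.
    SECURITY_HEADERS = [
        # HSTS: browsers that have seen this refuse plain HTTP to the host
        # (and its subdomains) for the next year.
        ("Strict-Transport-Security", "max-age=31536000; includeSubDomains"),
        # CSP: same-origin scripts and styles only, no plugins, no inline JS.
        ("Content-Security-Policy",
         "default-src 'self'; script-src 'self'; style-src 'self'; "
         "object-src 'none'"),
    ]

    def session_cookie(token):
        # Secure: never sent over cleartext HTTP.
        # HttpOnly: invisible to page JavaScript, so an XSS can't read it.
        return ("Set-Cookie", "session=%s; Secure; HttpOnly; Path=/" % token)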
Several people raised concerns about Wikimedia's certificate authority becoming compromised (whether by traditional "hacking", social engineering, or government coercion). The best available cure for this is called "certificate pinning", which is unfortunately only doable by talking to browser vendors right now; however, I imagine they would be happy to apply pins for Wikipedia. There's been some discussion of an HSTS extension that would apply a pin (http://tools.ietf.org/html/draft-evans-palmer-key-pinning-00) and it's also theoretically doable via DANE (http://tools.ietf.org/html/rfc6698); however, AFAIK no one implements either of these things yet, and I rate it moderately likely that DANE is broken-as-specified. DANE requires DNSSEC, which is worth implementing for its own sake (it appears that the wikipedia.org. and wikimedia.org. zones are not currently signed).
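For reference, the "pin" in that draft is a hash of the certificate's SubjectPublicKeyInfo, so it survives reissuing the certificate as long as the key pair stays the same. Here's a sketch of computing one in Python using the third-party "cryptography" package; the file name is a placeholder:

    # Compute the base64 SHA-256 SubjectPublicKeyInfo fingerprint of a cert,
    # i.e. the value a key-pinning policy would carry.
    import base64, hashlib
    from cryptography import x509
    from cryptography.hazmat.primitives import serialization

    def spki_pin(pem_path):
        with open(pem_path, "rb") as f:
            cert = x509.load_pem_x509_certificate(f.read())
        spki = cert.public_key().public_bytes(
            serialization.Encoding.DER,
            serialization.PublicFormat.SubjectPublicKeyInfo)
        return base64.b64encode(hashlib.sha256(spki).digest()).decode()

    print(spki_pin("wikipedia-org.pem"))   # placeholder path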
Perfect forward secrecy should also be considered at this stage. Folks seem to be confused about what PFS is good for. It is *complementary* to traffic analysis resistance, not useless in its absence. What it does is provide defense in depth against a server compromise by a well-heeled entity who has been logging traffic *contents*. If you don't have PFS and the server is compromised, *all* past traffic, going back potentially for years, is decryptable, including cleartext passwords and other equally valuable info. If you do have PFS, the exposure is limited to the session rollover interval. Browsers are fairly aggressively moving away from non-PFS ciphersuites (see https://briansmith.org/browser-ciphersuites-01.html; all of the non-"deprecated" suites are PFS).
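A quick way to see what a server actually negotiates with a modern client is to connect and look at the ciphersuite name; ephemeral (ECDHE or DHE) key exchange is what provides forward secrecy. A standard-library Python sketch, with the hostname just as an example:

    # Connect and report the negotiated ciphersuite; an ECDHE/DHE key
    # exchange means the session keys are ephemeral.
    import socket, ssl

    def negotiated_cipher(host, port=443):
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port)) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return tls.cipher()   # (name, protocol version, secret bits)

    name, version, bits = negotiated_cipher("en.wikipedia.org")
    print(name, version, bits)
    print("ephemeral (ECDHE/DHE) key exchange:",
          name.startswith(("ECDHE", "DHE")))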
Finally, consider paring back the set of ciphersuites accepted by your servers. Hopefully we will soon be able to ditch TLS 1.0 entirely (all of its ciphersuites have at least one serious flaw). Again, see https://briansmith.org/browser-ciphersuites-01.html for the current thinking from the browser side.
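If any part of the stack terminates TLS through OpenSSL, the pruning amounts to a restrictive cipher string plus a protocol floor. An illustrative Python sketch follows; the cert/key paths are placeholders, and the exact cipher string should come from whatever the browser folks are currently recommending, not from me:

    # Server-side sketch: accept only forward-secret AEAD suites and,
    # once client support allows, refuse the older protocol versions
    # whose suites are known-weak.
    import ssl

    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain("server.pem", "server.key")   # placeholder paths
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2      # raise the floor past TLS 1.0
    ctx.set_ciphers("ECDHE+AESGCM:!aNULL:!eNULL:!MD5:!3DES")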
zw