I first wrote a long response while I was still in the middle of
reading the various replies. Then I saw that the last few replies had
already touched on many important factors for very large scale sites,
and that Wikipedia wants to keep its geo-location & geo-feature based
separation(s), so I removed the paragraphs that had become unnecessary.
Most users have no experience with such large-scale or geo-location
based implementations, and neither do I.
On (comparatively very) small-scale sites, I ran into this problem:
in my first few small-scale implementations I did not pay attention
to rate-limiting techniques, and only realized their importance over
time.
So before adding DNSSEC first and then upgrading to DNSSEC+DANE
second, admins of sites with a large user base must work on DNS
rate-limiting early; various approaches should be tested and applied
first. Some kind of dynamic, elastic set of rate-limit ranges needs to
be implemented, based on previous data for daily or weekly loads. It
also has to be "adaptive" for the case where the same IP address (or
the same set of IP addresses) sends DNSSEC queries too frequently.
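To make the "elastic" and "adaptive" parts a bit more concrete, here
is a minimal Python sketch of a per-IP token bucket. It is only an
illustration under my own assumptions: the class name, the default
numbers, the use of the median of past samples, and the "strikes"
penalty are all hypothetical, and this is not how any particular DNS
server implements rate-limiting.

```python
import time
from collections import defaultdict


class AdaptiveRateLimiter:
    """Per-client token bucket whose refill rate follows recent load.

    Hypothetical sketch: names and default values are illustrative only.
    """

    def __init__(self, base_rate=20.0, burst=40.0, headroom=2.0, max_penalty=8):
        self.base_rate = base_rate        # minimum queries/second per client
        self.burst = burst                # bucket capacity (short bursts allowed)
        self.headroom = headroom          # multiplier over the typical observed load
        self.max_penalty = max_penalty    # cap on the slowdown for repeat offenders
        self.rate = base_rate
        self.buckets = {}                 # ip -> (tokens, time of last query)
        self.strikes = defaultdict(int)   # ip -> number of rejected queries

    def update_from_history(self, per_client_qps_samples):
        """Elastic part: derive the rate from previous daily/weekly samples."""
        if per_client_qps_samples:
            ordered = sorted(per_client_qps_samples)
            typical = ordered[len(ordered) // 2]   # median-like middle sample
            self.rate = max(self.base_rate, typical * self.headroom)

    def allow(self, ip, now=None):
        """Return True if this query from `ip` may be answered."""
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(ip, (self.burst, now))
        # Adaptive part: clients that were rejected before refill more slowly.
        penalty = 1 + min(self.strikes[ip], self.max_penalty)
        tokens = min(self.burst, tokens + (now - last) * self.rate / penalty)
        if tokens >= 1.0:
            self.buckets[ip] = (tokens - 1.0, now)
            return True
        self.buckets[ip] = (tokens, now)
        self.strikes[ip] += 1
        return False
```

A real deployment would also need to expire idle entries and forgive
old strikes over time; both are left out here to keep the sketch short.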
Once rate-limiting is better understood and discussed, and better
techniques are found and/or developed for very large scale sites,
implementing DNSSEC at large scale will be easier; otherwise it will
remain scary.
Since my implementations are much, much smaller, I was able to do
things in a different order: first I added DNSSEC, then DANE, then
applied rate-limiting techniques. Initially I set the ranges much too
low and observed that many initial queries from clients were failing,
so I had to increase them. For small-scale sites, that worked out
fine.
In some areas (northern EU countries), DNSSEC has already reached
more than 60% usage. Wikipedia's servers in those areas should be
tried first as initial test cases; that should help (I guess).
It is very unfortunate that here (USA) we are far behind in many
truly human-friendly techniques/products/systems.
-- Bright Star.
Received from Faidon Liambotis, on 2013-08-17 3:47 AM:
On Fri, Aug 16, 2013 at 08:04:24PM -0400, Zack Weinberg wrote:
Hi, I'm a grad student at CMU studying network security in general
and censorship / surveillance resistance in particular. I also used
to work for Mozilla, some of you may remember me in that capacity. My
friend Sumana Harihareswara asked me to comment on Wikimedia's plans
for hardening the encyclopedia against state surveillance.
<snip>
First of all, thanks for your input. It's much appreciated. As I'm sure
Sumanah has already mentioned, all of our infrastructure is being
developed in the open using free software, and we'd also be very happy to
accept contributions in code/infrastructure-as-code.
That being said, literally everything in your mail has already been
considered and discussed multiple times :), plus a few other things you didn't
mention (GCM ciphers, OCSP stapling, SNI & split certificates,
short-lived certificates, ECDSA certificates). A few have been
discussed on wikitech, others are under internal discussion &
investigation by some of us with findings to be posted here too when we
have something concrete.
I don't mean this to sound rude, but I think you may be oversimplifying
the situation quite a bit.
Enabling HTTPS for everyone on a website of our scale isn't a trivial
thing to do. Besides matters of policy -blocking Wikipedia in China isn't
something that can be done lightly- there are significant technical
restrictions. Just to give a few examples: there is no software that
can both do SPDY and take as input the key for encrypting SSL
session tokens, something that's needed if you have a cluster of
load-balancers (you also need to rotate it periodically; a lot of people
miss this). There is no software out there that supports both a
shared SSL session cache and SPDY[1]. etc.
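To illustrate why the session token key has to be both shared and
rotated, here is a minimal Python sketch of a key ring that every
load-balancer in a cluster could hold. It is a sketch under stated
assumptions, not the setup described in this mail: the class and method
names are hypothetical, no real encryption is performed, and the idea
of looking a key up by name follows the ticket format recommended in
RFC 5077.

```python
import os
from collections import deque


class TicketKeyRing:
    """Newest key issues tickets; a few older keys still allow resumption.

    Hypothetical sketch: it only manages key material, it does no crypto.
    """

    def __init__(self, keep=3):
        self.keys = deque(maxlen=keep)   # newest key first, oldest last
        self.rotate()

    def rotate(self, name=None, secret=None):
        """Install a new key; once the ring is full the oldest key is dropped.

        In a cluster, `name` and `secret` would come from a shared, protected
        source so that every load-balancer installs the same key.
        """
        name = os.urandom(16) if name is None else name
        secret = os.urandom(32) if secret is None else secret
        self.keys.appendleft((name, secret))

    def issuing_key(self):
        """Return the (name, secret) pair used for newly issued tickets."""
        return self.keys[0]

    def key_for(self, name):
        """Return the secret for a ticket's key name, or None if it was rotated out."""
        for key_name, secret in self.keys:
            if key_name == name:
                return secret
        return None
```

If two load-balancers install the same (name, secret) pair, a ticket
issued by one can be opened by the other; once a key has been rotated
out, tickets under it simply fall back to a full handshake.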
DANE is great and everything but there's no software available that
satisfies our GeoDNS requirements *and* supports DNSSEC. I know, I've
tried them all. Traditional DNSSEC signing proxies (e.g. OpenDNSSEC)
don't work at all with GeoDNS. (FWIW, we're switching to gdnsd which has
a unique set of characteristics and whose author Brandon Black was hired
in the ops team shortly after we decided to switch to gdnsd.)
Plus, DNSSEC has only a marginal adoption client-side (and DANE has none
yet) and has important side effects. For example, you're more likely to
be used as a source for DNS amplification attacks as your responses get
larger. More importantly though, you're breaking users, something that
needs to be carefully weighed against the benefits.
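The amplification point comes down to simple arithmetic: the attacker
sends a small spoofed query and the victim receives the larger
response. A minimal Python sketch follows; the byte counts in it are
hypothetical values picked only for illustration, not measurements of
any real zone.

```python
def amplification_factor(query_bytes: int, response_bytes: int) -> float:
    """Bytes reflected at the victim per byte the attacker sends."""
    return response_bytes / query_bytes


# Hypothetical sizes, for illustration only.
QUERY_BYTES = 50
UNSIGNED_RESPONSE_BYTES = 200
SIGNED_RESPONSE_BYTES = 1500   # a signed response carries extra signature records

print(amplification_factor(QUERY_BYTES, UNSIGNED_RESPONSE_BYTES))  # 4.0
print(amplification_factor(QUERY_BYTES, SIGNED_RESPONSE_BYTES))    # 30.0
```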
If you need numbers, this is from a paper from USENIX Security '13 last
week titled "Measuring the Practical Impact of DNSSEC Deployment"[2]:
"And we show, for the first time, that enabling DNSSEC measurably
increases end-to-end resolution failures. For every 10 clients that are
protected from DNS tampering when a domain deploys DNSSEC, approximately
one ordinary client (primarily in Asia) becomes unable to access the
domain."
Is dedicating (finite) engineering time to write the necessary code for
e.g. gdnsd to support DNSSEC, just to be able to support DANE for which
there's exactly ZERO browser support, while at the same time breaking a
significant chunk of users, a sensible thing to do?
We'll keep wikitech -and blog, where appropriate- up to date with our
plans as these evolve. In the meantime, feel free to dive into our puppet
repository and see our setup and make your suggestions :)
Best,
Faidon
(wmf ops)
[1]: stud does SSL well (but not SPDY), but does not pass
X-Forwarded-For, and Varnish doesn't support the PROXY protocol that
stud provides, so either of the two would need to be coded (and we've
already explored what it'd take to code it). nginx scales up and has
some support for SPDY but doesn't have a shared-across-systems session
cache or session token key rotation support, nor does it support ECDSA.
Apache 2.4 has all that, but we're not sure of its performance
characteristics yet, plus mod-spdy won't cut it for us. etc.
[2]: https://www.usenix.org/conference/usenixsecurity13/measuring-practical-impa…