On Fri, Aug 16, 2013 at 8:04 PM, Zack Weinberg zackw@cmu.edu wrote:
Wikipedia user handle. I realize how disruptive this would be, but I think we need to consider changing the canonical Wikipedia URL format to https://wikipedia.org/LANGUAGE/PAGENAME .
Note that LANGUAGE.wikipedia.org/VARIANT/PAGENAME is already in use for wikis which use the language variant conversion code, such as zhwiki. Usually LANGUAGE is a prefix of VARIANT, for example zh-hans, zh-hant, en-us, en-gb, sr, sr-ec.
If we wanted to approach this goal, we could start by creating a proxy service at https://secure.wikipedia.org/LANGUAGE-VARIANT/PAGENAME that did an internal proxy of pages from https://LANGUAGE.wikipedia.org/LANGUAGE-VARIANT/PAGENAME. That would allow some low-risk being-bold exploration of the different implications.
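To make the proxy idea concrete, here is a minimal sketch of the URL mapping such a service would perform. This is purely illustrative: the function name and the assumption that the variant prefix (everything before the first hyphen) identifies the backend wiki are mine, not anything specified in this thread.

```python
from urllib.parse import quote

def map_to_backend(path):
    """Hypothetical mapping from a secure.wikipedia.org path of the form
    /LANGUAGE-VARIANT/PAGENAME (or plain /LANGUAGE/PAGENAME) to the
    corresponding per-language backend URL.

    Assumes the language code is the prefix of the variant, e.g.
    zh-hans -> zhwiki, as described above."""
    parts = path.lstrip("/").split("/", 1)
    if len(parts) != 2:
        raise ValueError("expected /LANGUAGE/PAGENAME")
    variant, pagename = parts
    language = variant.split("-", 1)[0]  # zh-hans -> zh, en -> en
    return "https://%s.wikipedia.org/%s/%s" % (language, variant, quote(pagename))
```

For example, `map_to_backend("/zh-hans/Foo")` would proxy from https://zh.wikipedia.org/zh-hans/Foo. A real implementation would also have to handle paths that don't fit this pattern (API endpoints, static assets, and so on).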
This last article raises a critical point. To render Wikipedia genuinely secure against traffic analysis
Whenever someone seems to veer into discussion of absolute security, I get nervous. It would be best to begin by asking "how can we make attacks more expensive?"
Given the contents of the most recent NSA document leaks, it also seems worthwhile to attempt to confound the "are we at least 51% certain that this user is not an American" question. Combining wikis does seem like a worthwhile step here. I wonder whether any arbitrary user of zhwiki (for example) would automatically be assumed to have a >51% chance of being non-American.
Random padding, in fact, is no good at all. The adversary can simply
average over many pageloads and extract the true length.
Again, "no good at all" slides into this "absolute security" fallacy. *How much more difficult* does padding make things? *How many* more pageloads? An adversary with infinite resources can also legally compel the sysop to compromise the server. But can we improve the situation for medium-sized state actors, or raise the bar so that only targeted users can be compromised (instead of passively collecting information on all users)?
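To put rough numbers on "how many more pageloads": here is a toy simulation of the averaging attack, assuming uniform random padding of up to pad_max bytes per response (the parameter names and the uniform-padding model are my assumptions for illustration). The attack does work, as the quoted text says, but the number of observations the adversary needs to pin down the true length grows roughly quadratically with the padding range.

```python
import random
import statistics

def samples_needed(pad_max, tolerance):
    """Approximate number of pageloads needed for the standard error of
    the mean to fall below `tolerance` bytes, given uniform padding in
    [0, pad_max). Stdev of that uniform distribution is pad_max/sqrt(12)."""
    sigma = pad_max / 12 ** 0.5
    return int((sigma / tolerance) ** 2) + 1

def estimate_length(true_len, pad_max, n, rng):
    """The averaging attack: observe n padded pageloads, then subtract
    the (known) mean padding to recover an estimate of the true length."""
    observations = [true_len + rng.randrange(pad_max) for _ in range(n)]
    return statistics.mean(observations) - (pad_max - 1) / 2

rng = random.Random(1)
n = samples_needed(1024, 2.0)          # tens of thousands of pageloads
estimate = estimate_length(40000, 1024, n, rng)
```

Doubling pad_max roughly quadruples the required pageloads, so padding doesn't defeat the attack, but it does set a concrete price for it, which is exactly the "how much more expensive" framing above.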
As a start on constructing a better threat model, let me offer two scenarios:
a) NSA passive collection of all traffic to/from Wikipedia (XKEYSCORE). It would be nice to frustrate this so that (as a start) only traffic from targeted users could be effectively collected -- for example, by requiring an active man-in-the-middle (MITM) attack instead of a passive tap.
b) Great Firewall monitoring of specific pages (Tiananmen Square, Falun Gong). Can we better protect the identities of readers of these pages? Can we protect the identities of editors? Can we frustrate attempts to block specific pages?
Real-world issues should also be taken into account. Methods that prevent the Great Firewall from blocking specific pages might provoke a site-wide block. Efforts to force use of the latest browsers (which support some new protocol) might disenfranchise mobile users, or users for whom poverty and resource limitations are a bigger threat than a coercive government. Etc... --scott