Hi Zack,
On Thu, Jun 05, 2014 at 12:45:11PM -0400, Zack Weinberg wrote:
> I'd like to restart the conversation about hardening Wikipedia (or possibly Wikimedia in general) against traffic analysis. I brought this up ... last November, I think, give or take a month? but it got lost in a larger discussion about HTTPS.
This sounds like a great idea to me; thanks for thinking about it and sharing it. Privacy of people's reading habits is critical, and the more we can do to ensure it, the better.
> With that data in hand, the next phase would be to develop some sort of algorithm for automatically padding HTTP responses to maximize eavesdropper confusion while minimizing overhead. I don't yet know exactly how this would work. I imagine that it would be based on clustering the database into sets of pages with similar length but radically different contents. The output of this would be some combination of changes to MediaWiki core (for instance, to ensure that the overall length of the HTTP response headers does not change when one logs in) and an extension module that actually performs the bulk of the padding. I am not at all a PHP developer, so I would need help from someone who is with this part.
I'm not a big PHP developer, but given the right project I can be enticed into doing some, and I'd be very happy to help out with this. Ensuring that any changes don't add undue complexity would be very important, but that should be doable.
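To make the padding idea concrete, here's a rough sketch (in Python rather than PHP, purely for illustration; all names are made up, nothing here is MediaWiki code). One simple scheme is to round every response length up to the next boundary in a geometric sequence of "buckets", so that many distinct pages share the same observable size and the per-response overhead is bounded by the growth factor. An actual design would presumably derive bucket boundaries from the clustering you describe rather than a fixed sequence:

```python
# Hypothetical length-bucket padding sketch. Each response is padded up to
# the next bucket boundary; boundaries grow geometrically, so the padding
# overhead per response is bounded by roughly (factor - 1).

def bucket_boundaries(max_size, factor=1.1, start=1024):
    """Return geometric bucket boundaries covering sizes up to max_size."""
    bounds = [start]
    while bounds[-1] < max_size:
        # +1 guarantees strict growth even when factor rounds down
        bounds.append(int(bounds[-1] * factor) + 1)
    return bounds

def padded_size(size, bounds):
    """Smallest bucket boundary >= size: the target length after padding."""
    for b in bounds:
        if b >= size:
            return b
    return bounds[-1]

if __name__ == "__main__":
    sizes = [1500, 4200, 4300, 90000]  # made-up page sizes in bytes
    bounds = bucket_boundaries(max(sizes))
    for s in sizes:
        target = padded_size(s, bounds)
        print(s, "->", target, "(+%d bytes)" % (target - s))
```

The trade-off lives in the growth factor: a smaller factor means less padding overhead but fewer pages per bucket (less confusion for the eavesdropper), which is exactly the overhead-vs-confusion tension you mention.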
As was mentioned, external resources like variously sized images would probably be the trickiest thing to find good ways around. IIRC SPDY can bundle multiple resources into the same stream (server push and multiplexing), which we might be able to take advantage of here (it's been ages since I read about it, though).
Nick