On 17/08/13 10:04, Zack Weinberg wrote:
> What's actually needed is to *bin* page (+resource) sizes such that any given load could be a substantial number of different pages.
That would certainly help for some sites, but I don't think it would make much difference for Wikipedia. Even if you used power-of-two bins, you would still be leaking tens of bits per page view due to the sizes and request order of the images on each page.
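
To make that concrete, here is a rough Python sketch (illustrative only: the image sizes and the figure of ~20 plausible bins per image are made-up numbers) of why power-of-two binning still leaves a distinctive per-page fingerprint:

import math

def pow2_bin(size: int) -> int:
    # Round a transfer size up to the next power of two (the proposed binning).
    return 1 << (size - 1).bit_length() if size > 1 else 1

# Hypothetical image transfer sizes in bytes for two articles, in request order.
page_a = [18_342, 2_113, 97_500, 4_096, 51_200]
page_b = [18_342, 2_113, 64_001, 4_100, 52_000]

print([pow2_bin(s) for s in page_a])  # [32768, 4096, 131072, 4096, 65536]
print([pow2_bin(s) for s in page_b])  # [32768, 4096, 65536, 8192, 65536]

# If roughly 20 bins are plausible per image, each image still leaks about
# log2(20) ~ 4.3 bits, and preserving request order multiplies the number of
# distinguishable sequences, so a page with a handful of images leaks tens of bits.
print(f"~{5 * math.log2(20):.0f} bits for 5 binned images")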
It would mean that instead of just crawling the HTML, the attacker would also have to send a HEAD request for each image. Maybe it would add 50% to the attacker's software development time, but once it was done for one site, it would work against any similar site.
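
In outline, the attacker's side could look something like this (a rough Python sketch; fingerprint() and article_urls are names I've made up, and I'm assuming the Content-Length from a HEAD response approximates the transfer size an eavesdropper would see). The HEAD pass per image is the only extra work on top of the HTML crawl:

import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImgCollector(HTMLParser):
    # Collect <img src> URLs in document order while parsing the crawled HTML.
    def __init__(self):
        super().__init__()
        self.srcs = []
    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.srcs.append(src)

def pow2_bin(size: int) -> int:
    return 1 << (size - 1).bit_length() if size > 1 else 1

def fingerprint(page_url: str) -> list:
    # One extra pass per page: parse the HTML, then HEAD each image
    # to record its (binned) size in request order.
    with urllib.request.urlopen(page_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    collector = ImgCollector()
    collector.feed(html)
    sizes = []
    for src in collector.srcs:
        req = urllib.request.Request(urljoin(page_url, src), method="HEAD")
        with urllib.request.urlopen(req) as head:
            length = head.headers.get("Content-Length")
            if length is not None:
                sizes.append(pow2_bin(int(length)))
    return sizes

# Building the dictionary is then one crawl over the target site:
# db = {url: fingerprint(url) for url in article_urls}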
-- Tim Starling