Introducing my own working theory here; ignore it if you wish.
I'd think that the *first* thing that would have to happen is that the page and the images it contains would have to be delivered in one stream. There are both HTML5 (resource bundling) and protocol (SPDY) mechanisms for doing this. Some URL rearrangement to consolidate hosts might be required as well. But in the interest of being forward-looking I'd suggest taking some implementation of resource bundling as a given. Similarly, I wouldn't waste time profiling both compressed and uncompressed data; assume the bundle is compressed.
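For concreteness, here's a rough Python sketch of that assumption (the resources and sizes are made up, and this is not any real HTML5 or SPDY bundling format): all of a page's resources get concatenated and compressed into a single stream, so the only thing an on-path observer sees is one total response size.

```python
# Minimal sketch of the bundling-plus-compression assumption (hypothetical
# resources, not a real bundling format): the observable quantity is just the
# length of one compressed blob per page.
import zlib

def bundle_size(resources):
    """Return the on-the-wire size of a compressed bundle of named resources."""
    blob = b"".join(name.encode() + b"\0" + body for name, body in resources.items())
    return len(zlib.compress(blob, level=9))

page = {
    "index.html": b"<html>" + b"lorem ipsum " * 400 + b"</html>",
    "logo.png":   bytes(range(256)) * 50,  # stand-in for an already-compressed image
}
print("observable response size:", bundle_size(page), "bytes")
```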
That leaves a simpler question: if the contents of a page (including images) are delivered as a compressed bundle over a single host connection, how easy is it to tell which page I'm looking at from the size of the response? How much padding would need to be added to bundles to foil the analysis?
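One crude way to frame that question (with made-up bundle sizes, purely for illustration): pad every bundle up to a multiple of some bucket size, then count how many pages are still uniquely identifiable by their padded size, and what the padding costs in extra bytes.

```python
# Hypothetical per-page compressed bundle sizes (bytes); not measured data.
from collections import Counter

def pad_to(size, bucket):
    """Pad a bundle size up to the next multiple of `bucket` bytes."""
    return ((size + bucket - 1) // bucket) * bucket

bundle_sizes = [48_213, 51_007, 48_213, 120_554, 119_872, 48_990, 250_311]

for bucket in (1, 4_096, 65_536):
    padded = [pad_to(s, bucket) for s in bundle_sizes]
    counts = Counter(padded)
    unique = sum(1 for s in padded if counts[s] == 1)  # pages identifiable by size alone
    overhead = sum(p - s for p, s in zip(padded, bundle_sizes)) / sum(bundle_sizes)
    print(f"bucket={bucket:>6}: {unique}/{len(padded)} pages unique, "
          f"{overhead:.1%} padding overhead")
```

The real study would run this over the actual corpus of page bundles rather than a toy list, but the trade-off it measures (anonymity-set size versus padding overhead) is the quantity of interest.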
Some extensions:

1) Assume that it's not just a single page at a time, but a sequence of N linked pages. How does the required padding depend on N? (See the sketch after this list.)

2) Can correlations be made? For example, once I've visited page A, which contained image X, visiting page B, which also has image X, will not require X to be re-downloaded. So can I distinguish certain pairs of "long response/short response"?

3) What if the attacker is not interested in specific pages, but instead in enumerating users looking at a *specific* page (or set of pages)? That is, instead of looking at averages across all pages, look at the "unusual" outliers.

4) Active targeted attack: assume that the attacker can modify pages or images referenced by pages. This makes it easier to tell when specific pages are accessed. How can this be prevented/discouraged?

5) As you mentioned, disclosing user identity from read-only traffic patterns is also a concern.
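A rough sketch of extension (1), again with made-up numbers: treat a browsing session as a sequence of N observed padded sizes and count how many possible length-N paths over the site produce the same size sequence. If each page collides with only a handful of others at a given bucket size, the anonymity set shrinks multiplicatively with N, which suggests the padding has to get coarser as N grows.

```python
# Hypothetical site: page name -> compressed bundle size in bytes.
from itertools import product

def pad_to(size, bucket):
    return ((size + bucket - 1) // bucket) * bucket

pages = {"A": 48_213, "B": 48_990, "C": 51_007, "D": 119_872, "E": 120_554}
bucket = 4_096
observed_path = ["A", "C", "D"]  # the sequence the eavesdropper actually saw
signature = tuple(pad_to(pages[p], bucket) for p in observed_path)

# Count every length-N path over the site that yields the same padded-size sequence.
candidates = [path for path in product(pages, repeat=len(observed_path))
              if tuple(pad_to(pages[p], bucket) for p in path) == signature]
print(f"{len(candidates)} of {len(pages) ** len(observed_path)} possible paths "
      f"match the observed size sequence")
```

On a real site the candidate set would also be restricted to paths that follow actual links, which shrinks it further and makes the attacker's job easier still.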
I'm afraid the answers will be that traffic analysis is still very easy. But it would still be useful to know *how* easy. --scott