On Mon, Sep 8, 2008 at 6:17 PM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote:
On Mon, Sep 8, 2008 at 2:33 PM, Brion Vibber <brion@wikimedia.org> wrote:
Note that while loading of images over HTTP may reveal viewed pages (via referers, just like clicking on an external link will), it won't reveal passwords or session cookies.
According to RFC 2616 (section 15.1.3), clients SHOULD NOT send a Referer header when the referring page was retrieved over a secure protocol, and AFAIK browsers do implement that. However, you could still probably work out what pages the person is viewing just by looking at which images are being loaded, in many cases.
I suspect that there are fairly few pages with images that can't be identified that way… and the discussion started with Commons, so…
On Mon, Sep 8, 2008 at 3:04 PM, Gregory Maxwell <gmaxwell@gmail.com> wrote:
On this subject: as part of the IPv6 testing I've run a JS tester on ENWP for a couple of months now, and it has determined that protocol-relative URLs (e.g. <img src="//upload.wikimedia.org/foo.jpg"/>) work for every client able to run the tester.
If protocol-relative URLs turn out to be universally supported, they would remove one problem from doing a native SSL deployment.
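For the curious, the probe amounts to something like the following -- a sketch only, not the code that actually ran on ENWP; the test image and the reporting endpoint are made up:

  // Rough reconstruction of the kind of probe described above.
  function report(result) {
    // Report via an absolute URL on purpose, so the result comes back even when
    // the protocol-relative load fails. This endpoint is hypothetical.
    (new Image()).src = 'http://example.org/pr-url-test?result=' + result;
  }

  var img = new Image();
  var timer = setTimeout(function () { report('timeout'); }, 15000);
  img.onload  = function () { clearTimeout(timer); report('ok'); };
  img.onerror = function () { clearTimeout(timer); report('error'); };

  // A protocol-relative src resolves against the scheme of the page itself,
  // so the same markup works from both http:// and https:// pages.
  img.src = '//upload.wikimedia.org/test-1px.png?r=' + Math.random();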
Why would one suspect that they're not universally supported?
Because it's exceptionally rare, and most people whom I would otherwise expect to know about it are surprised that it both works and is part of the relevant standards.
On Mon, Sep 8, 2008 at 6:35 PM, Brion Vibber <brion@wikimedia.org> wrote:
It's the kind of weird, uncommon thing that just screams "I'm a corner case that lots of people probably didn't bother to implement because I'm not in common use in the wild". :)
Even if browsers support it, I would expect to see a lot of bots and spiders choke on it -- it's bad enough that a lot don't understand that "&amp;" in an <a href="..."> needs to be decoded as "&"... :)
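To spell that out with a made-up link: the href value is attribute text, so entities in it have to be decoded before the URL is requested. A crude sketch:

  // The link as it appears in the page source (made-up URL):
  //   <a href="/w/index.php?title=Foo&amp;action=history">
  var rawHref = '/w/index.php?title=Foo&amp;action=history';

  // What actually has to be requested. Only &amp; is handled here for brevity;
  // a real consumer has to decode all character references.
  var url = rawHref.replace(/&amp;/g, '&');
  // url === '/w/index.php?title=Foo&action=history'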
Right. Exactly. Although I'd only expect Wikimedia to need to use it for the few things that demand full URLs these days: images and cross-project links. Breaking some spiders with respect to those would probably not be the end of the world.
I do not know, however, whether the damage would be limited to spiders. The modern JS-able browsers all appear fine, and the usual non-JS suspects that I was able to test were fine. But there are a lot of clients out there that I am not able to test.
In any case... protocol-relative URLs are not necessary for HTTPS on the regular domains, but they would remove the need to make separately parsed copies of pages.
With protocol-relative URLs, native HTTPS support requires solving:
1) Wildcard SSL certificates.
2) A dumb SSL front-ending proxy to do the crypto (a rough sketch follows below).
3) Either making the load balancer highly IP-sticky *or* setting up software for distributing the SSL session cache (e.g. http://distcache.sourceforge.net/).
Without protocol relatives, you have the above issues plus:
4) Running an additional set of backends that parse pages in a way that uses the HTTPS form for all Wikimedia links.
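As a rough illustration of item 2 above: the front end really can be dumb -- terminate the TLS and hand the plaintext request, unchanged, to the existing HTTP infrastructure. A sketch in Node-style JavaScript, with placeholder certificate paths and backend address, and with the session-cache distribution problem from item 3 ignored entirely:

  // Sketch only: a "dumb" TLS-terminating proxy that forwards to a plain-HTTP backend.
  var https = require('https');
  var http  = require('http');
  var fs    = require('fs');

  var options = {
    key:  fs.readFileSync('/etc/ssl/private/wildcard.example.key'),  // placeholder path
    cert: fs.readFileSync('/etc/ssl/certs/wildcard.example.pem')     // placeholder path
  };

  https.createServer(options, function (clientReq, clientRes) {
    // Forward the decrypted request, headers and all, to the existing HTTP frontend.
    var backendReq = http.request({
      host: '10.0.0.1',            // placeholder: the existing HTTP load balancer
      port: 80,
      method: clientReq.method,
      path: clientReq.url,
      headers: clientReq.headers
    }, function (backendRes) {
      clientRes.writeHead(backendRes.statusCode, backendRes.headers);
      backendRes.pipe(clientRes);
    });
    clientReq.pipe(backendReq);
  }).listen(443);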