Hi all,
as part of the ongoing project for collecting javascript errors [0] we have been collecting metrics about CORS enabled script loading support; that project is finished now, here is a short report. After the recent move to load everything from the same domain [1] enabling CORS is not needed anymore, but I figured the numbers could still be interesting.
tl;dr version: enabling CORS would probably cause problems in about 0.1% of our script loads.
== What is CORS enabled script loading? ==
Most people probably know CORS or Cross-Origin Resource Sharing [2][3] as a way of sending AJAX requests to a different domain. You send a normal AJAX request, the browser detects that it is going to a different domain than the website you are on (which could be used for CSRF [4] attacks) and refuses to return the results of the request unless the target server permits it by setting certain HTTP headers.
Actually, CORS - as defined by the WHATWG Fetch standard [5] - is more generic than that: it is a protocol on top of HTTP that can be used to add extra permissions to any kind of request. One way browsers make use of that is to provide a "crossorigin" HTML attribute [6], which can be set on certain elements to get more information about them.
Specifically, using <script crossorigin="anonymous" src="..."></script> instead of just <script src="..."></script> will mean that the browser uses a CORS request to fetch the script if it is on a different domain, and certain restrictions on error information will be lifted.
== Why did we care about it? ==
Browsers provide information about Javascript errors via the onerror event, which has various interesting uses. Unfortunately, this information is not available when the error happens in a script that is loaded from a different domain; we get a nondescript "Script error. line 0" instead. Fortunately, that limitation can be lifted in modern browsers by fetching the script via CORS. Unfortunately, the CORS specification requires browsers to treat CORS authorization errors as network errors. In other words, if one requests a script with crossorigin="anonymous" and the response does not have the right CORS headers set, the browser will treat that as a 404 error and not run the script at all.
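To make the difference concrete, here is a minimal sketch of an onerror handler that tells opaque cross-origin reports apart from detailed ones. The helper name isOpaqueScriptError is made up for illustration, and the check assumes that the browser reports such errors with a "Script error." message, an empty source and line 0; exact values can vary between browsers.

```javascript
// Hypothetical helper (not from any existing codebase): returns true when an
// onerror report looks like the stripped-down cross-origin variant.
// Assumption: such reports carry "Script error.", no source URL and line 0.
function isOpaqueScriptError(message, source, lineno) {
  const text = String(message || '').trim().toLowerCase();
  return text.startsWith('script error') && !source && !lineno;
}

// Only attach the handler when running in a browser.
if (typeof window !== 'undefined') {
  window.onerror = function (message, source, lineno, colno, error) {
    if (isOpaqueScriptError(message, source, lineno)) {
      console.log('opaque cross-origin error, no details available');
    } else {
      console.log('error at ' + source + ':' + lineno + ':' + colno, error);
    }
    return false; // do not suppress the browser's default error handling
  };
}
```

With a script loaded via crossorigin="anonymous" and correct CORS headers, the second branch would be expected to run and receive the real message, file and line.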
That does not sound like a big deal since we are loading most Javascript files from our own servers, and can fully control what headers are set, but we ran into occasional problems in the past when using CORS (MediaViewer uses CORS-enabled image loading to get access to certain performance statistics): some people use proxies or firewalls which strip CORS headers from the responses as some sort of misguided security effort, causing the request to fail. We wanted to know how many users would be affected by this if we loaded ResourceLoader scripts via CORS.
== What did we find? ==
For the last few months, we ran the measurements on 1 in every 1000 pageloads, by downloading two small script files, one with and one without CORS. CORS loading failed but normal loading succeeded in 0.16% of the requests. (Only browsers which were feature-detected to support CORS were counted.) In 0.12% both failed, and in 0.03% the CORS loading succeeded and the normal loading failed. (Exact queries can be found in [7], the logging code in [8] and the data in the ImageMetricsCorsSupport schema [9].)
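The measurement described above could be sketched roughly as follows. This is not the actual logging code from [8]; the function names, the outcome labels and the URLs passed in are all placeholders, and the feature detection and event logging steps are left out.

```javascript
// Hypothetical sketch; names and labels are illustrative assumptions.

// Sample roughly one pageload in `oneIn`.
function shouldSample(oneIn, random = Math.random) {
  return random() < 1 / oneIn;
}

// Turn the two load results into a single outcome label.
function classifyOutcome(corsLoaded, normalLoaded) {
  if (corsLoaded && normalLoaded) return 'both-ok';
  if (!corsLoaded && !normalLoaded) return 'both-failed';
  return corsLoaded ? 'normal-only-failed' : 'cors-only-failed';
}

// Browser-only: load a script, resolving to true on load and false on error.
function loadScript(url, useCors) {
  return new Promise(function (resolve) {
    const el = document.createElement('script');
    if (useCors) el.crossOrigin = 'anonymous';
    el.onload = function () { resolve(true); };
    el.onerror = function () { resolve(false); };
    el.src = url;
    document.head.appendChild(el);
  });
}

// Browser-only: fetch both test scripts and classify the result.
async function measure(corsUrl, normalUrl) {
  const [corsLoaded, normalLoaded] = await Promise.all([
    loadScript(corsUrl, true),
    loadScript(normalUrl, false),
  ]);
  return classifyOutcome(corsLoaded, normalLoaded);
}
```

A caller would run measure() only when shouldSample(1000) returns true, and then log the returned label.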
So it seems that there is about 0.15% failure ratio for any script load (due to shaky connections and other network errors, probably), and enabling CORS for script loading would roughly double that failure ratio.
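The arithmetic behind that estimate can be checked with a few lines. This assumes the 0.03% figure covers loads where only the non-CORS request failed, an interpretation chosen because it makes the baseline add up to the roughly 0.15% quoted above; the variable names are illustrative.

```javascript
// Percentages of sampled requests, as reported above.
const corsOnlyFailed = 0.16;   // CORS load failed, normal load succeeded
const bothFailed = 0.12;       // both loads failed
const normalOnlyFailed = 0.03; // assumption: only the normal load failed

// Failure rate of a plain script load, and of a CORS-enabled one.
const normalFailureRate = bothFailed + normalOnlyFailed;
const corsFailureRate = bothFailed + corsOnlyFailed;
const ratio = corsFailureRate / normalFailureRate;

console.log(normalFailureRate.toFixed(2), corsFailureRate.toFixed(2), ratio.toFixed(2));
// prints "0.15 0.28 1.87"
```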
The numbers were reasonably stable over time; there was some geographical variance, with China leading with a 1% CORS failure rate, and all other countries in the 0-0.5% range.
[0] see https://www.mediawiki.org/wiki/Requests_for_comment/Server-side_Javascript_e... and https://phabricator.wikimedia.org/project/profile/976/ if you are interested in that.
[1] https://phabricator.wikimedia.org/T95448
[2] https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS
[3] http://www.w3.org/TR/cors/
[4] https://en.wikipedia.org/wiki/Cross-site_request_forgery
[5] https://fetch.spec.whatwg.org/#http-cors-protocol
[6] https://html.spec.whatwg.org/multipage/infrastructure.html#cors-settings-att...
[7] https://phabricator.wikimedia.org/T507
[8] https://gerrit.wikimedia.org/r/#/c/230982/
[9] https://meta.wikimedia.org/wiki/Schema:ImageMetricsCorsSupport
On Sat, Aug 15, 2015 at 01:00:38AM -0700, Gergo Tisza wrote:
> That does not sound like a big deal since we are loading most Javascript files from our own servers, and can fully control what headers are set, but we ran into occasional problems in the past when using CORS (MediaViewer uses CORS-enabled image loading to get access to certain performance statistics): some people use proxies or firewalls which strip CORS headers from the responses as some sort of misguided security effort, causing the request to fail. We wanted to know how many users would be affected by this if we loaded ResourceLoader scripts via CORS.
For Wikimedia sites, it is now impossible for proxies or firewalls to strip headers, after the switch to HTTPS-only. Was this analysis done before or during the HTTPS-only migration?
Faidon
On Sun, Aug 16, 2015 at 5:20 AM, Faidon Liambotis faidon@wikimedia.org wrote:
> For Wikimedia sites, it is now impossible for proxies or firewalls to strip headers, after the switch to HTTPS-only. Was this analysis done before or during the HTTPS-only migration?
The data the 0.1% number is based on was collected from mid-April to this week. There is a chronological breakdown at https://phabricator.wikimedia.org/T507#1530596.
Some firewalls add themselves as root CA and then do a man-in-the-middle on HTTPS connections; AFAIK we don't do certificate pinning yet.
wikitech-l@lists.wikimedia.org