On Wed, Sep 16, 2015 at 2:18 PM, Jon Robson jrobson@wikimedia.org wrote:
Sam asked me to write up my recent adventures with ServiceWorkers and making requests for MediaWiki content super super fast so all our lovely users can access information quicker. Right now we're trying to make the mobile site ridiculously fast by using new shiny standard web technologies.
The biggest issue we have on the mobile site right now is that we ship a lot of content - HTML and images - the same content we ship on desktop. On desktop it's not really a problem from a performance perspective, but it may be an issue from a download perspective if you have some kind of data limit on your broadband and you're addicted to Wikipedia.
The problem, however, is that on mobile, connection speeds are not quite up to desktop standards. To take an example, the Barack Obama article contains 102 image tags and 186KB of HTML, resulting in about 1MB of content. If you're on your mobile phone just to look up his place of birth (which is in the lead section) or to see the county results of the 2004 U.S. Senate race in Illinois [1], that's a lot of unnecessary stuff you are forced to load. You have to load all the images and all the text! Ouch!
Gilles D said a while back [2] "The Barack Obama article might be a bit of an extreme example due to its length, but in that case the API data needed for section 0's text + the list of sections is almost 30 times smaller than the data needed for all sections's text (5.9kb gzipped versus 173.8kb gzipped)."
Somewhat related, some experimenting with webpagetest.org has suggested that disabling images on this page significantly improves first paint (we believe the slowdown is due to too many simultaneous connections) [3,4].
Given that ServiceWorker is here (in Chrome first [5], but hopefully other browsers soon) I wrote a small proof of concept that lazy loads images, to expose myself to this promising technology.
For those interested I've documented my idea here: https://en.m.wikipedia.org/wiki/User:Jdlrobson/lazyloader but basically what it does is:
- Intercepts network requests for HTML.
- Rewrites the src and srcset attributes to data-src and data-srcset attributes.
- Uses JavaScript to lazy load images when they appear on screen.
- Without JS the ServiceWorker doesn't run, so the web remains unbroken.
(As Jake Archibald points out, though, there are downsides to this approach [6].)
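For the curious, the core of the idea can be sketched roughly like this. This is my own simplification, not the actual user script - the helper name and the Accept-header check are assumptions:

```javascript
// Rewrite src/srcset on <img> tags to data-src/data-srcset so the browser
// defers fetching the images until our lazy-loading JS swaps them back.
function deferImages(html) {
  return html.replace(/<img\b[^>]*>/g, function (tag) {
    return tag
      .replace(/\bsrcset=/g, 'data-srcset=')  // do srcset first so the plain
      .replace(/\bsrc=/g, 'data-src=');       // src= replace can't touch it
  });
}

// In the ServiceWorker: intercept HTML requests and rewrite the response body.
// (Guarded so the snippet is inert outside a worker context.)
if (typeof self !== 'undefined' && typeof window === 'undefined') {
  self.addEventListener('fetch', function (event) {
    var accept = event.request.headers.get('Accept') || '';
    if (accept.indexOf('text/html') !== -1) {
      event.respondWith(
        fetch(event.request)
          .then(function (res) { return res.text(); })
          .then(function (body) {
            return new Response(deferImages(body), {
              headers: { 'Content-Type': 'text/html' }
            });
          })
      );
    }
  });
}
```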
It doesn't quite work as a user script due to how scope works in ServiceWorkers, but if we want to use this in production we can use a header [7] to allow a scope of '/'. So there's no real blocker to deploying it, though we will have to ensure we can accurately measure it... [8]
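Concretely, the scope rule is that a ServiceWorker can only control pages at or below the path its script is served from; the spec's Service-Worker-Allowed response header lifts that restriction. A minimal sketch (the script URL here is hypothetical):

```javascript
// Server must send this header on the worker script response to permit
// a scope broader than the script's own path:
//
//   Service-Worker-Allowed: /
//
// Then the page can register the worker for the whole site:
var SW_SCOPE = '/';
if (typeof navigator !== 'undefined' && 'serviceWorker' in navigator) {
  navigator.serviceWorker.register('/lazyloader-sw.js', { scope: SW_SCOPE });
}
```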
A more radical next step for ServiceWorkers would be to intercept network requests for HTML and use an API to serve just the lead section [9]. This won't help a user's very first load, but it might be enough to get going quickly.
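That interception could look something like the sketch below. MobileFrontend does expose an action=mobileview API, but treat the exact parameters (and the idea of responding with the raw API payload rather than rendered HTML) as assumptions for illustration:

```javascript
// Map an article URL like /wiki/Barack_Obama to an API request for just
// the lead section (section 0) plus the section list.
function leadSectionApiUrl(articleUrl) {
  var m = articleUrl.match(/\/wiki\/([^?#]+)/);
  if (!m) {
    return null;
  }
  return 'https://en.m.wikipedia.org/w/api.php' +
    '?action=mobileview&format=json&sections=0&prop=text|sections&page=' + m[1];
}

// In the ServiceWorker, answer article requests from the (much smaller)
// API; a real implementation would render the payload back into HTML.
if (typeof self !== 'undefined' && typeof window === 'undefined') {
  self.addEventListener('fetch', function (event) {
    var apiUrl = leadSectionApiUrl(event.request.url);
    if (apiUrl) {
      event.respondWith(fetch(apiUrl));
    }
  });
}
```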
If we want to target that first page load we need to really rethink a lot of our parser architecture.... fun times.
Would this be a good topic to bring up in January at the dev summit?
[1] https://en.m.wikipedia.org/wiki/Barack_Obama#/media/File:2004_Illinois_Senat...
[2] https://phabricator.wikimedia.org/T97570
[3] https://phabricator.wikimedia.org/T110615
[4] https://phabricator.wikimedia.org/T105365#1477762
[5] https://jakearchibald.com/2014/using-serviceworker-today/
[6] https://twitter.com/jaffathecake/status/644168091216310273
[7] https://gerrit.wikimedia.org/r/#/c/219960/8/includes/resourceloader/Resource...
[8] https://phabricator.wikimedia.org/T112588
[9] https://phabricator.wikimedia.org/T100258