-----Original Message-----
From: wikitech-l-bounces(a)lists.wikimedia.org
[mailto:wikitech-l-bounces@lists.wikimedia.org] On Behalf Of
Aryeh Gregor
Sent: 16 December 2008 01:49
To: Wikimedia developers
Subject: Re: [Wikitech-l] Future of Javascript and mediaWiki
> On Mon, Dec 15, 2008 at 6:05 PM, Jared Williams
> <jared.williams1(a)ntlworld.com> wrote:
>> The idea being you could get an SDCH-capable user agent to download the
>> concatenated & gzipped JavaScript in a single request (called a
>> dictionary), which quick testing puts at about 15kb for en.mediawiki.org,
>> and cache that on the client for a long period.
> Except that if you can do that, you can fetch the ordinary JS anyway and
> keep it until it expires. If clients actually did this, you'd be talking
> about single-digit JS requests per client per week, and further
> optimization would be pointless. The fact of the matter is that a
> significant percentage of views have no cached files from Wikipedia, for
> whatever reason. Those are the ones we're concerned with, and your idea
> does nothing to help them.
Technically it is possible to remove some of the JS & CSS round trips, as
inlining the JS & CSS no longer comes at a download cost: the cost of
copying something from the dictionary is around 3-5 bytes.
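To illustrate why inlining already-cached content is nearly free, here is a
small sketch using Python's zlib, whose shared-dictionary support works on
the same principle as SDCH's VCDIFF encoding (the dictionary contents and
sizes here are made up for illustration):

```python
import zlib

# Stand-in for the site's concatenated JS that the client has cached
# as a dictionary (contents invented for this sketch).
dictionary = b"".join(b"function f%d(){return %d;}" % (i, i * i)
                      for i in range(500))

# A page that inlines a 2000-byte chunk of that script.
response = b"<html><head><script>" + dictionary[:2000] + b"</script></head></html>"

def deflate(data, zdict=None):
    """Raw-deflate `data`, optionally against a shared dictionary."""
    if zdict is not None:
        c = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS, zdict=zdict)
    else:
        c = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    return c.compress(data) + c.flush()

plain = deflate(response)               # ordinary compression alone
shared = deflate(response, dictionary)  # compression against the shared dictionary

# The shared-dictionary encoding only needs short back-references for the
# inlined script, so it comes out a small fraction of the plain size.
print(len(plain), len(shared))
```

This is zlib rather than VCDIFF, so the per-copy overhead differs from the
3-5 bytes quoted above, but the effect is the same: inlined bytes the client
already holds cost almost nothing on the wire.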
>> Also possible to put the static (inline HTML) in templates into the
>> dictionary, together with a lot of the translated messages, to try and
>> reduce the HTML size, though not sure how effective it'd be.
> It looks to me like you're talking about hugely increasing the size of
> the first page view on the optimistic assumption that first page views
> account for only a very small percentage of hits. Why do you think that
> assumption is warranted?
The SDCH protocol has a time delay built in, so it wouldn't download the
dictionary unless the browser has remained on a page (one that has
requested the client download the dictionary) for some time.
> One possible use of SDCH is that it would allow us to refrain from
> re-transmitting the redundant parts of each response, most of the skin
> bits. But that's probably under a couple of KB gzipped, so we're not
> talking about much savings here. It might even take longer to compute
> and apply the diffs than to transmit the extra data, over typical
> hardware.
Naively tried, it was around 3kb for Monobook. Hence my pessimism in my
previous post about beating plain gzip: the ratio of wikitext article
output to skin bits is too high, I think.
> On Mon, Dec 15, 2008 at 7:57 PM, Jared Williams
> <jared.williams1(a)ntlworld.com> wrote:
>> It only needs to download it once, and it's kept on the client for
>> months, as the server can send a diff to update it to the most recent
>> version at any time, rather than having to send a whole new JS file.
> What leads you to believe that the client will actually *keep* it for
> months? Caches on clients are of finite size and things get evicted.
>
> The oldest file I have in Firefox's cache right now is from December 12,
> three days ago:
>
> $ ls -lt /home/aryeh/.mozilla/firefox/7y6zjmhm.default/Cache | tail
> -rw------- 1 aryeh aryeh 22766 2008-12-12 09:34 14C66B37d01
> -rw------- 1 aryeh aryeh 28914 2008-12-12 09:34 20916AFCd01
> -rw------- 1 aryeh aryeh 59314 2008-12-12 09:34 57BFF8A6d01
> -rw------- 1 aryeh aryeh 39826 2008-12-12 09:34 9EB67039d01
> -rw------- 1 aryeh aryeh 52145 2008-12-12 09:34 A2F75899d01
> -rw------- 1 aryeh aryeh 18072 2008-12-12 09:34 C11C3B29d01
> -rw------- 1 aryeh aryeh 25590 2008-12-12 09:34 C845BED2d01
> -rw------- 1 aryeh aryeh 32173 2008-12-12 09:34 D69FA933d01
> -rw------- 1 aryeh aryeh 45189 2008-12-12 09:34 F2AD5614d01
> -rw------- 1 aryeh aryeh 25164 2008-12-12 09:34 DBA05B7Bd01
Dictionaries don't seem to get deleted when clearing the cache, at least.
They have their own mechanism to specify how long they should be kept.
> A fair percentage of computers might be even worse, like public
> computers configured to clean cache and other private data when the
> user logs out.
>
> Concatenating the (user-independent) wikibits.js and so on to the
> (user-dependent) page HTML will prohibit intermediate proxies from
> caching the user-invariant stuff, too, since the entire thing will need
> to be marked Cache-Control: private. It will prevent our own Squids
> from caching the CSS and JS, in fact.
Why would you concatenate user-independent with user-dependent HTML?
I think I only suggested a skin-dependent and language-dependent
dictionary. I'd also want the translation text from
languages/messages/Messages*.php in there too, I think.
>> Handling the $1 style placeholders is easy, it's just determining what
>> message goes through which wfMsg*() function, and whether the WikiText
>> translations can be preconverted to HTML.
> What happens if you have parser functions that depend on the value of
> $1 (allowed in some messages AFAIK)? What if $1 contains wikitext
> itself (I wouldn't be surprised if that were true somewhere)? How do
> you plan to do this substitution anyway, JavaScript? What about clients
> that don't support JavaScript?
No client-side substitution happens, as it's all done server side;
translations are essentially preparsed, saving time in template rendering.
For example, given the message text

    http://www.amazon.com/exec/obidos/ISBN=$1

the prefix http://www.amazon.com/exec/obidos/ISBN= would be put into the
dictionary, and the server metadata for the translation becomes:

    'Amazon.com' => array(
        array(offset, 39), // COPY 39 bytes from dictionary offset
        1,                 // Output value of parameter 1.
    );
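To make the mechanism concrete, here is a small Python sketch of that
preparsed-message idea (the names, the ops tuple format, and the render()
helper are my own illustration, not the actual PHP):

```python
# Stand-in dictionary; in practice this is the big shared SDCH dictionary.
DICTIONARY = b"...http://www.amazon.com/exec/obidos/ISBN=..."
OFFSET = DICTIONARY.index(b"http://")  # where the 39-byte prefix lives

MESSAGES = {
    # 'Amazon.com': COPY 39 bytes from OFFSET, then output parameter 1.
    "Amazon.com": [("copy", OFFSET, 39), ("param", 1)],
}

def render(message, params):
    """Expand a preparsed message. A real server would instead emit a
    VCDIFF COPY instruction and let the client do the copying against
    its cached dictionary."""
    out = []
    for op in MESSAGES[message]:
        if op[0] == "copy":
            _, offset, length = op
            out.append(DICTIONARY[offset:offset + length].decode())
        else:  # ("param", n): substitute the n-th parameter
            out.append(str(params[op[1] - 1]))
    return "".join(out)

print(render("Amazon.com", ["0596101015"]))
# -> http://www.amazon.com/exec/obidos/ISBN=0596101015
```

The point is that the 39-byte URL prefix never appears in the response
body, only a few-byte reference to it.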
I do have working PHP code that can parse PHP templates & language
strings to generate the dictionary, and a new set of templates rewritten
to output the vcdiff efficiently.
Jared