On Mon, Dec 15, 2008 at 6:05 PM, Jared Williams
<jared.williams1(a)ntlworld.com> wrote:
> The idea being you could get an SDCH-capable user agent to download
> the concatenated & gzipped JavaScript in a single request (called a
> dictionary), which quick testing puts at about 15 KB for
> en.mediawiki.org, and cache that on the client for a long period.

Except that if you can do that, you can fetch the ordinary JS anyway
and keep it until it expires. If clients actually did this, you'd be
talking about single-digit JS requests per client per week, and
further optimization would be pointless. The fact of the matter is
that a significant percentage of views have no cached files from
Wikipedia, for whatever reason. Those are the ones we're concerned
with, and your idea does nothing to help them.
> Also possible to put the static (inline HTML) in templates into the
> dictionary, together with a lot of the translated messages, to try
> and reduce the HTML size, though not sure how effective it'd be.

It looks to me like you're talking about hugely increasing the size of
the first page view on the optimistic assumption that first page views
account for only a very small percentage of hits. Why do you think
that assumption is warranted?
One possible use of SDCH is that it would allow us to refrain from
re-transmitting the redundant parts of each response, most of the skin
bits. But that's probably under a couple of KB gzipped, so we're not
talking about much savings here. It might even take longer to compute
and apply the diffs than to transmit the extra data, over typical
hardware.
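For what it's worth, the diffing SDCH relies on (it uses VCDIFF, RFC 3284) amounts to expressing the new response as copy operations against the shared dictionary plus literal bytes. A toy sketch of that idea follows; the function names and the naive greedy matcher are mine, not anything SDCH actually ships:

```javascript
// Toy sketch of dictionary-based delta encoding (NOT real VCDIFF/SDCH,
// just an illustration): express `target` as COPY(offset, length)
// operations against a shared dictionary, plus literal characters
// where the dictionary doesn't help.
function deltaEncode(dict, target) {
  const ops = [];
  let i = 0;
  while (i < target.length) {
    // Greedily find the longest dictionary match starting at target[i].
    let best = { off: 0, len: 0 };
    for (let off = 0; off < dict.length; off++) {
      let len = 0;
      while (off + len < dict.length &&
             i + len < target.length &&
             dict[off + len] === target[i + len]) {
        len++;
      }
      if (len > best.len) best = { off, len };
    }
    if (best.len >= 4) {          // short copies cost more than literals
      ops.push(['copy', best.off, best.len]);
      i += best.len;
    } else {
      ops.push(['lit', target[i]]);
      i += 1;
    }
  }
  return ops;
}

function deltaDecode(dict, ops) {
  return ops
    .map(op => op[0] === 'copy' ? dict.slice(op[1], op[1] + op[2]) : op[1])
    .join('');
}
```

Real implementations index the dictionary (hashing, suffix structures) instead of this O(n·m) scan, which is part of why the encode/decode cost I mentioned above isn't free.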
On Mon, Dec 15, 2008 at 7:57 PM, Jared Williams
<jared.williams1(a)ntlworld.com> wrote:
> Only needs to download it once, and it can be kept on the client for
> months, as the server can send a diff to update it to the most recent
> version at any time, rather than having to send a whole new JS file.

What leads you to believe that the client will actually *keep* it for
months? Caches on clients are of finite size and things get evicted.
The oldest file I have in Firefox's cache right now is from December
12, three days ago:
$ ls -lt /home/aryeh/.mozilla/firefox/7y6zjmhm.default/Cache | tail
-rw------- 1 aryeh aryeh 22766 2008-12-12 09:34 14C66B37d01
-rw------- 1 aryeh aryeh 28914 2008-12-12 09:34 20916AFCd01
-rw------- 1 aryeh aryeh 59314 2008-12-12 09:34 57BFF8A6d01
-rw------- 1 aryeh aryeh 39826 2008-12-12 09:34 9EB67039d01
-rw------- 1 aryeh aryeh 52145 2008-12-12 09:34 A2F75899d01
-rw------- 1 aryeh aryeh 18072 2008-12-12 09:34 C11C3B29d01
-rw------- 1 aryeh aryeh 25590 2008-12-12 09:34 C845BED2d01
-rw------- 1 aryeh aryeh 32173 2008-12-12 09:34 D69FA933d01
-rw------- 1 aryeh aryeh 45189 2008-12-12 09:34 F2AD5614d01
-rw------- 1 aryeh aryeh 25164 2008-12-12 09:34 DBA05B7Bd01
A fair percentage of computers might be even worse, like public
computers configured to clean cache and other private data when the
user logs out.
Concatenating the (user-independent) wikibits.js and so on to the
(user-dependent) page HTML will prohibit intermediate proxies from
caching the user-invariant stuff, too, since the entire thing will
need to be marked Cache-Control: private. It will prevent our own
Squids from caching the CSS and JS, in fact.
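To make the proxy-caching point concrete, the headers would have to differ roughly like this (illustrative directive values, not our actual Squid configuration):

```
# Standalone JS/CSS: shared caches (including our own Squids) may store it
Cache-Control: public, max-age=2678400, s-maxage=2678400

# Page HTML with per-user content inlined: shared caches must not store it
Cache-Control: private
```

Once the user-invariant JS is bundled into a response carrying the second header, every shared cache between us and the client is forced to refetch it per user.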
> Also would want the translation text from
> \languages\messages\Messages*.php in there too, I think. Handling the
> $1-style placeholders is easy; it's just determining which message
> goes through which wfMsg*() function, and whether the WikiText
> translations can be preconverted to HTML.

What happens if you have parser functions that depend on the value of
$1 (allowed in some messages AFAIK)? What if $1 contains wikitext
itself (I wouldn't be surprised if that were true somewhere)? How do
you plan to do this substitution anyway, JavaScript? What about
clients that don't support JavaScript?
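To make the fragility concrete: the obvious client-side substitution is a one-line regex replace like the hypothetical helper below (my sketch, not MediaWiki code). It handles plain-text parameters fine, but it has no answer for wikitext inside $1, for parser functions evaluated against the parameter value, or for clients with JavaScript disabled:

```javascript
// Naive client-side substitution of MediaWiki-style $1, $2, ... placeholders.
// Works only when the parameters are plain text; any wikitext or parser
// function in the message or its parameters would need server-side parsing.
function substituteParams(msg, params) {
  return msg.replace(/\$(\d+)/g, function (match, n) {
    const i = parseInt(n, 10) - 1;   // $1 is the first parameter
    return i < params.length ? params[i] : match;  // leave unfilled ones alone
  });
}
```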