Hello all,
As you might already know, YSlow is a tool to check website performance. I just ran a test against http://en.wikipedia.org/wiki/Main_Page
The result is quite surprising: Grade F (47). Of course a lower mark does not always mean bad, but there is some room for improvement, e.g.
1. Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
2. Enable GZip compression (e.g. http://en.wikipedia.org/skins-1.5/monobook/main.css?179)
3. Add expire header (e.g. http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
4. Don't put CSS outside the <head />
etc.
Doing this should save some money on bandwidth, as well as provide a better user experience.
Howard
On Sun, Oct 5, 2008 at 12:15 PM, howard chen howachen@gmail.com wrote:
The result is quite surprising: Grade F (47). Of course a lower mark does not always mean bad, but there is some room for improvement, e.g.
[snip]
- Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
Probably pointless. It's small enough already that the load time is going to be latency bound for any user not sitting inside a Wikimedia data center. On ones which are above the latency bound window (of roughly 8k), gzipping should get them back under it.
- Enable GZip compression (e.g.
The page text is gzipped. CSS/JS are not. Many of the CSS/JS are small enough that gzipping would not be a significant win (see above), but I don't recall the reason the CSS/JS are not. Is there a client compatibility issue here?
- Add expire header (e.g.
http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
Hm. There are expire headers on the skin provided images, but not ones from upload. It does correctly respond with 304 not modified, but a not-modified is often as time consuming as sending the image. Firefox doesn't IMS these objects every time in any case.
The caching headers for the OggHandler play button are a bit odd and are causing the object to be refreshed on every load for me.
In any case, from the second page onwards pages typically display in <100ms for me, and the cold cache (first page) load time for me looks like it's about 230ms, which is also not bad. The grade 'f' is hardly deserved.
Hello,
On Mon, Oct 6, 2008 at 1:00 AM, Gregory Maxwell gmaxwell@gmail.com wrote:
- Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
Probably pointless. It's small enough already that the load time is going to be latency bound for any user not sitting inside a Wikimedia data center. On ones which are above the latency bound window (of roughly 8k), gzipping should get them back under it.
Given the traffic of Wikipedia, every bit should count. There is no reason to send inline programming comments to normal users anyway, is there?
- Enable GZip compression (e.g.
The page text is gzipped. CSS/JS are not. Many of the CSS/JS are small enough that gzipping would not be a significant win (see above), but I don't recall the reason the CSS/JS are not. Is there a client compatibility issue here?
Gzipping CSS/JS `should` not cause any compatibility issues in most browsers; Yahoo! is doing it anyway.
- Add expire header (e.g.
http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
Hm. There are expire headers on the skin provided images, but not ones from upload. It does correctly respond with 304 not modified, but a not-modified is often as time consuming as sending the image. Firefox doesn't IMS these objects every time in any case.
Have a simple policy to generate a unique URI for each resource, and set the expiry as far in the future as possible.
Howard
On 05.10.2008, 21:00 Gregory wrote:
- Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
Probably pointless. It's small enough already that the load time is going to be latency bound for any user not sitting inside a Wikimedia data center. On ones which are above the latency bound window (of roughly 8k), gzipping should get them back under it.
mwsuggest.js loses 10 kb that way, wikibits.js - 11k.
The page text is gzipped. CSS/JS are not. Many of the CSS/JS are small enough that gzipping would not be a significant win (see above), but I don't recall the reason the CSS/JS are not. Is there a client compatibility issue here?
For a logged-in user with monobook it's 33 kb vs. 106 kb - not that insignificant.
In any case, from the second page onwards pages typically display in <100ms for me, and the cold cache (first page) load time for me looks like it's about 230ms, which is also not bad. The grade 'f' is hardly deserved.
Not everyone lives in the US and enjoys fast Internet.
On Sun, Oct 5, 2008 at 2:29 PM, Max Semenik maxsem.wiki@gmail.com wrote:
On 05.10.2008, 21:00 Gregory wrote:
- Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
Probably pointless. It's small enough already that the load time is going to be latency bound for any user not sitting inside a Wikimedia data center. On ones which are above the latency bound window (of roughly 8k), gzipping should get them back under it.
mwsuggest.js loses 10 kb that way, wikibits.js - 11k.
The gzipped copy can't lose 11k, because it's not even that large when gzipped (it's 9146 bytes gzipped).
Compare the gzipped sizes. Post-gzipping, the savings from whitespace removal and friends are much smaller. Yet it makes the JS unreadable and makes debugging a pain.
For a logged-in user with monobook it's 33 kb vs. 106 kb - not that insignificant.
Logged-in is a mess of uncachability. You're worried about a once-per-session loaded object for logged-in users?
In any case, from the second page onwards pages typically display in <100ms for me, and the cold cache (first page) load time for me looks like it's about 230ms, which is also not bad. The grade 'f' is hardly deserved.
Not everyone lives in the US and enjoys fast Internet.
You're missing my point. For small objects *latency* overwhelms the loading time, even if you're on a slow connection, because TCP never gets a chance to open the window up. The further you are away from the Wikimedia datacenters the more significant that effect is.
Much of the poorly connected world suffers very high latencies due to congestion-induced queueing delay or service via satellite, in addition to being far from Wikimedia. (And besides, the US itself lags much of the world in terms of throughput.)
If it takes 75 ms to get to the nearest Wikimedia datacenter and back, then a new HTTP GET cannot finish in less than 150 ms. If you want to improve performance you need to focus on shaving *round trips* rather than bytes. Byte reduction only saves you round trips if you're able to reduce the number of TCP windows' worth of data; it's quantized, and the lowest threshold is about 8 kbytes.
Removing round trips helps everyone, while shaving bytes only helps people who are low-delay and very low-bandwidth, an increasingly uncommon configuration. Also, getting JS out of the critical path helps everyone. The reader does not care how long a once-per-session object takes to load when it doesn't block rendering, and the site already does really well at this.
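To make that quantization concrete, here is a back-of-the-envelope PHP sketch. The 8 KB initial window, the extra round trip for connection setup, and the per-round-trip doubling are simplifying assumptions, not measurements.

<?php
// Rough model: how many round trips does it take to fetch an object of
// $bytes bytes over a fresh connection, assuming one round trip for
// connection setup and an initial window of ~8 KB that doubles each trip?
function estimateRoundTrips( $bytes, $initialWindow = 8192 ) {
    $trips = 2;                 // connection setup + the GET and first window
    $window = $initialWindow;
    $delivered = $window;
    while ( $delivered < $bytes ) {
        $trips++;
        $window *= 2;           // slow start roughly doubles the window
        $delivered += $window;
    }
    return $trips;
}

// With a 75 ms round trip, a 5 KB object and a 9 KB object differ by a
// whole extra round trip, regardless of bandwidth.
foreach ( array( 5000, 9000, 30000 ) as $size ) {
    printf( "%6d bytes: ~%d ms\n", $size, 75 * estimateRoundTrips( $size ) );
}

Under that model, anything that stays within the first window costs the same number of round trips no matter how many bytes you shave off it.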
Gregory Maxwell wrote:
Hm. There are expire headers on the skin provided images, but not ones from upload. It does correctly respond with 304 not modified, but a not-modified is often as time consuming as sending the image. Firefox doesn't IMS these objects every time in any case.
That's because it's an uploaded image. It is cached in the squids, but not outside. So people will need to check if it's modified, but it can be modified at any time.
Howard Chen wrote:
Have a simple policy to generate a unique URI for each resource, and set the expiry as far in the future as possible.
That's what is being used for site JS and CSS (the appended query string), and that's why it can have the expire header.
We could provide a per-image unique URI with long-lived caching, based on some id or simply on the image hash. The tradeoff is that then you can set a large expiry time, but you need to purge all the pages that include the image when it's reuploaded, whereas with the current system that would only be needed when it's deleted (or uploaded for the first time).
Maybe the HTML caches aren't even purged when the image is deleted on Commons, given that there is no single table for that (the CheckUsage problem). Can anyone confirm? But it degrades gracefully, something a hash-based path wouldn't do.
On Sun, Oct 5, 2008 at 1:00 PM, Gregory Maxwell gmaxwell@gmail.com wrote:
The page text is gzipped. CSS/JS are not. Many of the CSS/JS are small enough that gzipping would not be a significant win (see above), but I don't recall the reason the CSS/JS are not. Is there a client compatibility issue here?
Some of the CSS/JS *is* gzipped, so there had better not be a client compatibility issue. Styles/scripts served from index.php are gzipped, it's only statically-served files that aren't. I'm guessing this would just require a line or two changed in Apache config.
- Add expire header (e.g.
http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
Hm. There are expire headers on the skin provided images, but not ones from upload. It does correctly respond with 304 not modified, but a not-modified is often as time consuming as sending the image.
Looking at the URL, this doesn't seem to identify the version at all, so we can't safely send an Expires header. I'm guessing we use a non-versioned URL here to avoid purging parser/Squid caches when a new image is uploaded. This is probably a losing proposition, except maybe for very widely used images (but those shouldn't change often?). But it would be a pain to change.
In any case, from the second page onwards pages typically display in <100ms for me, and the cold cache (first page) load time for me looks like it's about 230ms, which is also not bad. The grade 'f' is hardly deserved.
I've found that YSlow is kind of flaky in its grades. Some of its advice is fairly brainless. But there's definitely room for improvement on our part -- gzipping stuff at the very least!
On Sun, Oct 5, 2008 at 3:52 PM, Gregory Maxwell gmaxwell@gmail.com wrote:
Compare the gzipped sizes. Post-gzipping, the savings from whitespace removal and friends are much smaller. Yet it makes the JS unreadable and makes debugging a pain.
What I was thinking (not that the idea is original to me :P) is that we could have all JS and all CSS sent from some PHP file, call it compact.php. So you would have just one script tag and one style tag, like
<script type="text/javascript" src="/w/compact.php?type=js&..."></script> <link rel="stylesheet" type="text/css" href="/w/compact.php?type=css&..." />
That could concatenate the appropriate files, add on dynamically-generated stuff, gzip everything, and send it in one request, with appropriate Expires headers and so on. This would dramatically cut round-trips (15 total external CSS/JS files logged-out for me right now, 24 logged-in!).
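A minimal sketch of what such a compact.php could look like, just to make the idea concrete. The file lists and paths below are hypothetical, and a real version would also need cache validation and the dynamically-generated per-user bits:

<?php
// Hypothetical combiner: concatenate a whitelisted set of static files,
// gzip the result if the client advertises gzip support, and send it with
// a far-future Expires header. A sketch, not MediaWiki code.
$types = array(
    'js' => array(
        'mime'  => 'text/javascript',
        'files' => array( 'skins/common/wikibits.js', 'skins/common/ajax.js' ),
    ),
    'css' => array(
        'mime'  => 'text/css',
        'files' => array( 'skins/monobook/main.css' ),
    ),
);

$type = ( isset( $_GET['type'] ) && isset( $types[$_GET['type']] ) ) ? $_GET['type'] : 'js';

$out = '';
foreach ( $types[$type]['files'] as $file ) {
    $out .= file_get_contents( $file ) . "\n";
}

header( 'Content-Type: ' . $types[$type]['mime'] );
header( 'Expires: ' . gmdate( 'D, d M Y H:i:s', time() + 30 * 86400 ) . ' GMT' );

if ( isset( $_SERVER['HTTP_ACCEPT_ENCODING'] )
    && strpos( $_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip' ) !== false
) {
    header( 'Content-Encoding: gzip' );
    $out = gzencode( $out, 9 );
}
echo $out;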
Round-trips are serialized in modern browsers a lot more than they should be -- Firefox 3 will stop all other requests while it's requesting any JS file for some crazy reason (is there a bug open for that?), and IE is apparently similar. It's even worse with older browsers, which are unreasonably cautious about how many parallel requests to send to the same domain.
Once you're already serving all the JS together from a script, you could minify it with no real extra cost, and prepend a helpful comment like /* Add &minify=0 to the URL for a human-readable version */. Minification tends to save a few percent of the original filesize on top of gzipping (which can be like 20% of the file size after gzipping), which can make a difference for big scripts. Here are a couple of examples to illustrate the point (from the O'Reilly book "High Performance Web Sites", by Steve Souders of Yahoo!):
http://stevesouders.com/hpws/js-large-normal.php http://stevesouders.com/hpws/js-large-minify.php
I see a saving of a few hundred milliseconds according to those pages -- and yes, this is after gzipping.
Since this is you, though, I wait to be corrected. :)
If it takes 75 ms to get to the nearest Wikimedia datacenter and back, then a new HTTP GET cannot finish in less than 150 ms. If you want to improve performance you need to focus on shaving *round trips* rather than bytes. Byte reduction only saves you round trips if you're able to reduce the number of TCP windows' worth of data; it's quantized, and the lowest threshold is about 8 kbytes.
The example above has a script that's 76 KB gzipped, but only 29 KB minified plus gzipped. On the other hand, the script is three times as large as the scripts we serve to an anon viewing the main page (377 KB vs. 126 KB ungzipped).
I really wish people would stop spreading the crap about the /benefits/ of minification, while only giving half the information. Sure, minification does reduce some size in comparison to a full file. And yes, minification+gzipping does make things insanely small. But that is blatantly disregarding something. It's not the minification that makes min+gz so small, it's the gzipping; in fact, once you gzip, trying to minify becomes nearly pointless.
Here's the table for wikibits.js, and wikipedia's gen.js for anons (basically monobook.js).

wikibits.js    non-gz    gzipped
full           27.1kb    8.9kb
minified       16.7kb    5.0kb

wp's gen.js    non-gz    gzipped
full           29.2kb    7.9kb
minified       16.8kb    4.5kb
Minification alone only reduces a file by about 40%. However gzipping reduces a file by about 70% alone. When it comes down to it, once you gzip minification can barely even save you 10% of a file's size. And honestly, that measly 10% is not worth how badly it screws up the readability of the code.
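(If anyone wants to reproduce numbers like these, the comparison is a few lines of PHP. The file names below are just examples, and the minified copy is assumed to have been produced separately with whatever minifier you prefer.)

<?php
// Compare raw and gzipped sizes of a script and a pre-minified copy of it.
function sizes( $path ) {
    $data = file_get_contents( $path );
    return array( strlen( $data ), strlen( gzencode( $data, 9 ) ) );
}

foreach ( array( 'wikibits.js', 'wikibits.min.js' ) as $file ) {
    list( $raw, $gz ) = sizes( $file );
    printf( "%-16s %6.1f kb raw  %6.1f kb gzipped\n", $file, $raw / 1024, $gz / 1024 );
}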
As for client compatibility. There are some older browsers that don't support gzipping properly (notably ie6). We serve gzipped data from the php scripts, but we only do that after detecting if the browser supports gzipping or not. So we're not serving gzipped stuff to old browsers like ie6 that have broken handling of gzip. The difference with the static stuff is quite simply because it's not as easy to make a webserver detect gzip compatibility as it is to make a php script do it.
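Roughly the kind of check the PHP side can do before gzipping, as a sketch; the IE6 heuristic below is an illustrative assumption, not the exact rule MediaWiki uses:

<?php
// Require an Accept-Encoding that mentions gzip, and skip clients known to
// mishandle gzipped responses.
function clientAcceptsGzip() {
    if ( !isset( $_SERVER['HTTP_ACCEPT_ENCODING'] )
        || strpos( $_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip' ) === false
    ) {
        return false;
    }
    $ua = isset( $_SERVER['HTTP_USER_AGENT'] ) ? $_SERVER['HTTP_USER_AGENT'] : '';
    // Pre-SP2 IE6 (no "SV1" token) is the usual suspect for broken gzip handling.
    if ( strpos( $ua, 'MSIE 6' ) !== false && strpos( $ua, 'SV1' ) === false ) {
        return false;
    }
    return true;
}

var_dump( clientAcceptsGzip() );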
The limitation in grabbing data in browsers isn't crazy, the standard is to restrict to only 2 open http connections for a single hostname. Gzipping and reducing the number of script tags we use are the only useful things that can be done to speed up viewing.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) ~Profile/Portfolio: http://nadir-seen-fire.com -The Nadir-Point Group (http://nadir-point.com) --It's Wiki-Tools subgroup (http://wiki-tools.com) --The ElectronicMe project (http://electronic-me.org) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) --Animepedia (http://anime.wikia.com) --Narutopedia (http://naruto.wikia.com)
On Sun, Oct 5, 2008 at 6:35 PM, Daniel Friesen dan_the_man@telus.net wrote:
I really wish people would stop spreading the crap about the /benefits/ of minification, while only giving half the information. Sure, minification does reduce some size in comparison to a full file. And yes, minification+gzipping does make things insanely small. But that is blatantly disregarding something. It's not the minification that makes min+gz so small, it's the gzipping; in fact, once you gzip, trying to minify becomes nearly pointless.
It can cut off a significant amount of extra size on top of gzipping, as my last post indicated, at least in some cases. It's not "nearly pointless".
Here's the table for wikibits.js, and wikipedia's gen.js for anons (basically monobook.js).

wikibits.js    non-gz    gzipped
full           27.1kb    8.9kb
minified       16.7kb    5.0kb

wp's gen.js    non-gz    gzipped
full           29.2kb    7.9kb
minified       16.8kb    4.5kb
In other words, minification reduces the total size of those two files from 16.9 KB to 9.5 KB, after gzipping. That's more than 7 KB less. That's already not pointless, and it's probably only going to become less and less pointless over time as we use more and more scripts (which I'm guessing will happen).
And honestly, that measly 10% is not worth how badly it screws up the readability of the code.
How about my suggestion to begin the code with a comment "/* Append &minify=0 to the URL for a human-readable version */"?
As for client compatibility. There are some older browsers that don't support gzipping properly (notably ie6). We serve gzipped data from the php scripts, but we only do that after detecting if the browser supports gzipping or not. So we're not serving gzipped stuff to old browsers like ie6 that have broken handling of gzip. The difference with the static stuff is quite simply because it's not as easy to make a webserver detect gzip compatibility as it is to make a php script do it.
It should be reasonably easy to do a User-Agent check in Apache config, shouldn't it?
The limitation in grabbing data in browsers isn't crazy, the standard is to restrict to only 2 open http connections for a single hostname.
Yes, which ended up being crazy as the web evolved. :) That's fixed in recent browsers, though, with heavier parallelization of most files' loading. But Firefox <= 3 and IE <= 7 will still stop loading everything else when loading scripts. IE8 and recent WebKit thankfully no longer do this:
http://blogs.msdn.com/kristoffer/archive/2006/12/22/loading-javascript-files... http://webkit.org/blog/166/optimizing-page-loading-in-web-browser/
No, messed up table since I was using html mode.
wikibits.js    non-gz    gzipped
full           27.1kb    8.9kb
minified       16.7kb    5.0kb

wp's gen.js    non-gz    gzipped
full           29.2kb    7.9kb
minified       16.8kb    4.5kb
When not gzipped, minification cuts something from 27kb to 16kb, but when already gzipped down to 8kb, it only reduces it to 5kb...
Wikia does something similar to that idea... They have an allinone.js, and you use &allinone=0 to disable it. But honestly, there are cases where you follow links, or make a POST request, or do something else that can only be done once, you get a JS error, and it's a pain to find out what's going on. Not worth it when it only saves around 4 kb.
Minification being pointless when gzipped is actually logical to understand if you know their principles. Both gzipping and minification follow the same principle. They take sequences that are repeated, create an optimized tree, and then store smaller sequences that refer to the data in that tree. Once something has been optimized like that, it's almost impossible to get it any smaller because you've already removed the repeat sequences of data. Basically trying to minify then gzip, is like trying to gzip twice, it can't technically give you much more.
Oh wait, scratch that... ^_^ I'm thinking of JS packing... JS minification doesn't do anything except kill things like whitespace... That obviously can't save too much.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) ~Profile/Portfolio: http://nadir-seen-fire.com -The Nadir-Point Group (http://nadir-point.com) --It's Wiki-Tools subgroup (http://wiki-tools.com) --The ElectronicMe project (http://electronic-me.org) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) --Animepedia (http://anime.wikia.com) --Narutopedia (http://naruto.wikia.com)
On Tue, Oct 7, 2008 at 11:18 AM, Daniel Friesen dan_the_man@telus.net wrote:
Basically trying to minify then gzip, is like trying to gzip twice, it can't technically give you much more.
Nope, nope, nope: it doesn't only cut whitespace, it also removes comments, etc.
There is no reason to send programming comments to users; at least 99.99999% of users don't need them, and if you do need them, you probably know how to get them. Programming comments are completely unnecessary for rendering a page.
Considering that Wikipedia is one of the most popular sites in the world, every bit should count. It will save on your bandwidth investments.
Howard
I would support minification if we have a minify=0 parameter which we can specify to unminify if needed (like someone suggested above). Sometimes you just need to be able to read it.
-Chad
Hi! I'm trying to get the edittoken using a POST call with the following code:
$URL="es.wikipedia.org/w/api.php"; $ch = curl_init( ); curl_setopt($ch, CURLOPT_URL,"http://$URL"); curl_setopt($ch, CURLOPT_HEADER, 0);curl_setopt($ch, CURLOPT_POST, 1);curl_setopt($ch,CURLOPT_RETURNTRANSFER,1); $poststring["action"] = "query";$poststring["prop"] = "info|revisions";$poststring["intoken"] = "edit";$poststring["titles"] = "Portada";$data = curl_exec ($ch);
... but I always receive the following response: edittoken="+"
Does anybody know what's happening?
Thanks.
javi bueno schreef:
Hi! I'm trying to get the edittoken using a POST call through the next code :
$URL="es.wikipedia.org/w/api.php"; $ch = curl_init( ); curl_setopt($ch, CURLOPT_URL,"http://$URL"); curl_setopt($ch, CURLOPT_HEADER, 0); curl_setopt($ch, CURLOPT_POST, 1); curl_setopt($ch,CURLOPT_RETURNTRANSFER,1); $poststring["action"] = "query"; $poststring["prop"] = "info|revisions"; $poststring["intoken"] = "edit"; $poststring["titles"] = "Portada"; $data = curl_exec ($ch);
... but I always received the following response : edittoken="+"
anybody knows what's happening?
Since you haven't logged in (or, if you have, aren't passing the login cookies), you'll be treated as an anonymous user, whose edit token is always +\ .
Roan Kattouw (Catrope)
Thanks for the response, but I was logged in. I think the problem is the login cookies; I don't know how to pass them. The parameters I am passing are the following ones:
1. action = query
2. prop = info|revisions
3. intoken = edit
4. titles = Main_Page
Must I pass another parameter? Which one is it?
Thanks.
javi bueno schreef:
Thanks for the response, but I was logged in. I think the problem is the loggin cookies. I don't know how to pass it. The parameter I am passing are the following ones :
- action= query
- prop = info|revisions
- intoken = edit
- titles = Main_Page
Do I must to pass another parameter? Which is it?
You have to pass the cookies, which aren't parameters. The cURL documentation probably has information on how to fetch and pass cookies using cURL. The idea is that when you log in with action=login, you get a set of cookies along with the usual API response. You have to pass those cookies back in the action=query request and all subsequent requests that require you to be logged in.
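A minimal sketch of that flow with PHP and cURL's cookie jar; the jar path and the credentials are placeholders:

<?php
// Log in first so the API sets its session cookies, keep them in a cookie
// jar file, then send the same jar with the token request.
$api = 'http://es.wikipedia.org/w/api.php';
$jar = '/tmp/wiki-cookies.txt';   // placeholder path

function apiPost( $api, $jar, array $params ) {
    $ch = curl_init( $api );
    curl_setopt( $ch, CURLOPT_POST, 1 );
    curl_setopt( $ch, CURLOPT_POSTFIELDS, http_build_query( $params ) );
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
    curl_setopt( $ch, CURLOPT_COOKIEJAR, $jar );  // save cookies here on close
    curl_setopt( $ch, CURLOPT_COOKIEFILE, $jar ); // and send them back next time
    $result = curl_exec( $ch );
    curl_close( $ch );
    return $result;
}

apiPost( $api, $jar, array(
    'action' => 'login', 'lgname' => 'MyBot', 'lgpassword' => 'secret', 'format' => 'xml',
) );
echo apiPost( $api, $jar, array(
    'action'  => 'query', 'prop' => 'info|revisions',
    'intoken' => 'edit',  'titles' => 'Portada', 'format' => 'xml',
) );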
Roan Kattouw (Catrope)
http://en.wikipedia.org/wiki/User:ClueBot/Source provides a good base for cookie storage.
Soxred
Soxred93 schreef:
http://en.wikipedia.org/wiki/User:ClueBot/Source provides a good base for cookie storage.
Soxred
You might wanna wrap those code blocks in <source lang="php">code here</source> so you get automatic fancy highlighting, which greatly improves readability.
Roan Kattouw (Catrope)
Thank you for everything. Now I think I understand the whole process.
Javier Bueno.
On Mon, Oct 6, 2008 at 11:18 PM, Daniel Friesen dan_the_man@telus.net wrote:
Minification being pointless when gzipped is actually logical to understand if you know their principles. Both gzipping and minification follow the same principle. They take sequences that are repeated, create an optimized tree, and then store smaller sequences that refer to the data in that tree. Once something has been optimized like that, it's almost impossible to get it any smaller because you've already removed the repeat sequences of data. Basically trying to minify then gzip, is like trying to gzip twice, it can't technically give you much more.
Sure it can, because minification is lossy, and knows the syntax of JavaScript. gzipping cannot know that most runs of whitespace characters and all comments are unnecessary for execution and can be stripped entirely.
Although you're right that minify=0 would not solve everything. For instance, if the error console gives an error as occurring on a specific line, that won't be helpful if you check the unminified version, and might be unhelpful if you check the minified version too (if everything is crammed onto a few lines).
I think the best way forward is to 1) concatenate all the JS/CSS files into one using a PHP script; 2) add a configurable minification option, set to false by default; and only then 3) experiment with how much faster/more inconvenient it would actually be to use minification on Wikipedia.
Although you're right that minify=0 would not solve everything. For instance, if the error console gives an error as occurring on a specific line, that won't be helpful if you check the unminified version, and might be unhelpful if you check the minified version too (if everything is crammed onto a few lines).
I don't see the problem. Unless the error is intermittent, you'll get the error with the minified version with a useless line number and then you'll add minify=0 to the url and will get the error with the unminified version which will give the line number for the unminified version.
On Tue, Oct 7, 2008 at 11:28 AM, Thomas Dalton thomas.dalton@gmail.com wrote:
I don't see the problem. Unless the error is intermittent, you'll get the error with the minified version with a useless line number and then you'll add minify=0 to the url and will get the error with the unminified version which will give the line number for the unminified version.
I was assuming that minify=0 would be a parameter for the combining script, not the whole page, but of course the latter could be done too. It seems prudent to make it optional anyway, though.
On Tue, Oct 7, 2008 at 5:37 PM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
On Tue, Oct 7, 2008 at 11:28 AM, Thomas Dalton thomas.dalton@gmail.com wrote:
I don't see the problem. Unless the error is intermittent, you'll get the error with the minified version with a useless line number and then you'll add minify=0 to the url and will get the error with the unminified version which will give the line number for the unminified version.
I was assuming that minify=0 would be a parameter for the combining script, not the whole page, but of course the latter could be done too. It seems prudent to make it optional anyway, though.
I am an Internet newbie, so maybe I don't understand the problem.
Desktop programmers often distribute their applications as compiled binary versions of their source files. I don't see the problem with distributing a web application as a binary file, or an obfuscated, file-optimized file, if there is a separate (keyword here: separate) download for the SRC files, just as desktop programmers ship a separate folder labeled "SRC" with the SRC files of any binary distribution.
Two folders:
src/ <all the SRC files go here>
bin/alljs_20081002.js
bin/.htwhateverconfig
You can have a make process that recreates that alljs_<build date here>.js from the SRC files. The .htwhateverconfig makes Apache send the right expiration HTTP headers, so browsers are forced to cache a file called alljs_20081002.js forever (or till 2012).
I know you can still have people who may need the un-obfuscated original (not compiled) src. But hey, development should happen on a beta server or something, not on production. People geeky enough to need the original src files could google for it and download the full src/ folder or something like that.
-- ℱin del ℳensaje.
On Wed, Oct 8, 2008 at 10:10 AM, Tei oscar.vives@gmail.com wrote:
I am a Internet newbie, so maybe I don't understand the problem.
Desktop programmers often distribute their applications as compiled binary versions of their source files.
JavaScript is a scripting language, not a compiled language. It cannot be distributed as a machine-executable binary, because that would be insecure: JavaScript needs to be sandboxed in a way that machine-executable code cannot be. It cannot be distributed as bytecode, the way Java can, because there's no standard JavaScript bytecode. It can only be distributed in source code form.
I don't see the problem with distributing a web application as a binary file, or an obfuscated, file-optimized file, if there is a separate (keyword here: separate) download for the SRC files, just as desktop programmers ship a separate folder labeled "SRC" with the SRC files of any binary distribution.
This works for languages like Java or C where the convention is to distribute more optimized forms than source code. The problem with JavaScript is that the entire infrastructure for viewing it and debugging it depends on the fact that it's distributed in source-code form. If an error occurs, the browser will give the line number in the source code: if your source code is minified so it's all on one line, this is useless. Debuggers like Firebug can set break points on individual lines, which again is useless for minified code. All of the above also present the code to you for inspection in source-code form, which is very difficult to read if it's minified, with all spaces and comments removed.
Again, in compiled languages (whether byte-code or machine code) this isn't such an issue. There are established ways of dealing with this: you can compile debug versions and your debugger will automatically use info on the line number and so on to jump to the correct place in the source code. No such facilities exist for JavaScript.
In the long term, the "correct" solution might be for the web community to come up with a standard JavaScript byte-code or something like that, and have debuggers updated to work with that. But that's not an option right now.
I know you can still have people who may need the un-obfuscated original (not compiled) src. But hey, development should happen on a beta server or something, not on production. People geeky enough to need the original src files could google for it and download the full src/ folder or something like that.
It's a lot more difficult to reproduce problems on local wikis than to be able to inspect them in-place where they're reported, on production wikis. Moreover, a lot of our JavaScript is put into place by sysops on individual wikis, who mostly don't have access to test wikis. It's not practical to just say "don't give an option to view the readable source code". Yes, this is indeed a peculiarity of web development compared to other types of software development, but that's the facts on the ground.
On Wed, Oct 8, 2008 at 4:32 PM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote: ..
It's a lot more difficult to reproduce problems on local wikis than to be able to inspect them in-place where they're reported, on production wikis. Moreover, a lot of our JavaScript is put into place by sysops on individual wikis, who mostly don't have access to test wikis. It's not practical to just say "don't give an option to view the readable source code". Yes, this is indeed a peculiarity of web development compared to other types of software development, but that's the facts on the ground.
Ok, I see. So this is why "&minify=0" is needed.
off-topic: humm... seems /. uses this stuff... <script src="//images.slashdot.org/all-minified.js?T_2_5_0_223" type="text/javascript"></script>
-- ℱin del ℳensaje.
Gregory Maxwell wrote:
On Sun, Oct 5, 2008 at 12:15 PM, howard chen howachen@gmail.com wrote:
The result is quite surprising: Grade F (47). Of course a lower mark does not always mean bad, but there is some room for improvement, e.g.
[snip]
- Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
Probably pointless. It's small enough already that the load time is going to be latency bound for any user not sitting inside a Wikimedia data center. On ones which are above the latency bound window (of roughly 8k), gzipping should get them back under it.
Minification can actually decrease sizes significantly even with gzipping. Particularly for low-bandwidth and mobile use this could be a serious plus.
The big downside of minification, of course, is that it makes it harder to read and debug the code.
- Enable GZip compression (e.g.
The page text is gzipped. CSS/JS are not. Many of the CSS/JS are small enough that gzipping would not be a significant win (see above), but I don't recall the reason the CSS/JS are not. Is there a client compatibility issue here?
CSS/JS generated via MediaWiki are gzipped. Those loaded from raw files are not, as the servers aren't currently configured to do that.
- Add expire header (e.g.
http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
Hm. There are expire headers on the skin provided images, but not ones from upload. It does correctly respond with 304 not modified, but a not-modified is often as time consuming as sending the image. Firefox doesn't IMS these objects every time in any case.
The primary holdup for serious expires headers on file uploads is not having unique per-version URLs. With a far-future expires header, things get horribly confusing when a file has been replaced, but everyone still sees the old cached version.
Anyway, these are all known issues.
Possible remedies for CSS/JS files:
* Configure Apache to compress them on the fly (probably easy)
* Pre-minify them and have Apache compress them on the fly (not very hard)
* Run them through MediaWiki to compress them (slightly harder)
* Run them through MediaWiki to compress them *and* minify them *and* merge multiple files together to reduce the number of requests (funk-ay!)

Possible remedies for better caching of image URLs:
* Stick a version number on the URL in a query string (probably easy -- grab the timestamp from the image metadata and toss it on the url?)
* Store files with unique filenames per version (harder since it requires migrating files around, but something I'd love us to do)
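For the query-string variant, a rough sketch; the helper name and paths are made up, and a real implementation would read the timestamp from the image table rather than the local filesystem:

<?php
// Hypothetical helper: append a version to the image URL so a far-future
// Expires header becomes safe; reuploading changes the version and thus
// the URL clients fetch.
function versionedImageUrl( $baseUrl, $localPath ) {
    $ts = file_exists( $localPath ) ? filemtime( $localPath ) : 0;
    return $baseUrl . '?' . $ts;
}

echo versionedImageUrl(
    'http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png',
    '/mnt/upload/wikipedia/en/9/9d/Commons-logo-31px.png'  // placeholder path
), "\n";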
-- brion
Brion Vibber wrote:
* Store files with unique filenames per version (harder since it requires migrating files around, but something I'd love us to do)
Wouldn't rollbacks waste space? I would've thought you'd use content addressing to store all the images, with the content address in the URL?
Unless there is versioned metadata associated with images that would affect how it's sent to the client.
Jared
Jared Williams wrote:
- Store files with unique filenames per version (harder since
it requires migrating files around, but something I'd love us to do)
Wouldn't rollbacks waste space?
Not if we follow the 2006 restructuring plan: http://www.mediawiki.org/wiki/FileStore
Storing the same file version multiple times would not require any additional filesystem space, just the extra DB row w/ the versioning info.
-- brion
Brion Vibber wrote:
The big downside of minification, of course, is that it makes it harder to read and debug the code.
Debugging is needed by what, 0.01% of all users? I think that the &minify=0 suggestion for debugging is a very good solution. But Tim said that isn't going to happen...
CSS/JS generated via MediaWiki are gzipped. Those loaded from raw files are not, as the servers aren't currently configured to do that.
Talking about gzipping, something a little off-topic: Currently, the parsed output in the object cache is gzipped, and MediaWiki has to unzip it, insert it into the Monobook skin, then gzip again (in PHP zlib output compression). Did you consider taking a shortcut, and sending the compressed parser output directly from the cache to the client, compressing only the surroundings created by the skin?
- Stick a version number on the URL in a query string (probably easy --
grab the timestamp from the image metadata and toss it on the url?)
I think it is better to have a fragment (like, the first 32-bits) of the SHA1, so that rollbacks preserve the already-cached versions.
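A sketch of what that could look like; the path is a placeholder:

<?php
// A short content-based version tag: re-uploading identical bytes (a
// rollback) maps back to a URL that is already cached.
$path = '/mnt/upload/wikipedia/en/9/9d/Commons-logo-31px.png';
$tag  = substr( sha1_file( $path ), 0, 8 ); // first 32 bits = 8 hex digits
echo "http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png?$tag\n";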
Juliano F. Ravasi wrote:
Talking about gzipping, something a little off-topic: Currently, the parsed output in the object cache is gzipped, and MediaWiki has to unzip it, insert it into the Monobook skin, then gzip again (in PHP zlib output compression). Did you consider taking a shortcut, and sending the compressed parser output directly from the cache to the client, compressing only the surroundings created by the skin?
That sounds like some scary gzip voodoo. :) gzip is pretty cheap (especially compared to all the surrounding network time); playing games like this is much more likely to break, assuming it even works at all.
- Stick a version number on the URL in a query string (probably easy --
grab the timestamp from the image metadata and toss it on the url?)
I think it is better to have a fragment (like, the first 32-bits) of the SHA1, so that rollbacks preserve the already-cached versions.
Doable, but wouldn't be a huge % of bandwidth probably.
-- brion
howard chen wrote:
Hello all,
As you might already know, YSlow is a tool to check website performance. I just ran a test against http://en.wikipedia.org/wiki/Main_Page
The result is quite surprising: Grade F (47). Of course a lower mark does not always mean bad, but there is some room for improvement, e.g.
- Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
We've discussed this already. It's not happening.
- Enable GZip compression (e.g.
Yes this is possible. But there are two ways of doing it and Brion thinks it should be done the hard way ;)
- Add expire header (e.g.
http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
You know this is a wiki, right?
- Don't put CSS outside the <head />
You mean style attributes? That's an editorial issue.
Doing this should save some money on bandwidth, as well as provide a better user experience.
Are you saying we're slow?
-- Tim Starling
On Sun, Oct 5, 2008 at 8:16 PM, Tim Starling tstarling@wikimedia.org wrote:
- Add expire header (e.g.
http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
You know this is a wiki, right?
Clearly we'd need to use a versioned URL to do it, at least for widely-used images.
Are you saying we're slow?
I do regularly get pages taking ten to thirty seconds to load, but I don't think it has much to do with this sort of front-end optimization.
The result is quite surprising: Grade F (47). Of course a lower mark does not always mean bad, but there is some room for improvement, e.g.
did it mention we don't use CDN too?
On Tue, Oct 7, 2008 at 6:26 PM, Domas Mituzas midom.lists@gmail.com wrote:
did it mention we don't use CDN too?
Yep. Do we/should we? (We have the Squids in Amsterdam and Seoul, do those count?)