wikipedia is one of the slower sites on the web

jidanni＠jidanni.org

28 Jul 2010

9:13 p.m.

Seems to me playing the role of the average dumb user, that en.wikipedia.org is one of the rather slow websites of the many websites I browse.

No matter what browser, it takes more seconds from the time I click on a link to the time when the first bytes of the HTTP response start flowing back to me.

Seems facebook is more zippy.

Maybe Mediawiki is not "optimized".


Aryeh Gregor

28 Jul

9:37 p.m.

On Wed, Jul 28, 2010 at 3:13 PM, jidanni@jidanni.org wrote:

...

Seems to me playing the role of the average dumb user, that en.wikipedia.org is one of the rather slow websites of the many websites I browse.

No matter what browser, it takes more seconds from the time I click on a link to the time when the first bytes of the HTTP response start flowing back to me.

Seems facebook is more zippy.

Maybe Mediawiki is not "optimized".

Is this logged in or not? If you're not logged in, you should be hitting the Squid cache most of the time, and we should be about as fast as anyone with similar RTT. But you might easily be farther from the nearest Wikipedia server than from the nearest Facebook server. And if you're logged in, I'm betting we're much less optimized -- certainly if you have unusual parser preferences (which I'm sure you do), so you miss the parser cache regularly.

Strainu

29 Jul

10:07 p.m.

...

And if you're logged in, I'm betting we're much less optimized -- certainly if you have unusual parser preferences (which I'm sure you do), so you miss the parser cache regularly.

Could you please elaborate on that? Thanks.

Andrei

Domas Mituzas

10:12 p.m.

...

Could you please elaborate on that? Thanks.

we don't have large blinking red lights when people deviate with their parser cache settings - that makes them miss the cache and each pageview is slow.

Domas

Aryeh Gregor

10:23 p.m.

On Thu, Jul 29, 2010 at 4:07 PM, Strainu strainu10@gmail.com wrote:

...

Could you please elaborate on that? Thanks.

When pages are parsed, the parsed version is cached, since parsing can take a long time (sometimes > 10 s). Some preferences change how pages are parsed, so different copies need to be stored based on those preferences. If these settings are all default for you, you'll be using the same parser cache copies as anonymous users, so you're extremely likely to get a parser cache hit. If any of them is non-default, you'll only get a parser cache hit if someone with your exact parser-related preferences viewed the page since it was last changed; otherwise it will have to reparse the page just for you, which will take a long time.
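The fragmentation described above can be sketched roughly as follows. This is a Python illustration, not MediaWiki code: the class name, the preference names and the default values are all hypothetical, and the real cache lives in memcached rather than in a dict.

```python
from typing import Callable, Dict, Tuple

# Hypothetical preference names and defaults, for illustration only.
DEFAULT_PREFS = {"math": "default", "numberheadings": False, "thumbsize": "default"}


class ParserCacheSketch:
    """Parsed pages keyed by (page id, parser-affecting preferences)."""

    def __init__(self, parse: Callable[[int, dict], str]) -> None:
        self._parse = parse
        self._store: Dict[Tuple[int, tuple], str] = {}
        self.misses = 0

    def get(self, page_id: int, prefs: dict) -> str:
        # Two users share an entry only if every parser-related preference matches.
        key = (page_id, tuple(sorted(prefs.items())))
        if key not in self._store:
            self.misses += 1  # slow path: the page is parsed just for this key
            self._store[key] = self._parse(page_id, prefs)
        return self._store[key]


cache = ParserCacheSketch(lambda page_id, prefs: f"<parsed page {page_id}>")
cache.get(1, DEFAULT_PREFS)                              # anonymous view: miss, fills the cache
cache.get(1, dict(DEFAULT_PREFS))                        # logged in, all defaults: hit
cache.get(1, {**DEFAULT_PREFS, "numberheadings": True})  # one non-default setting: miss
```

After these three views the sketch has parsed the page twice: once for the shared default key and once for the single user with a non-default setting.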

This is probably a bad thing. I'd think that most of the settings that fragment the parser cache should be implementable in a post-processing stage, which should be more than fast enough to run on parser cache hits as well as misses. But we don't have such a thing.

Domas Mituzas

10:31 p.m.

...

This is probably a bad thing. I'd think that most of the settings that fragment the parser cache should be implementable in a post-processing stage, which should be more than fast enough to run on parser cache hits as well as misses. But we don't have such a thing.

some of which can be even done with css/js, I guess. I'm all for simplifying whatever processing backend has to do :-)

Domas

Platonides

30 Jul

12:19 a.m.

Domas Mituzas wrote:

...

...
This is probably a bad thing. I'd think that most of the settings that fragment the parser cache should be implementable in a post-processing stage, which should be more than fast enough to run on parser cache hits as well as misses. But we don't have such a thing.

some of which can be even done with css/js, I guess. I'm all for simplifying whatever processing backend has to do :-)

Domas

We have a couple of options: {$edit}{$printable} which do in fact the same (remove the sections edit links), so they could be merged. Additionally, the non-editsection version can be retrieved from the editsectioned one with a preg_replace. So yes, I think it can be simplified without even affecting the poor CSSless users.
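A minimal sketch of deriving the non-editsection version from the cached one, written in Python with re.sub standing in for PHP's preg_replace. The `<span class="editsection">` markup is an assumption about what the parser emits; the pattern would need adjusting to the real output.

```python
import re

# Assumed markup for a section edit link; the real HTML may differ.
EDITSECTION_RE = re.compile(r'<span class="editsection">.*?</span>\s*', re.DOTALL)


def strip_section_edit_links(html: str) -> str:
    """Return the cached HTML with every section edit link removed."""
    return EDITSECTION_RE.sub("", html)


cached = (
    '<h2><span class="editsection">[<a href="/w/index.php?action=edit'
    '&amp;section=1">edit</a>]</span> <span class="mw-headline">History</span></h2>'
)
without_links = strip_section_edit_links(cached)
```

The lazy `.*?` only works here because the assumed edit-link span contains no nested span; a nested structure would need a real HTML pass instead of a regex.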

Alex Brollo

7:26 a.m.

2010/7/30 Platonides Platonides@gmail.com

...

We have a couple of options: {$edit}{$printable} which do in fact the same (remove the sections edit links), so they could be merged. Additionally, the non-editsection version can be retrieved from the editsectioned one with a preg_replace. So yes, I think it can be simplified without even affecting the poor CSSless users.

Perhaps you're saying the same thing I'm about to suggest... My idea is to have a static, very fast version of every page online, which could be the default version for logged-out users; very similar to the CD static version of wiki projects, only adding some trick to switch to the normal, editable, complete, customizable (but slow) version. Obviously this version would have only one copy of any page, with no need to parse it again according to user preferences.

It doesn't matter if this idea is completely foolish; I'm far from an expert!

Alex

Daniel Friesen

8:32 a.m.

Alex Brollo wrote:

...

2010/7/30 Platonides Platonides@gmail.com

...
We have a couple of options: {$edit}{$printable} which do in fact the same (remove the sections edit links), so they could be merged. Additionally, the non-editsection version can be retrieved from the editsectioned one with a preg_replace. So yes, I think it can be simplified without even affecting the poor CSSless users.

Perhaps you're saying the same thing I'm about to suggest... My idea is to have a static, very fast version of every page online, which could be the default version for logged-out users; very similar to the CD static version of wiki projects, only adding some trick to switch to the normal, editable, complete, customizable (but slow) version. Obviously this version would have only one copy of any page, with no need to parse it again according to user preferences.

It doesn't matter if this idea is completely foolish; I'm far from an expert!

Alex

That's pretty much the purpose of the caching servers.

-- ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Alex Brollo

8:42 a.m.

2010/7/30 Daniel Friesen lists@nadir-seen-fire.com

...

That's pretty much the purpose of the caching servers.

Yes, but I presume that a big advantage could come from having a simplified, unique, js-free version of the pages online, completely devoid of "user preferences" to avoid any need to parse it again when uploaded by different users with different preferences profile. Nevertheless I say again: it's only a completely layman idea.

-- Alex

Domas Mituzas

10:13 a.m.

New subject: Caching, was: Re: wikipedia is one of the slower sites on the web

Hi!

...

...
That's pretty much the purpose of the caching servers.

Yes, but I presume that a big advantage could come from having a simplified, unique, js-free version of the pages online, completely devoid of "user preferences" to avoid any need to parse it again when uploaded by different users with different preferences profile. Nevertheless I say again: it's only a completely layman idea.

I can elaborate a bit on what Daniel said.

Whenever anyone edits a page, there are (in a simplistic view) three caches that get populated:

1. Revision text cache (for operations like diffs, re-parsing for other settings, etc.)
2. Parser cache (for logged-in users)
3. Edge HTTP cache - squid (for anonymous users)

So, anonymous users get "completely devoid of user preferences" pages, as they are simply defaults. Do note, even though squid cache objects can vary based on accept-encoding (we narrowed it down to two versions from 10 a few years ago ;-), they map to a single parser cache object.

A parser cache hit may take under 50ms to deliver by backend MediaWiki.

Logged-in users, though, bypass our squids; if they don't mess with their preferences, they usually hit the same parser cache objects - there's an extremely high chance of that. Now, if you change a single setting that affects parser cache variation, the 'extremely high chance' switches to missing those objects - as someone with the same settings as you has to have visited the page before.

Delivering a parser cache miss may take from 50ms to 50s, which is what jidanni probably hits.

So, we may have 1000x slower performance for our users because they don't really know about our caching internals. Our only hope is that most of them are also ignorant that those settings exist ;-)
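As a back-of-the-envelope check on the "1000x" figure, here is a small Python sketch. The 50 ms and 50 s values come from the message above; the hit rates passed to the function are hypothetical, and a miss is priced at the worst case.

```python
HIT_MS = 50             # "under 50ms" for a parser cache hit
WORST_MISS_MS = 50_000  # upper end of the quoted 50ms-50s range for a miss

# Worst-case slowdown of a miss relative to a hit.
worst_case_slowdown = WORST_MISS_MS // HIT_MS


def mean_latency_ms(hit_rate: float, miss_ms: float = WORST_MISS_MS) -> float:
    """Average backend latency for a given parser cache hit rate (0.0 to 1.0)."""
    return hit_rate * HIT_MS + (1.0 - hit_rate) * miss_ms


all_defaults = mean_latency_ms(1.0)  # every view is a hit
never_hits = mean_latency_ms(0.0)    # every view is a worst-case miss
```

Real misses span the whole 50 ms to 50 s range, so the average penalty for a user with unusual settings would sit somewhere below this worst case.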

There'd be of course another workaround - precaching objects for every variation, at extremely high cost for relatively low impact. Alternative is either having warning icon whenever people are in slow-perf mode that they'd be able to hide, or eliminating the choice (you know, the killing features business, that quite often works really well!!! ;-)

Domas

Alex Brollo

10:38 a.m.

New subject: Caching, was: Re: wikipedia is one of the slower sites on the web

2010/7/30 Domas Mituzas midom.lists@gmail.com

...

I can a bit elaborate on what Daniel said.

Whenever anyone edits a page, there're (simplistic view) of three caches that get populated.

1. Revision text cache (for operations like diffs, re-parsing for other settings, etc)
2. Parser cache (for logged in users)
3. Edge HTTP cache - squid (for anonymous users)

So, anonymous users get "completely devoid of user preferences" pages, as they are simply defaults. Do note, even though squid cache objects can vary based on accept-encoding (we narrowed it down to two versions from 10 few years ago ;-), they map to single parser cache object.

[...]

As I said, I'm far from deep into this stuff (I wonder why I'm listed here ;-) if I can understand perhaps 5% of the talk).

So I made a simple test: I logged out of my beloved it.source, thus dropping all my CSS and JS tricks, and I reloaded the page, then reloaded it again.

My browser, while reloading, is forced to get data from a dozen different URLs: it runs from it.source to en.source to bits.wikimedia.org, then again and again, here and there... Time needed to reload a simple, very simple web page: 12 s.

I guess that if a plain HTML + CSS cached version of the page (without any default JS and perhaps with a single, included CSS section) could be found, that time would be much shorter.

Alex

Domas Mituzas

11:10 a.m.

New subject: Caching, was: Re: wikipedia is one of the slower sites on the web

Hi!

Do note, once you log out, you still have a cookie that prohibits edge caching, I think ;-)

...

My browser, while reloading,

Don't hit the "reload" button; that's what it does - it reloads all assets.

...

bits.wikimedia.org then again and then again here and there.... needed time to reload a simple, very simple web page: 12 s.

What kind of connection do you have? On a simple eastern european dsl I get under 1s rendering times.

...

I guess, that if a plain html + css cached version (without any default js and perhaps with a single, included css section) of the page could be found, such a time would be much shorter.

Though indeed we could do some more work on first-load performance, which is what you are measuring with 'reload', it may not be an absolute priority, as no skin assets are loaded on any subsequent page views.

Domas

Alex Brollo

11:23 a.m.

New subject: Caching, was: Re: wikipedia is one of the slower sites on the web

2010/7/30 Domas Mituzas midom.lists@gmail.com

...

What kind of connection do you have? On a simple eastern european dsl I get under 1s rendering times.

A much slower one (here where I'm testing times) by a proxy. :-(

-- Alex

Aryeh Gregor

6:18 p.m.

New subject: Caching, was: Re: wikipedia is one of the slower sites on the web

On Fri, Jul 30, 2010 at 4:13 AM, Domas Mituzas midom.lists@gmail.com wrote:

...

So, we may have 1000x slower performance for our users because they don't really know about our caching internals. Our only hope is that most of them are also ignorant that those settings exist ;-)

There'd be of course another workaround - precaching objects for every variation, at extremely high cost for relatively low impact. Alternative is either having warning icon whenever people are in slow-perf mode that they'd be able to hide, or eliminating the choice (you know, the killing features business, that quite often works really well!!! ;-)

Or we could just store an intermediate form in the parser cache, and apply the settings afterwards. For instance, one preference is "enable section edit links". If instead of outputting HTML, the parser stuck a string like "\001SECTIONEDIT1\001" where the first section edit link goes, we could do preg_replace('/\001SECTIONEDIT(\d+)\001/', $replacement, $page), where $replacement = 'blah blah blah $1 blah blah blah' or '' according to user preference. Then we could use the same parser cache for everyone. I think almost all if not all the parser-changing prefs could be implemented this way, preg_replace_callback() at worst.

So we don't have to remove features, probably. In fact, we can even add features, like {{USERNAME}}. It wouldn't work for {{#ifeq:{{USERNAME}}|Simetrical|You're awesome!|}} or anything, but fine for "Hello, {{USERNAME}}, welcome to Wikipedia!" As long as we keep it down to preg_replace(), or better yet require it to be one big single-pass strtr() for all such settings, it should have no noticeable performance impact even if we add lots and lots of features like this.
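The post-processing idea can be sketched like this. It is a Python illustration with re.sub standing in for preg_replace and a one-pass alternation standing in for strtr(); the function names, the rendered link HTML and the "\x01USERNAME\x01" marker are hypothetical.

```python
import re
from typing import Dict

# Marker of the form \x01SECTIONEDIT<n>\x01, as proposed in the message.
SECTION_MARKER_RE = re.compile("\x01SECTIONEDIT(\\d+)\x01")


def apply_section_edit_pref(cached_html: str, show_edit_links: bool) -> str:
    """Expand or drop section edit markers according to one user preference."""
    def render(match: "re.Match[str]") -> str:
        if not show_edit_links:
            return ""
        # Hypothetical link markup; only the section number comes from the marker.
        return f'<span class="editsection">[edit section {match.group(1)}]</span>'
    return SECTION_MARKER_RE.sub(render, cached_html)


def apply_simple_placeholders(cached_html: str, values: Dict[str, str]) -> str:
    """Replace fixed marker strings in a single pass, roughly like strtr()."""
    if not values:
        return cached_html
    # Longest markers first, so a marker never loses to one of its own prefixes.
    keys = sorted(values, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: values[m.group(0)], cached_html)


cached = "<h2>\x01SECTIONEDIT1\x01History</h2><p>Hello, \x01USERNAME\x01!</p>"
page = apply_section_edit_pref(cached, show_edit_links=False)
page = apply_simple_placeholders(page, {"\x01USERNAME\x01": "Simetrical"})
```

Because the substitutions run on the cached output, every user shares one parser cache entry and only this cheap string pass differs per request.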

David Goodman

7:42 p.m.

New subject: Caching, was: Re: wikipedia is one of the slower sites on the web

Which of the preference settings are likely to cause this problem?

On Fri, Jul 30, 2010 at 12:18 PM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:

...

On Fri, Jul 30, 2010 at 4:13 AM, Domas Mituzas midom.lists@gmail.com wrote:

...
So, we may have 1000x slower performance for our users because they don't really know about our caching internals. Our only hope is that most of them are also ignorant that those settings exist ;-)

There'd be of course another workaround - precaching objects for every variation, at extremely high cost for relatively low impact. Alternative is either having warning icon whenever people are in slow-perf mode that they'd be able to hide, or eliminating the choice (you know, the killing features business, that quite often works really well!!! ;-)

Or we could just store an intermediate form in the parser cache, and apply the settings afterwards. For instance, one preference is "enable section edit links". If instead of outputting HTML, the parser stuck a string like "\001SECTIONEDIT1\001" where the first section edit link goes, we could do preg_replace('/\001SECTIONEDIT(\d+)\001/', $replacement, $page), where $replacement = 'blah blah blah $1 blah blah blah' or '' according to user preference. Then we could use the same parser cache for everyone. I think almost all if not all the parser-changing prefs could be implemented this way, preg_replace_callback() at worst.

So we don't have to remove features, probably. In fact, we can even add features, like {{USERNAME}}. It wouldn't work for {{#ifeq:{{USERNAME}}|Simetrical|You're awesome!|}} or anything, but fine for "Hello, {{USERNAME}}, welcome to Wikipedia!" As long as we keep it down to preg_replace(), or better yet require it to be one big single-pass strtr() for all such settings, it should have no noticeable performance impact even if we add lots and lots of features like this.


-- David Goodman, Ph.D, M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG

Platonides

31 Jul

1:18 a.m.

New subject: Caching, was: Re: wikipedia is one of the slower sites on the web

David Goodman wrote:

...

Which of the preference settings are likely to cause this problem?

The preferences that can -combined- make you unique[1] are: math preference, date format, auto-number headings, user language, image thumbnail size.

Interestingly, the stub threshold supposedly does not affect it [2].

Additionally, being or not in the secure server acts as another preference.

[1] Taken from User::getPageRenderingHash()
[2] That is what a comment there says, but take it with a grain of salt.

jidanni＠jidanni.org

1 Aug

4:04 a.m.

New subject: Caching, was: Re: wikipedia is one of the slower sites on the web

OK man, you've got me committed to the spirit of using as many default settings as I can stand.

P> math preference, date format, auto-number headings, user language, image
P> thumbnail size.

Which brings up the question "Which of (these and) the rest of my settings are not the default?"

Any easy way (for the average non-insider Junior User, not me with the source code) to tell in one click?

"Can I be allowed to examine a list and pick and choose which ones I want to restore the defaults of?"

Or must he hit the Slaughterhouse Button https://bugzilla.wikimedia.org/show_bug.cgi?id=17188

Platonides

10 Aug

12:10 a.m.

New subject: Caching, was: Re: wikipedia is one of the slower sites on the web

On 31/07/10, Platonides wrote:

...

David Goodman wrote:

...
Which of the preference settings are likely to cause this problem?

The preferences that can -combined- make you unique[1] are: math preference, date format, auto-number headings, user language, image thumbnail size.

Interestingly, the stub threshold supposedly does not affect it [2].

Additionally, being or not in the secure server acts as another preference.

I have just committed r70783, which makes you unique only if you are using non-default preferences *that are actually used* on that page.

So having a non-default math option no longer makes you unique on the thousands of pages without <math> tags.
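The idea behind that change can be sketched as follows. This is a Python illustration and not the code from r70783: the function name, the option names and the key format are hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical option names and default values.
DEFAULTS: Dict[str, object] = {"math": "default", "dateformat": "default", "numberheadings": False}


def cache_key(page_id: int, user_prefs: Dict[str, object], used_options: Iterable[str]) -> str:
    """Build a key from only those options the page's rendering actually consulted."""
    parts = [str(page_id)]
    for name in sorted(used_options):
        parts.append(f"{name}={user_prefs.get(name, DEFAULTS[name])}")
    return "!".join(parts)


default_user: Dict[str, object] = {}
math_tweaker: Dict[str, object] = {"math": "png"}

# A page without <math> never consults the math option, so both users share a key.
same_key = cache_key(42, default_user, []) == cache_key(42, math_tweaker, [])
# A page that does use <math> still varies on it.
different_key = cache_key(43, default_user, ["math"]) != cache_key(43, math_tweaker, ["math"])
```

The sketch assumes the set of consulted options is recorded while the page is parsed and stored alongside the cached output, so later lookups know which preferences matter.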

Reviews and actual numbers of parser cache increase welcome.

Aryeh Gregor

30 Jul

5:45 p.m.

On Fri, Jul 30, 2010 at 2:42 AM, Alex Brollo alex.brollo@gmail.com wrote:

...

Yes, but I presume that a big advantage could come from having a simplified, unique, js-free version of the pages online, completely devoid of "user preferences" to avoid any need to parse it again when uploaded by different users with different preferences profile.

This is exactly what we have when you're logged out. The request goes to a Squid, and it serves a static cached file, no dynamic bits (if it's already cached). When you log in, it can't be static, because we display your name in the upper right, etc.

On Fri, Jul 30, 2010 at 4:49 AM, John Vandenberg jayvdb@gmail.com wrote:

...

Could we add a logged-in-reader mode, for people who are infrequent contributors but wish to be logged in for the prefs.

As soon as you're logged in, you're missing Squid cache, because we have to add your name to the top, attach your user CSS/JS, etc. You can't be served the same HTML as an anonymous user. If you want to be served the same HTML as an anonymous user, log out.

Fortunately, the major slowdown is parser cache misses, not Squid cache misses. To avoid parser cache misses, just make sure you don't change parser-affecting preferences to non-default values. (We don't say which these are, of course . . .)

...

They could be served a slightly old cached version of the page when one is available for their prefs. e.g. if the cached version is less than a minute old.

That would make no difference. If you've fiddled with your preferences nontrivially, there's a good chance that not a single other user has the exact same preferences, so you'll only hit the parser cache if you yourself have viewed the page recently. For instance, if you set your stub threshold to 357 bytes, you'll never hit anyone else's cache (unless someone else has that exact stub threshold). Even if you just fiddle with on/off options, there are several, and the number of combinations is exponential.
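To put a number on "exponential": the count of distinct parser cache variants is the product of the number of choices for each parser-affecting option. The option counts below are hypothetical and only illustrate the growth.

```python
from math import prod
from typing import Sequence


def variant_count(choices_per_option: Sequence[int]) -> int:
    """Number of distinct preference combinations, and so of possible cache variants."""
    return prod(choices_per_option)


five_toggles = variant_count([2] * 5)       # five on/off options
ten_toggles = variant_count([2] * 10)       # ten on/off options
with_free_field = variant_count([2] * 5 + [1000])  # plus one field with 1000 distinct values in use
```

A single free-form numeric field, such as a stub threshold that accepts any byte count, multiplies the total by however many distinct values users pick.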

Moreover, practically no page changes anywhere close to once per minute. If the threshold is set that low, you'll essentially never get extra parser cache hits. On the other hand, extra infrastructure will be needed to keep around stale parser cache entries, so it's a clear overall loss.

...

The down side is that if they see an error, it may already be fixed. OTOH, if the page is being revised frequently, the same is likely to happen anyway. The text could be stale before it hits the wire due to parsing delay.

However, in that case everyone will see the new contents at more or less the same time -- it won't be inconsistent.

John Vandenberg

7:32 p.m.

On Sat, Jul 31, 2010 at 1:45 AM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:

...

On Fri, Jul 30, 2010 at 4:49 AM, John Vandenberg jayvdb@gmail.com wrote:

...
Could we add a logged-in-reader mode, for people who are infrequent contributors but wish to be logged in for the prefs.

...

Fortunately, the major slowdown is parser cache misses, not Squid cache misses. To avoid parser cache misses, just make sure you don't change parser-affecting preferences to non-default values. (We don't say which these are, of course . . .)

So you're telling my theoretical logged-in-reader to use default prefs, or log out, when the reason they are a logged-in-reader is so they can control their preferences..!

...

...
They could be served a slightly old cached version of the page when one is available for their prefs. e.g. if the cached version is less than a minute old.

That would make no difference. If you've fiddled with your preferences nontrivially, there's a good chance that not a single other user has the exact same preferences, so you'll only hit the parser cache if you yourself have viewed the page recently. For instance, if you set your stub threshold to 357 bytes, you'll never hit anyone else's cache (unless someone else has that exact stub threshold). Even if you just fiddle with on/off options, there are several, and the number of combinations is exponential.

Someone who sets their stub threshold to 357 is their own performance enemy.

Surely there are a few common 'preference sets' which large numbers of readers use?

How many people only look at the front page in the morning, and jump to a few pages from there..?

...

Moreover, practically no page changes anywhere close to once per minute. If the threshold is set that low, you'll essentially never get extra parser cache hits. On the other hand, extra infrastructure will be needed to keep around stale parser cache entries, so it's a clear overall loss.

There are plenty of pages which change more than once per minute, however I'd expect a much higher threshold, variable based on the volume of page activity, or some other mechanism to determine whether the cached version is acceptably stale for the logged-in-reader.

There is no infrastructure required for extra stale entries. If the viewer is happy to accept the slightly stale revision for their chosen prefs, serve it. If not, reparse.

...

...
The down side is that if they see an error, it may already be fixed. OTOH, if the page is being revised frequently, the same is likely to happen anyway. The text could be stale before it hits the wire due to parsing delay.

However, in that case everyone will see the new contents at more or less the same time -- it won't be inconsistent.

Not on frequently changing pages. Many edits can occur while I am pulling the page down the wire. I then need to read the page to find this error.

-- John Vandenberg

Aryeh Gregor

1 Aug

8:34 p.m.

On Fri, Jul 30, 2010 at 1:32 PM, John Vandenberg jayvdb@gmail.com wrote:

...

So you're telling my theoretical logged-in-reader to use default prefs, or log out, when the reason they are a logged-in-reader is so they can control their preferences..!

Yep. You want features, you often pay a performance penalty. In this case the performance penalty should be reducible, or at least clearly marked, but that's a general rule anyway.

...

Surely there are a few common 'preference sets' which large numbers of readers use?

Changing any parser-related preference will kill page load times.

...

There are plenty of pages which change more than once per minute,

No pages change once per minute on average. That would be 1440 edits per day, or more than 500,000 per year. Only one page on enwiki (WP:AIAV) has more than 500,000 edits *total*, let alone per year. There were only 18 edits to WP:ANI between 17:00 and 18:00 today, just for example, which is less than one edit every three minutes. There are some times when a particular page changes many times in a minute -- like when a major event occurs and everyone rushes to update an article -- but these are rare and don't last long.
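The arithmetic in that paragraph, spelled out as a quick Python check using only the figures quoted in the message:

```python
# One edit per minute, sustained.
edits_per_day = 60 * 24
edits_per_year = edits_per_day * 365

# 18 edits to WP:ANI in one hour.
minutes_between_ani_edits = 60 / 18

print(edits_per_day)                        # 1440
print(edits_per_year)                       # 525600
print(round(minutes_between_ani_edits, 2))  # 3.33
```

So a sustained one-edit-per-minute page would pass 500,000 edits in under a year, and the busy noticeboard in the example sees an edit roughly every three and a third minutes.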

You also seem to be missing how many different possible parser cache keys there are. It's not like there are only five or ten possible versions. As I said before -- if you change your parser-related settings around a bunch, you will probably rarely or never hit parser cache except when you yourself viewed the page since it last changed. There are too many possible permutations of settings here.

...

however I'd expect a much higher threshold, variable based on the volume of page activity, or some other mechanism to determine whether the cached version is acceptably stale for the logged-in-reader.

There is no infrastructure required for extra stale entries. If the viewer is happy to accept the slightly stale revision for their chosen prefs, serve it. If not, reparse.

Look, this is just not a useful solution, period. It would be extremely ineffective. If you extended the permitted staleness level so much that it would be moderately effective, it would be useless, because you'd be seeing hours- or days-old articles. On the other hand, for a comparable amount of effort you could implement a solution that actually is effective, like adding an extra postprocessing stage.

On Fri, Jul 30, 2010 at 8:22 PM, jidanni@jidanni.org wrote:

...

Hmmm, maybe they're there amongst the "!"s below.
$ lynx --source http://en.wikipedia.org/wiki/Main_Page | grep parser
Expensive parser function count: 44/500

Yes. That key is generated by the following line in includes/parser/ParserCache.php:

$key = wfMemcKey( 'pcache', 'idhash', "{$pageid}-{$renderkey}!{$hash}{$edit}{$printable}" );

The relevant bit of that, for us, is $hash, which is generated by getPageRenderingHash() in includes/User.php:

// stubthreshold is only included below for completeness,
// it will always be 0 when this function is called by parsercache.

$confstr = $this->getOption( 'math' );
$confstr .= '!' . $this->getOption( 'stubthreshold' );
if ( $wgUseDynamicDates ) {
    $confstr .= '!' . $this->getDatePreference();
}
$confstr .= '!' . ( $this->getOption( 'numberheadings' ) ? '1' : '' );
$confstr .= '!' . $wgLang->getCode();
$confstr .= '!' . $this->getOption( 'thumbsize' );
// add in language specific options, if any
$extra = $wgContLang->getExtraHashOptions();
$confstr .= $extra;

So anonymous users on enwiki have math=3, stubthreshold=0 (although the comment indicates this is irrelevant somehow), date preferences = 'default', numberheadings = 1, language = 'en', thumbsize = 4. Changing any of those from the default will make you miss the parser cache on enwiki.
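To make the key layout concrete, here is a Python transcription of the string-building logic quoted above. The function and parameter names are my own, and the example values are simply the ones this message gives for anonymous enwiki users; only the final component of the memcached key is built.

```python
def page_rendering_hash(math, stubthreshold, date_pref, numberheadings,
                        lang, thumbsize, use_dynamic_dates=True, extra=""):
    """Mirror the '!'-joined confstr built by the quoted PHP snippet."""
    parts = [str(math), str(stubthreshold)]
    if use_dynamic_dates:
        parts.append(date_pref)
    # numberheadings contributes '1' when set and an empty string otherwise.
    parts.append("1" if numberheadings else "")
    parts.append(lang)
    parts.append(str(thumbsize))
    return "!".join(parts) + extra


def parser_cache_key_suffix(page_id, render_key, rendering_hash, edit="", printable=""):
    """Last component of the key, following the quoted wfMemcKey() call."""
    return f"{page_id}-{render_key}!{rendering_hash}{edit}{printable}"


# Values as quoted in the message for anonymous users on enwiki.
anon_hash = page_rendering_hash(3, 0, "default", True, "en", 4)
# The page id and render key here are made-up illustration values.
anon_key = parser_cache_key_suffix(12345, 0, anon_hash)
```

Any change to one of the inputs changes the hash, and with it the key, which is why a single non-default setting lands the user on a different cache entry.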

On Sat, Jul 31, 2010 at 12:58 PM, Daniel Kinzler daniel@brightbyte.de wrote:

...

This is a few years old, but I guess it's still relevant: http://brightbyte.de/page/Client-side_skins_with_XSLT I experimented a bit with ways to do all the per-user preference stuff on the client side, with XSLT.

XSLT seems a bit baroque. If the goal is to use script to avoid cache misses, why not just use plain old JavaScript? A lot more people know it, it supports progressive rendering (does XSLT?), and it's much better supported. In particular, your approach of serving something other than HTML and relying on XSLT support to transform it will seriously confuse text browsers, search engines, etc.

Platonides

10:18 p.m.

Aryeh Gregor wrote:

...

Look, this is just not a useful solution, period. It would be extremely ineffective. If you extended the permitted staleness level so much that it would be moderately effective, it would be useless, because you'd be seeing hours- or days-old articles. On the other hand, for a comparable amount of effort you could implement a solution that actually is effective, like adding an extra postprocessing stage.

Yes, I have some ideas on how to improve it.

...

On Fri, Jul 30, 2010 at 1:32 PM, John Vandenberg jayvdb@gmail.com wrote:

Someone who sets their stub threshold to 357 is their own performance enemy.

In fact, setting the stub threshold to anything disables the parser cache. You can only hit it when it is set to 0.

Aryeh, can you do some statistics about the frequency of the different stub thresholds? Perhaps restricted to people who edited this year, to discard unused accounts.

Roan Kattouw

10:43 p.m.

2010/8/1 Platonides Platonides@gmail.com:

...

Aryeh, can you do some statistics about the frequency of the different stub thresholds? Perhaps restricted to people who edited this year, to discard unused accounts.

He can't, but I can. I ran a couple of queries and put the result at http://www.mediawiki.org/wiki/User:Catrope/Stub_threshold

Roan Kattouw (Catrope)

Chad

10:46 p.m.

On Sun, Aug 1, 2010 at 1:43 PM, Roan Kattouw roan.kattouw@gmail.com wrote:

...

2010/8/1 Platonides Platonides@gmail.com:

...
Aryeh, can you do some statistics about the frequency of the different stub thresholds? Perhaps restricted to people who edited this year, to discard unused accounts.

He can't, but I can. I ran a couple of queries and put the result at http://www.mediawiki.org/wiki/User:Catrope/Stub_threshold

Isn't stub threshold a *reading* preference? It wouldn't be unreasonable to assume that someone could have that preference set and not regularly edit.

Also doesn't take into account people who haven't changed their preferences in a long time (and thus aren't in user_props yet)

-Chad

Aryeh Gregor

10:55 p.m.

On Sun, Aug 1, 2010 at 4:43 PM, Roan Kattouw roan.kattouw@gmail.com wrote:

...

He can't, but I can. I ran a couple of queries and put the result at http://www.mediawiki.org/wiki/User:Catrope/Stub_threshold

I can too -- I'm a toolserver root, so I have read-only access to pretty much the whole database (minus some omitted databases/tables/columns, mainly IP addresses and maybe private wikis). But no need, since you already did it. :) The data isn't complete because not all users have been ported to user_properties, right?

One easy hack to reduce this problem is just to only provide a few options for stub threshold, as we do with thumbnail size. Although this is only useful if we cache pages with nonzero stub threshold . . . why don't we do that? Too much fragmentation due to the excessive range of options?

Roan Kattouw

11:03 p.m.

2010/8/1 Aryeh Gregor Simetrical+wikilist@gmail.com:

...

On Sun, Aug 1, 2010 at 4:43 PM, Roan Kattouw roan.kattouw@gmail.com wrote:

...
He can't, but I can. I ran a couple of queries and put the result at http://www.mediawiki.org/wiki/User:Catrope/Stub_threshold

I can too -- I'm a toolserver root, so I have read-only access to pretty much the whole database (minus some omitted databases/tables/columns, mainly IP addresses and maybe private wikis).

Ah yes, I forgot about that. I was assuming you'd need access to the live DB for this.

...

But no need, since you already did it. :) The data isn't complete because not all users have been ported to user_properties, right?

I don't know. Cursory inspection seems to indicate user_properties is relatively complete, but comprehensive count queries are too slow for me to dare run them on the cluster. Maybe you could run something along the lines of SELECT COUNT(DISTINCT up_user) FROM user_properties; on the toolserver and compare it with SELECT COUNT(*) FROM user;
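The comparison suggested above can be tried on a toy database. The sketch below uses Python's sqlite3 with a made-up four-user data set; the two SELECT statements are the ones from the message, while the trimmed-down table definitions and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Toy schema: only the columns this comparison needs.
    CREATE TABLE user (user_id INTEGER PRIMARY KEY);
    CREATE TABLE user_properties (up_user INTEGER, up_property TEXT, up_value TEXT);
    INSERT INTO user VALUES (1), (2), (3), (4);
    INSERT INTO user_properties VALUES
        (1, 'stubthreshold', '0'),
        (1, 'math', '1'),
        (3, 'stubthreshold', '500');
""")

# Users that have at least one row in user_properties.
users_with_props = conn.execute(
    "SELECT COUNT(DISTINCT up_user) FROM user_properties"
).fetchone()[0]

# All registered users.
total_users = conn.execute("SELECT COUNT(*) FROM user").fetchone()[0]

coverage = users_with_props / total_users
conn.close()
```

A low ratio would not by itself prove the table is incomplete, since users who never changed a preference may legitimately have no rows at all.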

...

One easy hack to reduce this problem is just to only provide a few options for stub threshold, as we do with thumbnail size. Although this is only useful if we cache pages with nonzero stub threshold . . . why don't we do that? Too much fragmentation due to the excessive range of options?

Maybe; but the fact that the field is present but set to 0 in the parser cache key is very weird. SVN blame should probably be able to tell who did this and hopefully why.

Roan Kattouw (Catrope)

Platonides

11:48 p.m.

Roan Kattouw wrote:

...

...
One easy hack to reduce this problem is just to only provide a few options for stub threshold, as we do with thumbnail size. Although this is only useful if we cache pages with nonzero stub threshold . . . why don't we do that? Too much fragmentation due to the excessive range of options?

Maybe; but the fact that the field is present but set to 0 in the parser cache key is very weird. SVN blame should probably be able to tell who did this and hopefully why.

Roan Kattouw (Catrope)

Look at Article::getParserOutput() to see how $wgUser->getOption( 'stubthreshold' ) is explicitly checked to be 0 before the parser cache is enabled. Note that there are several other entry points to the ParserCache in Article; it's a bit mixed.

Note that we do offer several options, not only the free-text field. I think that the underlying problem is that when changing an article from 98 bytes to 102, we would need to invalidate all pages linking to it for stubthresholds of 100 bytes.
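The 98-to-102-byte case can be sketched as follows. This is a minimal Python illustration, assuming a link is styled as a stub when the target's size is strictly below the reader's threshold; the function names are hypothetical.

```python
def crosses_threshold(old_size: int, new_size: int, threshold: int) -> bool:
    """True if an edit changes the target's stub status for this threshold."""
    if threshold <= 0:
        # A threshold of 0 disables stub styling, so nothing can change.
        return False
    return (old_size < threshold) != (new_size < threshold)


def affected_thresholds(old_size: int, new_size: int, thresholds: list[int]) -> list[int]:
    """List the thresholds whose cached pages would need invalidating."""
    return [t for t in thresholds if crosses_threshold(old_size, new_size, t)]
```

For an edit from 98 to 102 bytes, only the 100-byte threshold is affected, but every page linking to that article would need its 100-byte variant invalidated.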

Since the pages are reparsed, custom values are not a problem now. I think that to cache for the stubthresholds, we would need to cache just before the replaceLinkHolders() and perform the replacement at the user request.

Aryeh Gregor

2 Aug

12:24 a.m.

On Sun, Aug 1, 2010 at 5:03 PM, Roan Kattouw roan.kattouw@gmail.com wrote:

...

I don't know. Cursory inspection seems to indicate user_properties is relatively complete, but comprehensive count queries are too slow for me to dare run them on the cluster. Maybe you could run something along the lines of SELECT COUNT(DISTINCT up_user) FROM user_properties; on the toolserver and compare it with SELECT COUNT(*) FROM user;

That won't work, because it won't count users whose settings are all default. However, we can tell who's switched because user_options will be empty.
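To make the difference between the two counts concrete, here is a small self-contained Python sketch using an in-memory SQLite database. The schema is heavily simplified and the sample rows (including the format of the old options blob) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_options TEXT NOT NULL);
CREATE TABLE user_properties (up_user INTEGER, up_property TEXT, up_value TEXT);
-- user 1: migrated, has one non-default preference
INSERT INTO user VALUES (1, '');
INSERT INTO user_properties VALUES (1, 'stubthreshold', '100');
-- user 2: migrated, all defaults, so no rows in user_properties
INSERT INTO user VALUES (2, '');
-- user 3: not migrated, old blob still populated (illustrative format)
INSERT INTO user VALUES (3, 'stubthreshold=500');
""")

# Counts only users with at least one non-default property: misses user 2.
with_rows = conn.execute(
    "SELECT COUNT(DISTINCT up_user) FROM user_properties"
).fetchone()[0]

# Counts users whose old blob is empty: catches both migrated users.
migrated = conn.execute(
    "SELECT COUNT(*) FROM user WHERE user_options = ''"
).fetchone()[0]

total = conn.execute("SELECT COUNT(*) FROM user").fetchone()[0]
```

Here `with_rows` undercounts the migrated users, while the empty-blob check finds both of them.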

On Sun, Aug 1, 2010 at 5:48 PM, Platonides Platonides@gmail.com wrote:

...

Note that we do offer several options, not only the free-text field. I think that the underlying problem is that when changing an article from 98 bytes to 102, we would need to invalidate all pages linking to it for stubthresholds of 100 bytes.

Aha, that must be it. Any stub threshold would require extra page invalidation, which we don't do because it would be pointlessly expensive. Postprocessing would fix the problem.

...

Since the pages are reparsed, custom values are not a problem now. I think that to cache for the stubthresholds, we would need to cache just before the replaceLinkHolders() and perform the replacement at the user request.

Yep. Or parse further, but leave markers lingering in the output somehow. We don't need to cache the actual wikitext, either way. We just need to cache at some point after all the heavy lifting has been done, and everything that's left can be done in a couple of milliseconds.
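A minimal Python sketch of the "leave markers in the output" variant follows. The marker syntax, the `stub` class name, and the function name are all hypothetical; the point is only that the expensive parse is cached once and the per-user step is a cheap substitution. It assumes a link is a stub when the target size is strictly below the threshold, and it skips HTML escaping for brevity.

```python
import re

# Hypothetical marker left in the cached HTML for each internal link:
#   <!--LINK|<target size in bytes>|<href>|<link text>-->
LINK_MARKER = re.compile(r"<!--LINK\|(\d+)\|([^|]*)\|(.*?)-->")


def render_links(cached_html: str, stub_threshold: int) -> str:
    """Replace link markers with anchors, adding a stub class per user threshold."""

    def replace(match: re.Match) -> str:
        size = int(match.group(1))
        href, text = match.group(2), match.group(3)
        is_stub = stub_threshold > 0 and size < stub_threshold
        cls = ' class="stub"' if is_stub else ""
        return f'<a href="{href}"{cls}>{text}</a>'

    return LINK_MARKER.sub(replace, cached_html)
```

The same cached string can then serve every user, whatever threshold they chose, since the target sizes travel with the markers.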

Aryeh Gregor

1:32 a.m.

On Sun, Aug 1, 2010 at 6:24 PM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:

...

That won't work, because it won't count users whose settings are all default. However, we can tell who's switched because user_options will be empty.

SELECT COUNT(*) FROM user WHERE user_options = '';
+----------+
| COUNT(*) |
+----------+
|  3491404 |
+----------+
1 row in set (10 min 20.11 sec)

SELECT COUNT(*) FROM user;
+----------+
| COUNT(*) |
+----------+
| 12822573 |
+----------+
1 row in set (7 min 47.87 sec)

I.e., only about a quarter of users have been ported to user_properties. Why wasn't a conversion script run here?
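As a quick check of the "about a quarter" figure, the ratio of the two counts can be computed directly (variable names are just for illustration):

```python
migrated = 3491404   # users with an empty user_options blob
total = 12822573     # all rows in the user table

fraction = migrated / total
print(f"{fraction:.1%}")  # prints 27.2%
```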

Domas Mituzas

9:35 a.m.

Hi!

...

I.e., only about a quarter of users have been ported to user_properties. Why wasn't a conversion script run here?

In theory if all properties are at defaults, user shouldn't be there. The actual check should be against the blob field.

Domas

Andrew Garrett

11:22 a.m.

On Mon, Aug 2, 2010 at 5:35 PM, Domas Mituzas midom.lists@gmail.com wrote:

...

Hi!

...
I.e., only about a quarter of users have been ported to user_properties. Why wasn't a conversion script run here?

In theory if all properties are at defaults, user shouldn't be there. The actual check should be against the blob field.

That's what he did. Read the query.

-- Andrew Garrett http://werdn.us/

Domas Mituzas

11:30 a.m.

...

That's what he did. Read the query.

;-) that's what happens when email gets ahead of coffee.

Domas

Lars Aronsson

3 Aug

2:32 a.m.

On 08/01/2010 10:55 PM, Aryeh Gregor wrote:

...

One easy hack to reduce this problem is just to only provide a few options for stub threshold, as we do with thumbnail size. Although this is only useful if we cache pages with nonzero stub threshold . . . why don't we do that? Too much fragmentation due to the excessive range of options?

Couldn't you just tag every internal link with a separate class for the length of the target article, and then use different personal CSS to set the threshold? The generated page would be the same for all users:

<a href="My_Article" class="134_byte_article">My Article</a>

-- Lars Aronsson (lars@aronsson.se) Aronsson Datateknik - http://aronsson.se

Domas Mituzas

12:42 p.m.

Hi!

...

Couldn't you just tag every internal link with a separate class for the length of the target article,

Great idea, how come no one ever came up with this? I even have a stylesheet ready, here it is (do note, even though it looks big in text, gzip gets it down to 10%, so we can support this kind of granularity even up to a megabyte :)

Domas

a { color: blue } a.1_byte_article { color: red; } a.2_byte_article { color: red; } a.3_byte_article { color: red; } a.4_byte_article { color: red; } a.5_byte_article { color: red; } a.6_byte_article { color: red; } a.7_byte_article { color: red; } a.8_byte_article { color: red; } a.9_byte_article { color: red; } a.10_byte_article { color: red; } a.11_byte_article { color: red; } a.12_byte_article { color: red; } a.13_byte_article { color: red; } a.14_byte_article { color: red; } a.15_byte_article { color: red; } a.16_byte_article { color: red; } a.17_byte_article { color: red; } a.18_byte_article { color: red; } a.19_byte_article { color: red; } a.20_byte_article { color: red; } a.21_byte_article { color: red; } a.22_byte_article { color: red; } a.23_byte_article { color: red; } a.24_byte_article { color: red; } a.25_byte_article { color: red; } a.26_byte_article { color: red; } a.27_byte_article { color: red; } a.28_byte_article { color: red; } a.29_byte_article { color: red; } a.30_byte_article { color: red; } a.31_byte_article { color: red; } a.32_byte_article { color: red; } a.33_byte_article { color: red; } a.34_byte_article { color: red; } a.35_byte_article { color: red; } a.36_byte_article { color: red; } a.37_byte_article { color: red; } a.38_byte_article { color: red; } a.39_byte_article { color: red; } a.40_byte_article { color: red; } a.41_byte_article { color: red; } a.42_byte_article { color: red; } a.43_byte_article { color: red; } a.44_byte_article { color: red; } a.45_byte_article { color: red; } a.46_byte_article { color: red; } a.47_byte_article { color: red; } a.48_byte_article { color: red; } a.49_byte_article { color: red; } a.50_byte_article { color: red; } a.51_byte_article { color: red; } a.52_byte_article { color: red; } a.53_byte_article { color: red; } a.54_byte_article { color: red; } a.55_byte_article { color: red; } a.56_byte_article { color: red; } a.57_byte_article { color: red; } a.58_byte_article { color: red; } a.59_byte_article 
{ color: red; } a.60_byte_article { color: red; } a.61_byte_article { color: red; } a.62_byte_article { color: red; } a.63_byte_article { color: red; } a.64_byte_article { color: red; } a.65_byte_article { color: red; } a.66_byte_article { color: red; } a.67_byte_article { color: red; } a.68_byte_article { color: red; } a.69_byte_article { color: red; } a.70_byte_article { color: red; } a.71_byte_article { color: red; } a.72_byte_article { color: red; } a.73_byte_article { color: red; } a.74_byte_article { color: red; } a.75_byte_article { color: red; } a.76_byte_article { color: red; } a.77_byte_article { color: red; } a.78_byte_article { color: red; } a.79_byte_article { color: red; } a.80_byte_article { color: red; } a.81_byte_article { color: red; } a.82_byte_article { color: red; } a.83_byte_article { color: red; } a.84_byte_article { color: red; } a.85_byte_article { color: red; } a.86_byte_article { color: red; } a.87_byte_article { color: red; } a.88_byte_article { color: red; } a.89_byte_article { color: red; } a.90_byte_article { color: red; } a.91_byte_article { color: red; } a.92_byte_article { color: red; } a.93_byte_article { color: red; } a.94_byte_article { color: red; } a.95_byte_article { color: red; } a.96_byte_article { color: red; } a.97_byte_article { color: red; } a.98_byte_article { color: red; } a.99_byte_article { color: red; } a.100_byte_article { color: red; } a.101_byte_article { color: red; } a.102_byte_article { color: red; } a.103_byte_article { color: red; } a.104_byte_article { color: red; } a.105_byte_article { color: red; } a.106_byte_article { color: red; } a.107_byte_article { color: red; } a.108_byte_article { color: red; } a.109_byte_article { color: red; } a.110_byte_article { color: red; } a.111_byte_article { color: red; } a.112_byte_article { color: red; } a.113_byte_article { color: red; } a.114_byte_article { color: red; } a.115_byte_article { color: red; } a.116_byte_article { color: red; } a.117_byte_article { color: 
red; } a.118_byte_article { color: red; } a.119_byte_article { color: red; } a.120_byte_article { color: red; } a.121_byte_article { color: red; } a.122_byte_article { color: red; } a.123_byte_article { color: red; } a.124_byte_article { color: red; } a.125_byte_article { color: red; } a.126_byte_article { color: red; } a.127_byte_article { color: red; } a.128_byte_article { color: red; } a.129_byte_article { color: red; } a.130_byte_article { color: red; } a.131_byte_article { color: red; } a.132_byte_article { color: red; } a.133_byte_article { color: red; } a.134_byte_article { color: red; } a.135_byte_article { color: red; } a.136_byte_article { color: red; } a.137_byte_article { color: red; } a.138_byte_article { color: red; } a.139_byte_article { color: red; } a.140_byte_article { color: red; } a.141_byte_article { color: red; } a.142_byte_article { color: red; } a.143_byte_article { color: red; } a.144_byte_article { color: red; } a.145_byte_article { color: red; } a.146_byte_article { color: red; } a.147_byte_article { color: red; } a.148_byte_article { color: red; } a.149_byte_article { color: red; } a.150_byte_article { color: red; } a.151_byte_article { color: red; } a.152_byte_article { color: red; } a.153_byte_article { color: red; } a.154_byte_article { color: red; } a.155_byte_article { color: red; } a.156_byte_article { color: red; } a.157_byte_article { color: red; } a.158_byte_article { color: red; } a.159_byte_article { color: red; } a.160_byte_article { color: red; } a.161_byte_article { color: red; } a.162_byte_article { color: red; } a.163_byte_article { color: red; } a.164_byte_article { color: red; } a.165_byte_article { color: red; } a.166_byte_article { color: red; } a.167_byte_article { color: red; } a.168_byte_article { color: red; } a.169_byte_article { color: red; } a.170_byte_article { color: red; } a.171_byte_article { color: red; } a.172_byte_article { color: red; } a.173_byte_article { color: red; } a.174_byte_article { color: red; 
} a.175_byte_article { color: red; } a.176_byte_article { color: red; } a.177_byte_article { color: red; } a.178_byte_article { color: red; } a.179_byte_article { color: red; } a.180_byte_article { color: red; } a.181_byte_article { color: red; } a.182_byte_article { color: red; } a.183_byte_article { color: red; } a.184_byte_article { color: red; } a.185_byte_article { color: red; } a.186_byte_article { color: red; } a.187_byte_article { color: red; } a.188_byte_article { color: red; } a.189_byte_article { color: red; } a.190_byte_article { color: red; } a.191_byte_article { color: red; } a.192_byte_article { color: red; } a.193_byte_article { color: red; } a.194_byte_article { color: red; } a.195_byte_article { color: red; } a.196_byte_article { color: red; } a.197_byte_article { color: red; } a.198_byte_article { color: red; } a.199_byte_article { color: red; } a.200_byte_article { color: red; } a.201_byte_article { color: red; } a.202_byte_article { color: red; } a.203_byte_article { color: red; } a.204_byte_article { color: red; } a.205_byte_article { color: red; } a.206_byte_article { color: red; } a.207_byte_article { color: red; } a.208_byte_article { color: red; } a.209_byte_article { color: red; } a.210_byte_article { color: red; } a.211_byte_article { color: red; } a.212_byte_article { color: red; } a.213_byte_article { color: red; } a.214_byte_article { color: red; } a.215_byte_article { color: red; } a.216_byte_article { color: red; } a.217_byte_article { color: red; } a.218_byte_article { color: red; } a.219_byte_article { color: red; } a.220_byte_article { color: red; } a.221_byte_article { color: red; } a.222_byte_article { color: red; } a.223_byte_article { color: red; } a.224_byte_article { color: red; } a.225_byte_article { color: red; } a.226_byte_article { color: red; } a.227_byte_article { color: red; } a.228_byte_article { color: red; } a.229_byte_article { color: red; } a.230_byte_article { color: red; } a.231_byte_article { color: red; } 
a.232_byte_article { color: red; } a.233_byte_article { color: red; } a.234_byte_article { color: red; } a.235_byte_article { color: red; } a.236_byte_article { color: red; } a.237_byte_article { color: red; } a.238_byte_article { color: red; } a.239_byte_article { color: red; } a.240_byte_article { color: red; } a.241_byte_article { color: red; } a.242_byte_article { color: red; } a.243_byte_article { color: red; } a.244_byte_article { color: red; } a.245_byte_article { color: red; } a.246_byte_article { color: red; } a.247_byte_article { color: red; } a.248_byte_article { color: red; } a.249_byte_article { color: red; } a.250_byte_article { color: red; } a.251_byte_article { color: red; } a.252_byte_article { color: red; } a.253_byte_article { color: red; } a.254_byte_article { color: red; } a.255_byte_article { color: red; } a.256_byte_article { color: red; } a.257_byte_article { color: red; } a.258_byte_article { color: red; } a.259_byte_article { color: red; } a.260_byte_article { color: red; } a.261_byte_article { color: red; } a.262_byte_article { color: red; } a.263_byte_article { color: red; } a.264_byte_article { color: red; } a.265_byte_article { color: red; } a.266_byte_article { color: red; } a.267_byte_article { color: red; } a.268_byte_article { color: red; } a.269_byte_article { color: red; } a.270_byte_article { color: red; } a.271_byte_article { color: red; } a.272_byte_article { color: red; } a.273_byte_article { color: red; } a.274_byte_article { color: red; } a.275_byte_article { color: red; } a.276_byte_article { color: red; } a.277_byte_article { color: red; } a.278_byte_article { color: red; } a.279_byte_article { color: red; } a.280_byte_article { color: red; } a.281_byte_article { color: red; } a.282_byte_article { color: red; } a.283_byte_article { color: red; } a.284_byte_article { color: red; } a.285_byte_article { color: red; } a.286_byte_article { color: red; } a.287_byte_article { color: red; } a.288_byte_article { color: red; } 
a.289_byte_article { color: red; } a.290_byte_article { color: red; } a.291_byte_article { color: red; } a.292_byte_article { color: red; } a.293_byte_article { color: red; } a.294_byte_article { color: red; } a.295_byte_article { color: red; } a.296_byte_article { color: red; } a.297_byte_article { color: red; } a.298_byte_article { color: red; } a.299_byte_article { color: red; } a.300_byte_article { color: red; } a.301_byte_article { color: red; } a.302_byte_article { color: red; } a.303_byte_article { color: red; } a.304_byte_article { color: red; } a.305_byte_article { color: red; } a.306_byte_article { color: red; } a.307_byte_article { color: red; } a.308_byte_article { color: red; } a.309_byte_article { color: red; } a.310_byte_article { color: red; } a.311_byte_article { color: red; } a.312_byte_article { color: red; } a.313_byte_article { color: red; } a.314_byte_article { color: red; } a.315_byte_article { color: red; } a.316_byte_article { color: red; } a.317_byte_article { color: red; } a.318_byte_article { color: red; } a.319_byte_article { color: red; } a.320_byte_article { color: red; } a.321_byte_article { color: red; } a.322_byte_article { color: red; } a.323_byte_article { color: red; } a.324_byte_article { color: red; } a.325_byte_article { color: red; } a.326_byte_article { color: red; } a.327_byte_article { color: red; } a.328_byte_article { color: red; } a.329_byte_article { color: red; } a.330_byte_article { color: red; } a.331_byte_article { color: red; } a.332_byte_article { color: red; } a.333_byte_article { color: red; } a.334_byte_article { color: red; } a.335_byte_article { color: red; } a.336_byte_article { color: red; } a.337_byte_article { color: red; } a.338_byte_article { color: red; } a.339_byte_article { color: red; } a.340_byte_article { color: red; } a.341_byte_article { color: red; } a.342_byte_article { color: red; } a.343_byte_article { color: red; } a.344_byte_article { color: red; } a.345_byte_article { color: red; } 
a.346_byte_article { color: red; } a.347_byte_article { color: red; } a.348_byte_article { color: red; } a.349_byte_article { color: red; } a.350_byte_article { color: red; } a.351_byte_article { color: red; } a.352_byte_article { color: red; } a.353_byte_article { color: red; } a.354_byte_article { color: red; } a.355_byte_article { color: red; } a.356_byte_article { color: red; } a.357_byte_article { color: red; } a.358_byte_article { color: red; } a.359_byte_article { color: red; } a.360_byte_article { color: red; } a.361_byte_article { color: red; } a.362_byte_article { color: red; } a.363_byte_article { color: red; } a.364_byte_article { color: red; } a.365_byte_article { color: red; } a.366_byte_article { color: red; } a.367_byte_article { color: red; } a.368_byte_article { color: red; } a.369_byte_article { color: red; } a.370_byte_article { color: red; } a.371_byte_article { color: red; } a.372_byte_article { color: red; } a.373_byte_article { color: red; } a.374_byte_article { color: red; } a.375_byte_article { color: red; } a.376_byte_article { color: red; } a.377_byte_article { color: red; } a.378_byte_article { color: red; } a.379_byte_article { color: red; } a.380_byte_article { color: red; } a.381_byte_article { color: red; } a.382_byte_article { color: red; } a.383_byte_article { color: red; } a.384_byte_article { color: red; } a.385_byte_article { color: red; } a.386_byte_article { color: red; } a.387_byte_article { color: red; } a.388_byte_article { color: red; } a.389_byte_article { color: red; } a.390_byte_article { color: red; } a.391_byte_article { color: red; } a.392_byte_article { color: red; } a.393_byte_article { color: red; } a.394_byte_article { color: red; } a.395_byte_article { color: red; } a.396_byte_article { color: red; } a.397_byte_article { color: red; } a.398_byte_article { color: red; } a.399_byte_article { color: red; } a.400_byte_article { color: red; } a.401_byte_article { color: red; } a.402_byte_article { color: red; } 
a.403_byte_article { color: red; } a.404_byte_article { color: red; } a.405_byte_article { color: red; } a.406_byte_article { color: red; } a.407_byte_article { color: red; } a.408_byte_article { color: red; } a.409_byte_article { color: red; } a.410_byte_article { color: red; } a.411_byte_article { color: red; } a.412_byte_article { color: red; } a.413_byte_article { color: red; } a.414_byte_article { color: red; } a.415_byte_article { color: red; } a.416_byte_article { color: red; } a.417_byte_article { color: red; } a.418_byte_article { color: red; } a.419_byte_article { color: red; } a.420_byte_article { color: red; } a.421_byte_article { color: red; } a.422_byte_article { color: red; } a.423_byte_article { color: red; } a.424_byte_article { color: red; } a.425_byte_article { color: red; } a.426_byte_article { color: red; } a.427_byte_article { color: red; } a.428_byte_article { color: red; } a.429_byte_article { color: red; } a.430_byte_article { color: red; } a.431_byte_article { color: red; } a.432_byte_article { color: red; } a.433_byte_article { color: red; } a.434_byte_article { color: red; } a.435_byte_article { color: red; } a.436_byte_article { color: red; } a.437_byte_article { color: red; } a.438_byte_article { color: red; } a.439_byte_article { color: red; } a.440_byte_article { color: red; } a.441_byte_article { color: red; } a.442_byte_article { color: red; } a.443_byte_article { color: red; } a.444_byte_article { color: red; } a.445_byte_article { color: red; } a.446_byte_article { color: red; } a.447_byte_article { color: red; } a.448_byte_article { color: red; } a.449_byte_article { color: red; } a.450_byte_article { color: red; } a.451_byte_article { color: red; } a.452_byte_article { color: red; } a.453_byte_article { color: red; } a.454_byte_article { color: red; } a.455_byte_article { color: red; } a.456_byte_article { color: red; } a.457_byte_article { color: red; } a.458_byte_article { color: red; } a.459_byte_article { color: red; } 
a.460_byte_article { color: red; } a.461_byte_article { color: red; } a.462_byte_article { color: red; } a.463_byte_article { color: red; } a.464_byte_article { color: red; } a.465_byte_article { color: red; } a.466_byte_article { color: red; } a.467_byte_article { color: red; } a.468_byte_article { color: red; } a.469_byte_article { color: red; } a.470_byte_article { color: red; } a.471_byte_article { color: red; } a.472_byte_article { color: red; } a.473_byte_article { color: red; } a.474_byte_article { color: red; } a.475_byte_article { color: red; } a.476_byte_article { color: red; } a.477_byte_article { color: red; } a.478_byte_article { color: red; } a.479_byte_article { color: red; } a.480_byte_article { color: red; } a.481_byte_article { color: red; } a.482_byte_article { color: red; } a.483_byte_article { color: red; } a.484_byte_article { color: red; } a.485_byte_article { color: red; } a.486_byte_article { color: red; } a.487_byte_article { color: red; } a.488_byte_article { color: red; } a.489_byte_article { color: red; } a.490_byte_article { color: red; } a.491_byte_article { color: red; } a.492_byte_article { color: red; } a.493_byte_article { color: red; } a.494_byte_article { color: red; } a.495_byte_article { color: red; } a.496_byte_article { color: red; } a.497_byte_article { color: red; } a.498_byte_article { color: red; } a.499_byte_article { color: red; } a.500_byte_article { color: red; } a.501_byte_article { color: red; } a.502_byte_article { color: red; } a.503_byte_article { color: red; } a.504_byte_article { color: red; } a.505_byte_article { color: red; } a.506_byte_article { color: red; } a.507_byte_article { color: red; } a.508_byte_article { color: red; } a.509_byte_article { color: red; } a.510_byte_article { color: red; } a.511_byte_article { color: red; } a.512_byte_article { color: red; } a.513_byte_article { color: red; } a.514_byte_article { color: red; } a.515_byte_article { color: red; } a.516_byte_article { color: red; } 
a.517_byte_article { color: red; } a.518_byte_article { color: red; } a.519_byte_article { color: red; } a.520_byte_article { color: red; } a.521_byte_article { color: red; } a.522_byte_article { color: red; } a.523_byte_article { color: red; } a.524_byte_article { color: red; } a.525_byte_article { color: red; } a.526_byte_article { color: red; } a.527_byte_article { color: red; } a.528_byte_article { color: red; } a.529_byte_article { color: red; } a.530_byte_article { color: red; } a.531_byte_article { color: red; } a.532_byte_article { color: red; } a.533_byte_article { color: red; } a.534_byte_article { color: red; } a.535_byte_article { color: red; } a.536_byte_article { color: red; } a.537_byte_article { color: red; } a.538_byte_article { color: red; } a.539_byte_article { color: red; } a.540_byte_article { color: red; } a.541_byte_article { color: red; } a.542_byte_article { color: red; } a.543_byte_article { color: red; } a.544_byte_article { color: red; } a.545_byte_article { color: red; } a.546_byte_article { color: red; } a.547_byte_article { color: red; } a.548_byte_article { color: red; } a.549_byte_article { color: red; } a.550_byte_article { color: red; } a.551_byte_article { color: red; } a.552_byte_article { color: red; } a.553_byte_article { color: red; } a.554_byte_article { color: red; } a.555_byte_article { color: red; } a.556_byte_article { color: red; } a.557_byte_article { color: red; } a.558_byte_article { color: red; } a.559_byte_article { color: red; } a.560_byte_article { color: red; } a.561_byte_article { color: red; } a.562_byte_article { color: red; } a.563_byte_article { color: red; } a.564_byte_article { color: red; } a.565_byte_article { color: red; } a.566_byte_article { color: red; } a.567_byte_article { color: red; } a.568_byte_article { color: red; } a.569_byte_article { color: red; } a.570_byte_article { color: red; } a.571_byte_article { color: red; } a.572_byte_article { color: red; } a.573_byte_article { color: red; } 
a.574_byte_article { color: red; } a.575_byte_article { color: red; } a.576_byte_article { color: red; } a.577_byte_article { color: red; } a.578_byte_article { color: red; } a.579_byte_article { color: red; } a.580_byte_article { color: red; } a.581_byte_article { color: red; } a.582_byte_article { color: red; } a.583_byte_article { color: red; } a.584_byte_article { color: red; } a.585_byte_article { color: red; } a.586_byte_article { color: red; } a.587_byte_article { color: red; } a.588_byte_article { color: red; } a.589_byte_article { color: red; } a.590_byte_article { color: red; } a.591_byte_article { color: red; } a.592_byte_article { color: red; } a.593_byte_article { color: red; } a.594_byte_article { color: red; } a.595_byte_article { color: red; } a.596_byte_article { color: red; } a.597_byte_article { color: red; } a.598_byte_article { color: red; } a.599_byte_article { color: red; } a.600_byte_article { color: red; } a.601_byte_article { color: red; } a.602_byte_article { color: red; } a.603_byte_article { color: red; } a.604_byte_article { color: red; } a.605_byte_article { color: red; } a.606_byte_article { color: red; } a.607_byte_article { color: red; } a.608_byte_article { color: red; } a.609_byte_article { color: red; } a.610_byte_article { color: red; } a.611_byte_article { color: red; } a.612_byte_article { color: red; } a.613_byte_article { color: red; } a.614_byte_article { color: red; } a.615_byte_article { color: red; } a.616_byte_article { color: red; } a.617_byte_article { color: red; } a.618_byte_article { color: red; } a.619_byte_article { color: red; } a.620_byte_article { color: red; } a.621_byte_article { color: red; } a.622_byte_article { color: red; } a.623_byte_article { color: red; } a.624_byte_article { color: red; } a.625_byte_article { color: red; } a.626_byte_article { color: red; } a.627_byte_article { color: red; } a.628_byte_article { color: red; } a.629_byte_article { color: red; } a.630_byte_article { color: red; } 
a.631_byte_article { color: red; } a.632_byte_article { color: red; } a.633_byte_article { color: red; } a.634_byte_article { color: red; } a.635_byte_article { color: red; } a.636_byte_article { color: red; } a.637_byte_article { color: red; } a.638_byte_article { color: red; } a.639_byte_article { color: red; } a.640_byte_article { color: red; } a.641_byte_article { color: red; } a.642_byte_article { color: red; } a.643_byte_article { color: red; } a.644_byte_article { color: red; } a.645_byte_article { color: red; } a.646_byte_article { color: red; } a.647_byte_article { color: red; } a.648_byte_article { color: red; } a.649_byte_article { color: red; } a.650_byte_article { color: red; } a.651_byte_article { color: red; } a.652_byte_article { color: red; } a.653_byte_article { color: red; } a.654_byte_article { color: red; } a.655_byte_article { color: red; } a.656_byte_article { color: red; } a.657_byte_article { color: red; } a.658_byte_article { color: red; } a.659_byte_article { color: red; } a.660_byte_article { color: red; } a.661_byte_article { color: red; } a.662_byte_article { color: red; } a.663_byte_article { color: red; } a.664_byte_article { color: red; } a.665_byte_article { color: red; } a.666_byte_article { color: red; } a.667_byte_article { color: red; } a.668_byte_article { color: red; } a.669_byte_article { color: red; } a.670_byte_article { color: red; } a.671_byte_article { color: red; } a.672_byte_article { color: red; } a.673_byte_article { color: red; } a.674_byte_article { color: red; } a.675_byte_article { color: red; } a.676_byte_article { color: red; } a.677_byte_article { color: red; } a.678_byte_article { color: red; } a.679_byte_article { color: red; } a.680_byte_article { color: red; } a.681_byte_article { color: red; } a.682_byte_article { color: red; } a.683_byte_article { color: red; } a.684_byte_article { color: red; } a.685_byte_article { color: red; } a.686_byte_article { color: red; } a.687_byte_article { color: red; } 
a.688_byte_article { color: red; } a.689_byte_article { color: red; } a.690_byte_article { color: red; } a.691_byte_article { color: red; } a.692_byte_article { color: red; } a.693_byte_article { color: red; } a.694_byte_article { color: red; } a.695_byte_article { color: red; } a.696_byte_article { color: red; } a.697_byte_article { color: red; } a.698_byte_article { color: red; } a.699_byte_article { color: red; } a.700_byte_article { color: red; } a.701_byte_article { color: red; } a.702_byte_article { color: red; } a.703_byte_article { color: red; } a.704_byte_article { color: red; } a.705_byte_article { color: red; } a.706_byte_article { color: red; } a.707_byte_article { color: red; } a.708_byte_article { color: red; } a.709_byte_article { color: red; } a.710_byte_article { color: red; } a.711_byte_article { color: red; } a.712_byte_article { color: red; } a.713_byte_article { color: red; } a.714_byte_article { color: red; } a.715_byte_article { color: red; } a.716_byte_article { color: red; } a.717_byte_article { color: red; } a.718_byte_article { color: red; } a.719_byte_article { color: red; } a.720_byte_article { color: red; } a.721_byte_article { color: red; } a.722_byte_article { color: red; } a.723_byte_article { color: red; } a.724_byte_article { color: red; } a.725_byte_article { color: red; } a.726_byte_article { color: red; } a.727_byte_article { color: red; } a.728_byte_article { color: red; } a.729_byte_article { color: red; } a.730_byte_article { color: red; } a.731_byte_article { color: red; } a.732_byte_article { color: red; } a.733_byte_article { color: red; } a.734_byte_article { color: red; } a.735_byte_article { color: red; } a.736_byte_article { color: red; } a.737_byte_article { color: red; } a.738_byte_article { color: red; } a.739_byte_article { color: red; } a.740_byte_article { color: red; } a.741_byte_article { color: red; } a.742_byte_article { color: red; } a.743_byte_article { color: red; } a.744_byte_article { color: red; } 
a.745_byte_article { color: red; } a.746_byte_article { color: red; } a.747_byte_article { color: red; } a.748_byte_article { color: red; } a.749_byte_article { color: red; } a.750_byte_article { color: red; } a.751_byte_article { color: red; } a.752_byte_article { color: red; } a.753_byte_article { color: red; } a.754_byte_article { color: red; } a.755_byte_article { color: red; } a.756_byte_article { color: red; } a.757_byte_article { color: red; } a.758_byte_article { color: red; } a.759_byte_article { color: red; } a.760_byte_article { color: red; } a.761_byte_article { color: red; } a.762_byte_article { color: red; } a.763_byte_article { color: red; } a.764_byte_article { color: red; } a.765_byte_article { color: red; } a.766_byte_article { color: red; } a.767_byte_article { color: red; } a.768_byte_article { color: red; } a.769_byte_article { color: red; } a.770_byte_article { color: red; } a.771_byte_article { color: red; } a.772_byte_article { color: red; } a.773_byte_article { color: red; } a.774_byte_article { color: red; } a.775_byte_article { color: red; } a.776_byte_article { color: red; } a.777_byte_article { color: red; } a.778_byte_article { color: red; } a.779_byte_article { color: red; } a.780_byte_article { color: red; } a.781_byte_article { color: red; } a.782_byte_article { color: red; } a.783_byte_article { color: red; } a.784_byte_article { color: red; } a.785_byte_article { color: red; } a.786_byte_article { color: red; } a.787_byte_article { color: red; } a.788_byte_article { color: red; } a.789_byte_article { color: red; } a.790_byte_article { color: red; } a.791_byte_article { color: red; } a.792_byte_article { color: red; } a.793_byte_article { color: red; } a.794_byte_article { color: red; } a.795_byte_article { color: red; } a.796_byte_article { color: red; } a.797_byte_article { color: red; } a.798_byte_article { color: red; } a.799_byte_article { color: red; } a.800_byte_article { color: red; } a.801_byte_article { color: red; } 
a.802_byte_article { color: red; } a.803_byte_article { color: red; } a.804_byte_article { color: red; } a.805_byte_article { color: red; } a.806_byte_article { color: red; } a.807_byte_article { color: red; } a.808_byte_article { color: red; } a.809_byte_article { color: red; } a.810_byte_article { color: red; } a.811_byte_article { color: red; } a.812_byte_article { color: red; } a.813_byte_article { color: red; } a.814_byte_article { color: red; } a.815_byte_article { color: red; } a.816_byte_article { color: red; } a.817_byte_article { color: red; } a.818_byte_article { color: red; } a.819_byte_article { color: red; } a.820_byte_article { color: red; } a.821_byte_article { color: red; } a.822_byte_article { color: red; } a.823_byte_article { color: red; } a.824_byte_article { color: red; } a.825_byte_article { color: red; } a.826_byte_article { color: red; } a.827_byte_article { color: red; } a.828_byte_article { color: red; } a.829_byte_article { color: red; } a.830_byte_article { color: red; } a.831_byte_article { color: red; } a.832_byte_article { color: red; } a.833_byte_article { color: red; } a.834_byte_article { color: red; } a.835_byte_article { color: red; } a.836_byte_article { color: red; } a.837_byte_article { color: red; } a.838_byte_article { color: red; } a.839_byte_article { color: red; } a.840_byte_article { color: red; } a.841_byte_article { color: red; } a.842_byte_article { color: red; } a.843_byte_article { color: red; } a.844_byte_article { color: red; } a.845_byte_article { color: red; } a.846_byte_article { color: red; } a.847_byte_article { color: red; } a.848_byte_article { color: red; } a.849_byte_article { color: red; } a.850_byte_article { color: red; } a.851_byte_article { color: red; } a.852_byte_article { color: red; } a.853_byte_article { color: red; } a.854_byte_article { color: red; } a.855_byte_article { color: red; } a.856_byte_article { color: red; } a.857_byte_article { color: red; } a.858_byte_article { color: red; } 
a.859_byte_article { color: red; } a.860_byte_article { color: red; } a.861_byte_article { color: red; } a.862_byte_article { color: red; } a.863_byte_article { color: red; } a.864_byte_article { color: red; } a.865_byte_article { color: red; } a.866_byte_article { color: red; } a.867_byte_article { color: red; } a.868_byte_article { color: red; } a.869_byte_article { color: red; } a.870_byte_article { color: red; } a.871_byte_article { color: red; } a.872_byte_article { color: red; } a.873_byte_article { color: red; } a.874_byte_article { color: red; } a.875_byte_article { color: red; } a.876_byte_article { color: red; } a.877_byte_article { color: red; } a.878_byte_article { color: red; } a.879_byte_article { color: red; } a.880_byte_article { color: red; } a.881_byte_article { color: red; } a.882_byte_article { color: red; } a.883_byte_article { color: red; } a.884_byte_article { color: red; } a.885_byte_article { color: red; } a.886_byte_article { color: red; } a.887_byte_article { color: red; } a.888_byte_article { color: red; } a.889_byte_article { color: red; } a.890_byte_article { color: red; } a.891_byte_article { color: red; } a.892_byte_article { color: red; } a.893_byte_article { color: red; } a.894_byte_article { color: red; } a.895_byte_article { color: red; } a.896_byte_article { color: red; } a.897_byte_article { color: red; } a.898_byte_article { color: red; } a.899_byte_article { color: red; } a.900_byte_article { color: red; } a.901_byte_article { color: red; } a.902_byte_article { color: red; } a.903_byte_article { color: red; } a.904_byte_article { color: red; } a.905_byte_article { color: red; } a.906_byte_article { color: red; } a.907_byte_article { color: red; } a.908_byte_article { color: red; } a.909_byte_article { color: red; } a.910_byte_article { color: red; } a.911_byte_article { color: red; } a.912_byte_article { color: red; } a.913_byte_article { color: red; } a.914_byte_article { color: red; } a.915_byte_article { color: red; } 
a.916_byte_article { color: red; } a.917_byte_article { color: red; } a.918_byte_article { color: red; } a.919_byte_article { color: red; } a.920_byte_article { color: red; } a.921_byte_article { color: red; } a.922_byte_article { color: red; } a.923_byte_article { color: red; } a.924_byte_article { color: red; } a.925_byte_article { color: red; } a.926_byte_article { color: red; } a.927_byte_article { color: red; } a.928_byte_article { color: red; } a.929_byte_article { color: red; } a.930_byte_article { color: red; } a.931_byte_article { color: red; } a.932_byte_article { color: red; } a.933_byte_article { color: red; } a.934_byte_article { color: red; } a.935_byte_article { color: red; } a.936_byte_article { color: red; } a.937_byte_article { color: red; } a.938_byte_article { color: red; } a.939_byte_article { color: red; } a.940_byte_article { color: red; } a.941_byte_article { color: red; } a.942_byte_article { color: red; } a.943_byte_article { color: red; } a.944_byte_article { color: red; } a.945_byte_article { color: red; } a.946_byte_article { color: red; } a.947_byte_article { color: red; } a.948_byte_article { color: red; } a.949_byte_article { color: red; } a.950_byte_article { color: red; } a.951_byte_article { color: red; } a.952_byte_article { color: red; } a.953_byte_article { color: red; } a.954_byte_article { color: red; } a.955_byte_article { color: red; } a.956_byte_article { color: red; } a.957_byte_article { color: red; } a.958_byte_article { color: red; } a.959_byte_article { color: red; } a.960_byte_article { color: red; } a.961_byte_article { color: red; } a.962_byte_article { color: red; } a.963_byte_article { color: red; } a.964_byte_article { color: red; } a.965_byte_article { color: red; } a.966_byte_article { color: red; } a.967_byte_article { color: red; } a.968_byte_article { color: red; } a.969_byte_article { color: red; } a.970_byte_article { color: red; } a.971_byte_article { color: red; } a.972_byte_article { color: red; } 
a.973_byte_article { color: red; } a.974_byte_article { color: red; } a.975_byte_article { color: red; } a.976_byte_article { color: red; } a.977_byte_article { color: red; } a.978_byte_article { color: red; } a.979_byte_article { color: red; } a.980_byte_article { color: red; } a.981_byte_article { color: red; } a.982_byte_article { color: red; } a.983_byte_article { color: red; } a.984_byte_article { color: red; } a.985_byte_article { color: red; } a.986_byte_article { color: red; } a.987_byte_article { color: red; } a.988_byte_article { color: red; } a.989_byte_article { color: red; } a.990_byte_article { color: red; } a.991_byte_article { color: red; } a.992_byte_article { color: red; } a.993_byte_article { color: red; } a.994_byte_article { color: red; } a.995_byte_article { color: red; } a.996_byte_article { color: red; } a.997_byte_article { color: red; } a.998_byte_article { color: red; } a.999_byte_article { color: red; }

K. Peachey

12:55 p.m.

Would something like what is shown below get it even further down?

a { color: blue } a.1_byte_article, a.2_byte_article, a.3_byte_article, a.4_byte_article, a.5_byte_article, a.6_byte_article, a.7_byte_article, a.8_byte_article, a.9_byte_article, a.10_byte_article,a.11_byte_article, a.12_byte_article, a.13_byte_article, a.14_byte_article, a.15_byte_article, a.16_byte_article, a.17_byte_article, a.18_byte_article, a.19_byte_article, a.20_byte_article, a.21_byte_article, a.22_byte_article, a.23_byte_article, a.24_byte_article, a.25_byte_article, a.26_byte_article, a.27_byte_article, a.28_byte_article, a.29_byte_article, a.30_byte_article, a.31_byte_article, a.32_byte_article, a.33_byte_article, a.34_byte_article, a.35_byte_article, a.36_byte_article, a.37_byte_article, a.38_byte_article, a.39_byte_article, a.40_byte_article, a.41_byte_article, a.42_byte_article, a.43_byte_article, a.44_byte_article, a.45_byte_article, a.46_byte_article, a.47_byte_article, a.48_byte_article, a.49_byte_article, a.50_byte_article, a.51_byte_article, a.52_byte_article, a.53_byte_article, a.54_byte_article, a.55_byte_article, a.56_byte_article, a.57_byte_article, a.58_byte_article, a.59_byte_article, a.60_byte_article, a.61_byte_article, a.62_byte_article, a.63_byte_article, a.64_byte_article, a.65_byte_article, a.66_byte_article, a.67_byte_article, a.68_byte_article, a.69_byte_article, a.70_byte_article, a.71_byte_article, a.72_byte_article, a.73_byte_article, a.74_byte_article, a.75_byte_article, a.76_byte_article, a.77_byte_article, a.78_byte_article, a.79_byte_article, a.80_byte_article, a.81_byte_article, a.82_byte_article, a.83_byte_article, a.84_byte_article, a.85_byte_article, a.86_byte_article, a.87_byte_article, a.88_byte_article, a.89_byte_article, a.90_byte_article, a.91_byte_article, a.92_byte_article, a.93_byte_article, a.94_byte_article, a.95_byte_article { color: red }

John Vandenberg

1:24 p.m.

On Tue, Aug 3, 2010 at 8:55 PM, K. Peachey p858snake@yahoo.com.au wrote:

...

Would something like what is shown below get it even further down?

a { color: blue } a.1_byte_article, a.2_byte_article, a.3_byte_article, ...

Using an abbreviation like <x>ba would also help.

Limiting the user pref to intervals of 10 bytes would also help.

Also, as this piece of CSS is being dynamically generated, it only needs to include the variations that occur in the body of the article's HTML.

Or the CSS can be generated by JS on the client side, which is what Aryeh has been suggesting all along (I think).

btw, I thought Domas was kidding. I got a chuckle out of it, at least.

-- John Vandenberg
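
The point above about emitting only the variations that occur in the article body could be sketched roughly as follows. This is a minimal illustration, not MediaWiki code: the function name `stub_css` and the regular expression are assumptions, and the class naming simply follows the `<N>_byte_article` pattern used earlier in the thread.

```python
import re

# Hypothetical class naming taken from the thread: "<N>_byte_article".
LINK_CLASS = re.compile(r'class="(\d+)_byte_article"')


def stub_css(article_html: str, threshold: int) -> str:
    """Build a stub-link rule covering only the sizes that appear in this page."""
    sizes = {int(m) for m in LINK_CLASS.findall(article_html)}
    stubs = sorted(s for s in sizes if s < threshold)
    if not stubs:
        return ""
    selectors = ", ".join(f"a.{s}_byte_article" for s in stubs)
    return f"{selectors} {{ color: red }}"


html = (
    '<a href="A" class="134_byte_article">A</a> '
    '<a href="B" class="90_byte_article">B</a> '
    '<a href="C" class="5000_byte_article">C</a>'
)
print(stub_css(html, threshold=500))
# a.90_byte_article, a.134_byte_article { color: red }
```

One caveat if anyone tried this for real: CSS identifiers may not start with an unescaped digit, so class names of this shape would need a letter prefix or escaping.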

Liangent

12:54 p.m.

On 8/3/10, Lars Aronsson lars@aronsson.se wrote:

...

Couldn't you just tag every internal link with a separate class for the length of the target article, and then use different personal CSS to set the threshold? The generated page would be the same for all users:

So if a page is changed, all pages linking to it need to be parsed again. Will this cost even more?

Platonides

1:55 p.m.

Lars Aronsson wrote:

...

On 08/01/2010 10:55 PM, Aryeh Gregor wrote:

...
One easy hack to reduce this problem is just to only provide a few options for stub threshold, as we do with thumbnail size. Although this is only useful if we cache pages with nonzero stub threshold . . . why don't we do that? Too much fragmentation due to the excessive range of options?

Couldn't you just tag every internal link with a separate class for the length of the target article, and then use different personal CSS to set the threshold? The generated page would be the same for all users:

<a href="My_Article" class="134_byte_article">My Article</a>

That would be workable, e.g. one class for articles smaller than 50 bytes and others for 100, 200, 250, 300, 400, 500, 600, 700, 800, 1000, 2000, 2500, 5000 and 10000, if it weren't for having to update all those classes whenever the page changes.

It would work to add it as a separate stylesheet for stubs, though.
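
The bucketing idea above might look something like this. It is only a sketch under assumptions: the class name `under_<N>_bytes` and the function `bucket_class` are made up for illustration, and the bucket list is copied from the post.

```python
from bisect import bisect_right

# Upper bounds in bytes, as listed in the post above.
BUCKETS = [50, 100, 200, 250, 300, 400, 500, 600, 700, 800,
           1000, 2000, 2500, 5000, 10000]


def bucket_class(length: int) -> str:
    """Return a class for the smallest bucket strictly larger than length."""
    i = bisect_right(BUCKETS, length)
    if i == len(BUCKETS):
        return ""  # larger than every bucket: never treated as a stub
    return f"under_{BUCKETS[i]}_bytes"  # hypothetical class name


print(bucket_class(134))    # under_200_bytes
print(bucket_class(49))     # under_50_bytes
print(bucket_class(20000))  # (empty string)
```

With buckets, a link's class changes only when the target crosses a boundary, which would reduce, but not remove, the reparse problem raised in this thread.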

Aryeh Gregor

5:01 p.m.

On Mon, Aug 2, 2010 at 8:32 PM, Lars Aronsson lars@aronsson.se wrote:

...

Couldn't you just tag every internal link with a separate class for the length of the target article, and then use different personal CSS to set the threshold? The generated page would be the same for all users:

<a href="My_Article" class="134_byte_article">My Article</a>

Until the page changes length. That would force all articles that link to it to be reparsed, unless we use some way to insert the correct page lengths into a parsed page before serving it to the user. In which case we don't really need to do this, we can just insert the stub class on the correct pages using the same mechanism.
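
As a rough illustration of "insert the stub class on the correct pages" at serve time: the cached HTML could carry a placeholder per link that is resolved just before output. The `data-len-of` attribute, the `finalize` function and the `lengths` lookup are all hypothetical; nothing here reflects how MediaWiki actually does it.

```python
import re

# Hypothetical placeholder the parser would leave in the cached HTML.
PLACEHOLDER = re.compile(r' data-len-of="([^"]+)"')


def finalize(cached_html: str, lengths: dict, threshold: int) -> str:
    """Swap each placeholder for class="stub" or nothing, per viewer threshold."""
    def repl(match: re.Match) -> str:
        size = lengths.get(match.group(1))
        if size is not None and size < threshold:
            return ' class="stub"'
        return ""
    return PLACEHOLDER.sub(repl, cached_html)


cached = '<a href="My_Article" data-len-of="My_Article">My Article</a>'
current_lengths = {"My_Article": 134}  # would come from a cheap lookup
print(finalize(cached, current_lengths, threshold=500))
# <a href="My_Article" class="stub">My Article</a>
print(finalize(cached, current_lengths, threshold=100))
# <a href="My_Article">My Article</a>
```

The cached copy stays identical for every user and every target length; only the cheap substitution step depends on the viewer.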

Platonides

1 Aug 1 Aug

11:58 p.m.

Roan Kattouw wrote:

...

2010/8/1 Platonides:

...
Aryeh, can you do some statistics about the frequency of the different stub thresholds? Perhaps restricted to people who edited this year, to discard unused accounts.

He can't, but I can. I ran a couple of queries and put the result at http://www.mediawiki.org/wiki/User:Catrope/Stub_threshold

Roan Kattouw (Catrope)

Thanks, Roan. I think that the condition should have been the inverse (users with recent edits, not users who don't have old edits), but anyway it shows that with a few (8-10) values we could please almost everyone.

Also, it shows that people don't understand how to disable it. The tail has many extremely large values which can only mean "don't treat stubs differently".

Roan Kattouw

2 Aug 2 Aug

12:51 a.m.

2010/8/1 Platonides Platonides@gmail.com:

...

I think that the condition should have been the inverse (users with recent edits, not users who don't have old edits)

Oops. I thought I had reversed the condition correctly, but as you point out I hadn't. I'll run the corrected queries tomorrow.

Roan Kattouw (Catrope)

jidanni＠jidanni.org

31 Jul 31 Jul

2:22 a.m.

...

"AG" == Aryeh Gregor Simetrical+wikilist@gmail.com writes:

AG> Fortunately, the major slowdown is parser cache misses, not Squid cache misses. To avoid parser cache misses, just make sure you don't change parser-affecting preferences to non-default values. (We don't say which these are, of course . . .)

Hmmm, maybe they're there amongst the "!"s below.

$ lynx --source http://en.wikipedia.org/wiki/Main_Page | grep parser
Expensive parser function count: 44/500

Daniel Kinzler

6:58 p.m.

Aryeh Gregor schrieb:

...

As soon as you're logged in, you're missing Squid cache, because we have to add your name to the top, attach your user CSS/JS, etc. You can't be served the same HTML as an anonymous user. If you want to be served the same HTML as an anonymous user, log out.

This is a few years old, but I guess it's still relevant: http://brightbyte.de/page/Client-side_skins_with_XSLT I experimented a bit with ways to do all the per-user preference stuff on the client side, with XSLT.

-- daniel

John Vandenberg

30 Jul 30 Jul

10:49 a.m.

On Fri, Jul 30, 2010 at 6:23 AM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:

...

On Thu, Jul 29, 2010 at 4:07 PM, Strainu strainu10@gmail.com wrote:

...
Could you please elaborate on that? Thanks.

When pages are parsed, the parsed version is cached, since parsing can take a long time (sometimes > 10 s). Some preferences change how pages are parsed, so different copies need to be stored based on those preferences. If these settings are all default for you, you'll be using the same parser cache copies as anonymous users, so you're extremely likely to get a parser cache hit. If any of them is non-default, you'll only get a parser cache hit if someone with your exact parser-related preferences viewed the page since it was last changed; otherwise it will have to reparse the page just for you, which will take a long time.

This is probably a bad thing.

Could we add a logged-in-reader mode, for people who are infrequent contributors but wish to be logged in for the prefs?

They could be served a slightly old cached version of the page when one is available for their prefs, e.g. if the cached version is less than a minute old. The downside is that if they see an error, it may already be fixed. OTOH, if the page is being revised frequently, the same is likely to happen anyway. The text could be stale before it hits the wire due to parsing delay.

For pending changes, the pref 'Always show the latest accepted revision (if there is one) of a page by default' could be enabled by default. Was there any discussion about the default setting for this pref?

-- John Vandenberg
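
A small model of the two ideas in this post, offered as a sketch only: a parser cache keyed on the page plus the parser-affecting preferences (as described in the quoted explanation), and a reader mode that tolerates a copy up to a minute old. The names `Entry`, `cache_key`, `lookup` and `max_stale`, and the example preference dict, are assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class Entry:
    html: str
    cached_at: float  # time at which this copy was parsed


def cache_key(page_id: int, parser_prefs: dict) -> str:
    """Same page and same parser-affecting prefs give the same key."""
    blob = json.dumps(parser_prefs, sort_keys=True).encode()
    return f"{page_id}:{hashlib.sha1(blob).hexdigest()[:12]}"


def lookup(cache, key, page_touched, now, max_stale=0.0):
    """Return cached HTML, or None when the page must be reparsed.

    A copy parsed after the last edit is always valid. An older copy is
    accepted only if it is younger than max_stale seconds (reader mode).
    """
    entry = cache.get(key)
    if entry is None:
        return None
    if entry.cached_at >= page_touched:
        return entry.html
    if now - entry.cached_at < max_stale:
        return entry.html
    return None


custom = {"stubthreshold": 500}  # illustrative non-default preference
cache = {cache_key(1, custom): Entry("<p>old</p>", cached_at=100.0)}

# Page edited at t=130, reader arrives at t=140.
print(lookup(cache, cache_key(1, custom), page_touched=130.0, now=140.0))
# None
print(lookup(cache, cache_key(1, custom), 130.0, 140.0, max_stale=60.0))
# <p>old</p>
```

A user with different preferences gets a different key and therefore a miss, which is the fragmentation cost discussed earlier in the thread.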

Strainu

1:16 p.m.

On Fri, Jul 30, 2010 at 11:49 AM, John Vandenberg jayvdb@gmail.com wrote:

...

Could we add a logged-in-reader mode, for people who are infrequent contributors but wish to be logged in for the prefs?

They could be served a slightly old cached version of the page when one is available for their prefs, e.g. if the cached version is less than a minute old. The downside is that if they see an error, it may already be fixed. OTOH, if the page is being revised frequently, the same is likely to happen anyway. The text could be stale before it hits the wire due to parsing delay.

That could work on the first 3-5 Wikipedias by number of visitors; for the rest you are most likely to serve VERY old versions (or just re-parse the page if you set a low threshold).

Strainu

Tei

2 Aug 2 Aug

12:51 p.m.

On 28 July 2010 21:13, jidanni@jidanni.org wrote:

...

Seems to me playing the role of the average dumb user, that en.wikipedia.org is one of the rather slow websites of the many websites I browse.

No matter what browser, it takes more seconds from the time I click on a link to the time when the first bytes of the HTTP response start flowing back to me.

Seems facebook is more zippy.

It seems fast here: 130ms.

The first load of the homepage can be slow: http://zerror.com/unorganized/wika/lader1.png http://en.wikipedia.org/wiki/Main_Page (I need a bigger monitor; the escalator doesn't fit on my screen)

-- -- ℱin del ℳensaje.

Domas Mituzas

1:25 p.m.

...

The first load of the homepage can be slow: http://zerror.com/unorganized/wika/lader1.png http://en.wikipedia.org/wiki/Main_Page (I need a bigger monitor; the escalator doesn't fit on my screen)

well, no wonder the first page load is sluggish, with 12 style sheets and 12 JavaScript files - there's plenty of low-hanging fruit there.

Domas

Tei

2:51 p.m.

On 2 August 2010 13:25, Domas Mituzas midom.lists@gmail.com wrote:

...

...
The first load of the homepage can be slow: http://zerror.com/unorganized/wika/lader1.png http://en.wikipedia.org/wiki/Main_Page (I need a bigger monitor; the escalator doesn't fit on my screen)

well, no wonder the first page load is sluggish, with 12 style sheets and 12 JavaScript files - there's plenty of low-hanging fruit there.

Maybe a theme could take the individual icons that it uses and combine them all into a single PNG file.

Updating any icon in that theme would mean updating the whole combined image (I hear ImageMagick has a tool just for that, so it can be scripted). This seems to be what sites like www.google.com and www.gmail.com are doing.

I say theme, because I suppose all other images live in a wiki world and can't be combined that way.

Maybe the idea that resource=file must die in the 2011 internet :-/

-- -- ℱin del ℳensaje.

Roan Kattouw

3:24 p.m.

2010/8/2 Tei oscar.vives@gmail.com:

...

Maybe a theme could take the individual icons that it uses and combine them all into a single PNG file.

This technique is called spriting, and the single combined image file is called a sprite. We've done this with e.g. the enhanced toolbar buttons, but it doesn't work in all cases.
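
For readers who have not met spriting before, the bookkeeping side of it can be sketched like this: the icons are stacked into one image and each CSS rule shows a different slice of it via a background offset. The function `sprite_css`, the `.icon-*` class names and `toolbar.png` are hypothetical, and the actual image stitching (e.g. with ImageMagick) is not shown.

```python
def sprite_css(icons: dict, sheet_url: str) -> str:
    """Stack icons vertically and emit one CSS rule per icon.

    icons maps an icon name to its (width, height) in pixels.
    """
    rules = []
    y = 0  # vertical position of the current icon inside the sprite
    for name, (width, height) in icons.items():
        rules.append(
            f".icon-{name} {{ background: url({sheet_url}) 0 {-y}px no-repeat; "
            f"width: {width}px; height: {height}px; }}"
        )
        y += height
    return "\n".join(rules)


print(sprite_css({"bold": (16, 16), "italic": (16, 16)}, "toolbar.png"))
```

All icons then arrive in a single HTTP request, at the cost of regenerating the combined image and offsets whenever one icon changes.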

...

Maybe the idea that resource=file must die in the 2011 internet :-/

The resourceloader branch contains work in progress on aggressively combining and minifying JavaScript and CSS. The mapping of one resource = one file will be preserved, but the mapping of one resource = one REQUEST will die: it'll be possible, and encouraged, to obtain multiple resources in one request.

Roan Kattouw (Catrope)
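
The "one resource = one file, but several resources per request" idea can be illustrated with a toy loader. This is not the resourceloader branch's code; `RESOURCES`, `minify_css` and `load` are invented for the sketch, and the minifier is deliberately crude.

```python
import re

# Hypothetical in-memory stand-in for files on disk: one resource = one file.
RESOURCES = {
    "skin.css": "a { color: blue }\n/* stub links */\na.stub { color: red }\n",
    "print.css": "body { margin: 0 }\n",
}


def minify_css(text: str) -> str:
    """Very rough minifier: drop comments and collapse whitespace."""
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)
    return re.sub(r"\s+", " ", text).strip()


def load(names: list) -> str:
    """Answer one request for several resources with one combined body."""
    return "\n".join(minify_css(RESOURCES[n]) for n in names)


print(load(["skin.css", "print.css"]))
# a { color: blue } a.stub { color: red }
# body { margin: 0 }
```

The files stay separate for developers, while the browser pays for one round trip instead of one per file.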

John Vandenberg

4:50 p.m.

On Mon, Aug 2, 2010 at 11:24 PM, Roan Kattouw roan.kattouw@gmail.com wrote:

...

The resourceloader branch contains work in progress on aggressively combining and minifying JavaScript and CSS. The mapping of one resource = one file will be preserved, but the mapping of one resource = one REQUEST will die: it'll be possible, and encouraged, to obtain multiple resources in one request.

Does that approach gain much over HTTP pipelining?

-- John Vandenberg

Aryeh Gregor

5:23 p.m.

On Mon, Aug 2, 2010 at 10:50 AM, John Vandenberg jayvdb@gmail.com wrote:

...

Does that approach gain much over HTTP pipelining?

Yes. Browsers don't use HTTP pipelining in practice, because transparent proxies at ISPs cause sites to break if they do, and there's no reliable way to detect them. Opera does pipelining and blacklists bad ISPs or something, I think. See:

https://bugzilla.mozilla.org/show_bug.cgi?id=264354

Tei

5:15 p.m.

On 2 August 2010 15:24, Roan Kattouw roan.kattouw@gmail.com wrote: ...

...

...
Maybe the idea that resource=file must die in the 2011 internet :-/

The resourceloader branch contains work in progress on aggressively combining and minifying JavaScript and CSS. The mapping of one resource = one file will be preserved, but the mapping of one resource = one REQUEST will die: it'll be possible, and encouraged, to obtain multiple resources in one request.

:-O

That is an awesome solution, considering the complexity of the real-world problems. Elegant, and as a side effect it may remove some bloat.

-- -- ℱin del ℳensaje.

Tei

11 Aug 11 Aug

6:51 p.m.

On 2 August 2010 15:24, Roan Kattouw roan.kattouw@gmail.com wrote:

...

2010/8/2 Tei oscar.vives@gmail.com:

...
Maybe a theme could take the individual icons that it uses and combine them all into a single PNG file.

This technique is called spriting, and the single combined image file is called a sprite. We've done this with e.g. the enhanced toolbar buttons, but it doesn't work in all cases.

...
Maybe the idea that resource=file must die in the 2011 internet :-/

The resourceloader branch contains work in progress on aggressively combining and minifying JavaScript and CSS. The mapping of one resource = one file will be preserved, but the mapping of one resource = one REQUEST will die: it'll be possible, and encouraged, to obtain multiple resources in one request.

A friend recommended an excellent book to me (yes, books are still useful in this digital age). It is called "Even Faster Websites". Everyone should make their company buy this book. It is excellent.

Reading this book has scared me for life. There are things that are worse than I thought: JS forcing everything into a single thread (even stopping the download of new resources!) while it downloads and while it executes. How about the fact that 90% of the code is not needed at onload, but is loaded before onload anyway? It is probably a much better idea to read that book than my post (that's a good line, I will end my email with it).

Some comments on Wikipedia speed:

1) "http://en.wikipedia.org" is not a page itself; it is a redirect to http://en.wikipedia.org/wiki/Main_Page. Can't "http://en.wikipedia.org/wiki/Main_Page" be served from "http://en.wikipedia.org"?

Wait... this would break relative links on the front page, but... these are absolute! <a href="/wiki/Wikipedia" title="Wikipedia">Wikipedia</a>

2) The CSS loads fine. \o/ Probably the combining effort will save time anyway.

3) Probably the CSS rules can be optimized for speed )-: Probably not.

4) A bunch of JS files, loaded one after another, sequentially! This is worse than a C program reading a file from disk byte by byte!! Combining will probably save a lot. Or use a strategy that makes the browser download these files concurrently and execute them in order.

5) There are a lot of img files. Does the page really need that many? sprinting?

Total: 13.63 seconds.

You guys want to make this faster with cache optimization. But maybe the problem is not bandwidth, but latency. Latency accumulates even with HEAD requests that result in a 302. All the 302s in the world will not make the page feel smooth if the latency has already accumulated into 3+ seconds territory. ...Or am I wrong?
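
A back-of-the-envelope version of the latency argument, as a sketch only. It assumes the worst case where every request waits for the previous one, which real browsers only approximate; the function name is made up, and the numbers (12 style sheets plus 12 scripts, a 130 ms round trip) are borrowed from earlier posts purely as illustrative inputs.

```python
def sequential_load_time(requests: int, rtt_s: float, transfer_s: float = 0.0) -> float:
    """Seconds spent on `requests` fetches made strictly one after another."""
    return requests * (rtt_s + transfer_s)


# 12 style sheets + 12 scripts, one round trip each, ignoring transfer time.
print(f"{sequential_load_time(24, 0.13):.2f} s")  # 3.12 s
# The same resources combined into one CSS and one JS request.
print(f"{sequential_load_time(2, 0.13):.2f} s")   # 0.26 s
```

Under these assumptions the cost is dominated by the number of round trips, not by the bytes moved, which is why combining requests helps even when everything is small.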

It is probably a much better idea to read that book than my post

-- -- ℱin del ℳensaje.

Roan Kattouw

7 p.m.

2010/8/11 Tei oscar.vives@gmail.com:

...

"http://en.wikipedia.org" is not a page itself; it is a redirect to http://en.wikipedia.org/wiki/Main_Page. Can't "http://en.wikipedia.org/wiki/Main_Page" be served from "http://en.wikipedia.org"?

Wait... this would break relative links on the front page, but... these are absolute! <a href="/wiki/Wikipedia" title="Wikipedia">Wikipedia</a>

That would get complicated with Squid cache AFAIK. One redirect (which is also a 301 Moved Permanently, which means clients may cache the redirect destination) isn't that bad, right?

...

A bunch of JS files, loaded one after another, sequentially! This is worse than a C program reading a file from disk byte by byte!! Combining will probably save a lot. Or use a strategy that makes the browser download these files concurrently and execute them in order.

I'll quote my own post from this thread:

...

...
The resourceloader branch contains work in progress on aggressively combining and minifying JavaScript and CSS. The mapping of one resource = one file will be preserved, but the mapping of one resource = one REQUEST will die: it'll be possible, and encouraged, to obtain multiple resources in one request.

We're aware of this problem, or we wouldn't be spending paid developers' time on this resource loader project.

...

You guys want to make this faster with cache optimization. But maybe the problem is not bandwidth, but latency. Latency accumulates even with HEAD requests that result in a 302. All the 302s in the world will not make the page feel smooth if the latency has already accumulated into 3+ seconds territory. ...Or am I wrong?

I'm assuming you mean 304 (Not Modified)? 302 (Found) means the same as 301 except it's not cacheable.

We're not intending to do many requests resulting in 304s; we're intending to reduce the number of requests made and to keep the long client-side cache expiry times (Cache-Control: max-age=largenumber) that we already use.

Roan Kattouw (Catrope)
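
To make the 304 versus max-age distinction concrete, here is a minimal sketch of how a server might answer a conditional GET. The function `respond` and the 30-day max-age are assumptions for illustration, not Wikimedia's configuration; only the standard-library date helpers are real.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def respond(last_modified: datetime, if_modified_since: Optional[str]):
    """Return (status, headers) for a GET on a cacheable asset."""
    headers = {
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Cache-Control": "max-age=2592000",  # 30 days; an arbitrary large number
    }
    if if_modified_since:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            return 304, headers  # client copy is current: no body is sent
    return 200, headers


lm = datetime(2010, 8, 11, 19, 0, 0, tzinfo=timezone.utc)
print(respond(lm, None)[0])                             # 200
print(respond(lm, "Wed, 11 Aug 2010 19:00:00 GMT")[0])  # 304
print(respond(lm, "Tue, 10 Aug 2010 19:00:00 GMT")[0])  # 200
```

A 304 still costs a full round trip. While the max-age lifetime has not expired, the browser does not contact the server at all, which is the case Roan describes wanting to keep.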

Aryeh Gregor

9:52 p.m.

On Wed, Aug 11, 2010 at 12:51 PM, Tei oscar.vives@gmail.com wrote:

...

Reading this book has scared me for life. There are things that are worse than I thought: JS forcing everything into a single thread (even stopping the download of new resources!) while it downloads and while it executes.

In newer browsers this is no longer the case. They can fetch other resources while a script is loading. They can't begin rendering further until the script finishes executing, but this isn't such a big issue, since scripts usually don't do much work at the point of inclusion. (As Roan says, work is under way to improve this, but I thought I'd point out that it's not quite as bad as you say.)

...

There are a lot of img files. Does the page really need that many? sprinting?

Total: 13.63 seconds.

Some usability stuff is sprited, I think. Overall, though, spriting is a pain in the neck, and we don't load enough images that it's necessarily worth it to sprite too aggressively. Image loads don't block page layout, so it's not a huge deal. I think script optimization is much more important right now.

...

You guys want to make this faster with cache optimization. But maybe the problem is not bandwidth, but latency. Latency accumulates even with HEAD requests that result in a 302. All the 302s in the world will not make the page feel smooth if the latency has already accumulated into 3+ seconds territory. ...Or am I wrong?

I've noticed that when browsing from my phone, the redirect to m. is a noticeable delay, sometimes a second or more. We don't serve many redirects other than that, though, AFAIK.

...

It is probably a much better idea to read that book than my post

I've read Steve Souders' High-Performance Websites, which is probably pretty similar in content.

Roan Kattouw

11:55 p.m.

2010/8/11 Aryeh Gregor Simetrical+wikilist@gmail.com:

...

I've noticed that when browsing from my phone, the redirect to m. is a noticeable delay, sometimes a second or more. We don't serve many redirects other than that, though, AFAIK.

Is that a server-side redirect, or is it done in JS? In the latter case, it taking long would make sense, and would actually be slowed down by moving all <script> tags to the bottom (incidentally, that's what I've done today in the resourceloader branch).

Roan Kattouw (Catrope)

Aryeh Gregor

12 Aug 12 Aug

12:42 a.m.

On Wed, Aug 11, 2010 at 5:55 PM, Roan Kattouw roan.kattouw@gmail.com wrote:

...

Is that a server-side redirect, or is it done in JS? In the latter case, it taking long would make sense, and would actually be slowed down by moving all <script> tags to the bottom (incidentally, that's what I've done today in the resourceloader branch).

I have no idea. If it were server-side, it would have to be done in Squid, presumably. A JS redirect would explain a lot of the slowness -- an HTTP redirect shouldn't be that slow.

Platonides

1:40 a.m.

Aryeh Gregor wrote:

...

On Wed, Aug 11, 2010 at 5:55 PM, Roan Kattouw roan.kattouw@gmail.com wrote:

...
Is that a server-side redirect, or is it done in JS? In the latter case, it taking long would make sense, and would actually be slowed down by moving all <script> tags to the bottom (incidentally, that's what I've done today in the resourceloader branch).

I have no idea. If it were server-side, it would have to be done in Squid, presumably. A JS redirect would explain a lot of the slowness -- an HTTP redirect shouldn't be that slow.

It is done in JavaScript (see extensions/WikimediaMobile). Plus, those browsers won't be too optimized.

Domas Mituzas

12:01 a.m.

Hi!

<3 enthusiasm :)

...

"http://en.wikipedia.org" is not a page itself; it is a redirect to http://en.wikipedia.org/wiki/Main_Page. Can't "http://en.wikipedia.org/wiki/Main_Page" be served from "http://en.wikipedia.org"?

Our major entrance is usually not via the main page, so this would be a niche optimization that does not really matter that much (well, ~2% of article views go to the main page, and only 15% of those load http://en.wikipedia.org/, and... :)

...

The CSS loads fine. \o/

No, they don't, at least not on first pageview.

...

Probably the combining effort will save time anyway.

Yes. We have way too many separate css assets.

...

A bunch of JS files, loaded one after another, sequentially! This is worse than a C program reading a file from disk byte by byte!!

Actually, if a program reads byte by byte, the whole page is already cached by the OS, so it is not that expensive ;-) And yes, we know that we have a bit too many JS files loaded, and there's work to fix that (Roan wrote about that).

...

Combining will probably save a lot. Or use a strategy that makes the browser download these files concurrently and execute them in order.

:-) Thanks for stating the obvious.

...

There are a lot of img files. Does the page really need that many? sprinting?

It is a PITA to sprite (not sprint) community-uploaded images, and again, that would work only for the front page, which is not our main target. The skin should of course be sprited.

...

Total: 13.63 seconds.

Quite a slow connection you've got there. I get 1s rendering times with cross-Atlantic trips (and much better times if I get served by European caches :)

...

You guys want to make this faster with cache optimization. But maybe the problem is not bandwidth, but latency. Latency accumulates even with HEAD requests that result in a 302. All the 302s in the world will not make the page feel smooth if the latency has already accumulated into 3+ seconds territory. ...Or am I wrong?

You are. First of all, skin assets do not trigger IMS (If-Modified-Since) requests; they are all cached. We force browsers to do IMS on page views so that browsers pick up edits (it is a wiki).

...

It is probably a much better idea to read that book than my post

I'm sorry to disappoint you, but none of the issues you wrote down here are new. If, after reading any books or posts, you think we have deficiencies, it is mostly for one of two reasons: either we're lazy and didn't implement something, or it is something we need in order to maintain the wiki model.

Though of course, it is all fresh and scared you for life, we've been doing this for life. ;-)

Domas

Tei

13 Aug 13 Aug

12:55 p.m.

On 12 August 2010 00:01, Domas Mituzas midom.lists@gmail.com wrote: ...

...

I'm sorry to disappoint you, but none of the issues you wrote down here are new. If, after reading any books or posts, you think we have deficiencies, it is mostly for one of two reasons: either we're lazy and didn't implement something, or it is something we need in order to maintain the wiki model.

I am not disappointed. The wiki model makes it hard, because everything can be modified, because the whole thing is gigantic and has inertia, and because of the need to support a gigantic list of languages that would make the United Nations look timid. And I know you guys are an awesome bunch, and lots of eyes have been put on the problems.

This makes MediaWiki an ideal scenario for thinking about techniques to make the web faster.

Here's a cookie: a really nice plugin for Firebug to check speed. http://code.google.com/p/page-speed/

-- -- ℱin del ℳensaje.

Aryeh Gregor

7:38 p.m.

On Fri, Aug 13, 2010 at 6:55 AM, Tei oscar.vives@gmail.com wrote:

...

I am not disappointed. The wiki model makes it hard, because everything can be modified, because the whole thing is gigantic and has inertia, and because of the need to support a gigantic list of languages that would make the United Nations look timid.

Actually, wikis are much easier to optimize than most other classes of apps. The pages change only rarely compared to something like Facebook or Google, which really have to regenerate every single page customized to the viewer. That's why we get by with so little money compared to real organizations.
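
As a rough illustration of why rarely changing pages are cheap to serve, the Python sketch below renders a page once per revision and reuses the result for every later view. `PageCache`, `view` and the `render` callback are hypothetical names invented for this example; it is not how MediaWiki's parser cache or Squid is implemented, only the general idea.

```python
class PageCache:
    """Cache rendered HTML per page, regenerating only after an edit."""

    def __init__(self, render):
        self._render = render   # expensive wikitext -> HTML function
        self._cache = {}        # title -> (revision, html)
        self.renders = 0        # how many times we actually rendered

    def view(self, title, revision, text):
        cached = self._cache.get(title)
        if cached is not None and cached[0] == revision:
            return cached[1]    # same revision: every viewer shares this HTML
        # First view, or the page was edited since we cached it.
        self.renders += 1
        html = self._render(text)
        self._cache[title] = (revision, html)
        return html
```

With this shape, a thousand readers of an unedited page cost one render. A site that customizes every page to the viewer cannot share output this way, because the viewer would have to be part of the cache key.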

jidanni＠jidanni.org

28 Sep

12:21 a.m.

...

"AG" == Aryeh Gregor Simetrical+wikilist@gmail.com writes:

AG> Facebook...

Speaking of which, I hear they compile their PHP for extra speed. Anyway, http://www.useit.com/alertbox/response-times.html mentions the pain of reading slow sites.

Oldak Quill

2 Aug

1:13 p.m.

On 28 July 2010 20:13, jidanni@jidanni.org wrote:

...

Seems to me playing the role of the average dumb user, that en.wikipedia.org is one of the rather slow websites of the many websites I browse.

No matter what browser, it takes more seconds from the time I click on a link to the time when the first bytes of the HTTP response start flowing back to me.

Seems facebook is more zippy.

Maybe Mediawiki is not "optimized".

For what it's worth, Alexa.com lists the average load time of the websites they catalogue. I'm not sure what the metrics they use are, and I would guess they hit the squid cache and are in the United States.

Alexa.com lists the following average load times as of now:

wikipedia.org: Fast (1.016 Seconds), 74% of sites are slower.
facebook.com: Average (1.663 Seconds), 50% of sites are slower.

Oldak Quill

1:21 p.m.

On 2 August 2010 12:13, Oldak Quill oldakquill@gmail.com wrote:

...

On 28 July 2010 20:13, jidanni@jidanni.org wrote:

...
Seems to me playing the role of the average dumb user, that en.wikipedia.org is one of the rather slow websites of the many websites I browse.

No matter what browser, it takes more seconds from the time I click on a link to the time when the first bytes of the HTTP response start flowing back to me.

Seems facebook is more zippy.

Maybe Mediawiki is not "optimized".

For what it's worth, Alexa.com lists the average load time of the websites they catalogue. I'm not sure what the metrics they use are, and I would guess they hit the squid cache and are in the United States.

Alexa.com lists the following average load times as of now:

wikipedia.org: Fast (1.016 Seconds), 74% of sites are slower.
facebook.com: Average (1.663 Seconds), 50% of sites are slower.

An addendum to the above message:

According to the Alexa.com help page "Average Load Times: Speed Statistics" (http://www.alexa.com/help/viewtopic.php?f=6&t=1042): "The Average Load Time ... [is] based on load times experienced by Alexa users, and measured by the Alexa Toolbar, during their regular web browsing."

So although US browsers might be overrepresented in this sample (I'm just guessing, I have no figures to support this statement), the Alexa sample should include many non-US browsers, assuming that the figure reported by Alexa.com is reflective of its userbase.

-- Oldak Quill (oldakquill@gmail.com)

Happy-melon

3 Aug

9:16 p.m.

"Oldak Quill" oldakquill@gmail.com wrote in message news:AANLkTik8sqmaEtWVG8eta+cA49i08rFBrmvicSms+y34@mail.gmail.com...

...

On 2 August 2010 12:13, Oldak Quill oldakquill@gmail.com wrote:

...
On 28 July 2010 20:13, jidanni@jidanni.org wrote:

...
Seems to me playing the role of the average dumb user, that en.wikipedia.org is one of the rather slow websites of the many websites I browse.

No matter what browser, it takes more seconds from the time I click on a link to the time when the first bytes of the HTTP response start flowing back to me.

Seems facebook is more zippy.

Maybe Mediawiki is not "optimized".

For what it's worth, Alexa.com lists the average load time of the websites they catalogue. I'm not sure what the metrics they use are, and I would guess they hit the squid cache and are in the United States.

Alexa.com lists the following average load times as of now:

wikipedia.org: Fast (1.016 Seconds), 74% of sites are slower.
facebook.com: Average (1.663 Seconds), 50% of sites are slower.

An addendum to the above message:

According to the Alexa.com help page "Average Load Times: Speed Statistics" (http://www.alexa.com/help/viewtopic.php?f=6&t=1042): "The Average Load Time ... [is] based on load times experienced by Alexa users, and measured by the Alexa Toolbar, during their regular web browsing."

So although US browsers might be overrepresented in this sample (I'm just guessing, I have no figures to support this statement), the Alexa sample should include many non-US browsers, assuming that the figure reported by Alexa.com is reflective of its userbase.

And the average Alexa toolbar user on Facebook is logged in and using it to see what their friends were up to last night, with masses of personalised content, while the average Alexa toolbar user on Wikipedia is a reader seeing the same page as everyone else. We definitely have the theoretical advantage.

--HM

wikitech-l@lists.wikimedia.org

65 comments

19 participants

participants (19)

Alex Brollo
Andrew Garrett
Aryeh Gregor
Chad
Daniel Friesen
Daniel Kinzler
David Goodman
Domas Mituzas
Happy-melon
jidanni＠jidanni.org
John Vandenberg
K. Peachey
Lars Aronsson
Liangent
Oldak Quill
Platonides
Roan Kattouw
Strainu
Tei