Re: [Wikitech-l] wikipedia is one of the slower sites on the web

1 Aug 2010


On Fri, Jul 30, 2010 at 1:32 PM, John Vandenberg jayvdb@gmail.com wrote:
...
> So you're telling my theoretical logged-in-reader to use default
> prefs, or log out, when the reason they are a logged-in-reader is so
> they can control their preferences..!

Yep.  You want features, you often pay a performance penalty.  In this
case the performance penalty should be reducible, or at least clearly
marked, but that's a general rule anyway.

...
> Surely there are a few common 'preference sets' which large numbers of
> readers use?

Changing any parser-related preference will kill page load times.

...
> There are plenty of pages which change more than once per minute,

No pages change once per minute on average.  That would be 1440 edits
per day, or more than 500,000 per year.  Only one page on enwiki
(WP:AIAV) has more than 500,000 edits *total*, let alone per year.
There were only 18 edits to WP:ANI between 17:00 and 18:00 today, just
for example, which is less than one edit every three minutes.  There
are some times when a particular page changes many times in a minute
-- like when a major event occurs and everyone rushes to update an
article -- but these are rare and don't last long.
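
To make the arithmetic above easy to check, here is a minimal sketch in Python.  The function names are mine, chosen for illustration; the only inputs are the figures already quoted (one edit per minute, 18 edits in one hour).

```python
MINUTES_PER_DAY = 24 * 60  # 1440


def edits_per_year(edits_per_minute):
    """Edits in a 365-day year at a constant per-minute rate."""
    return edits_per_minute * MINUTES_PER_DAY * 365


def minutes_between_edits(edits, window_minutes):
    """Average gap between edits observed in a time window."""
    return window_minutes / edits


if __name__ == "__main__":
    # One edit per minute, sustained for a year.
    print(edits_per_year(1))
    # 18 edits to WP:ANI in one hour.
    print(round(minutes_between_edits(18, 60), 2))
```

One edit per minute works out to 525,600 edits per year, and 18 edits in an hour is one edit every 3.33 minutes or so, which matches the figures given.
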
You also seem to be missing how many different possible parser cache
keys there are.  It's not like there are only five or ten possible
versions.  As I said before -- if you change your parser-related
settings around a bunch, you will probably rarely or never hit parser
cache except when you yourself viewed the page since it last changed.
There are too many possible permutations of settings here.
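
A rough sketch of why the permutations add up, in Python.  The per-preference counts below are purely illustrative assumptions of mine, not measured values from any wiki; the point is only that the number of distinct cache variants per page is the product of the choices for each setting.

```python
from math import prod

# ASSUMPTION: hypothetical numbers of distinct values per parser-related
# preference.  These are placeholders for illustration, not real counts.
ASSUMED_CHOICES = {
    "math": 6,
    "date preference": 5,
    "numberheadings": 2,
    "interface language": 250,
    "thumbsize": 6,
}


def count_variants(choices):
    """Number of distinct setting combinations, i.e. possible cache variants."""
    return prod(choices.values())


if __name__ == "__main__":
    print(count_variants(ASSUMED_CHOICES))
```

Under these made-up counts a single page could have tens of thousands of distinct parser cache variants, so any one non-default combination is unlikely to have been rendered by someone else since the last edit.
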
...
> however I'd expect a much higher threshold, variable based on the
> volume of page activity, or some other mechanism to determine whether
> the cached version is acceptably stale for the logged-in-reader.
>
> There is no infrastructure required for extra stale entries.  If the
> viewer is happy to accept the slightly stale revision for their chosen
> prefs, serve it.  If not, reparse.

Look, this is just not a useful solution, period.  It would be
extremely ineffective.  If you extended the permitted staleness level
so much that it would be moderately effective, it would be useless,
because you'd be seeing hours- or days-old articles.  On the other
hand, for a comparable amount of effort you could implement a solution
that actually is effective, like adding an extra postprocessing stage.

On Fri, Jul 30, 2010 at 8:22 PM, jidanni@jidanni.org wrote:
...
> Hmmm, maybe they're there amongst the "!"s below.
> $ lynx --source http://en.wikipedia.org/wiki/Main_Page | grep parser
> Expensive parser function count: 44/500
> <!-- Saved in parser cache with key enwiki:pcache:idhash:15580374-0!3!0!default!1!en!4!edit=0 and timestamp 20100731001330 -->

Yes.  That key is generated by the following line in
includes/parser/ParserCache.php:

    $key = wfMemcKey( 'pcache', 'idhash',
        "{$pageid}-{$renderkey}!{$hash}{$edit}{$printable}" );

The relevant bit of that, for us, is $hash, which is generated by
getPageRenderingHash() in includes/User.php:

    // stubthreshold is only included below for completeness,
    // it will always be 0 when this function is called by parsercache.
    $confstr = $this->getOption( 'math' );
    $confstr .= '!' . $this->getOption( 'stubthreshold' );
    if ( $wgUseDynamicDates ) {
        $confstr .= '!' . $this->getDatePreference();
    }
    $confstr .= '!' . ( $this->getOption( 'numberheadings' ) ? '1' : '' );
    $confstr .= '!' . $wgLang->getCode();
    $confstr .= '!' . $this->getOption( 'thumbsize' );
    // add in language specific options, if any
    $extra = $wgContLang->getExtraHashOptions();
    $confstr .= $extra;

So anonymous users on enwiki have math=3, stubthreshold=0 (although
the comment indicates this is irrelevant somehow), date preferences =
'default', numberheadings = 1, language = 'en', thumbsize = 4.
Changing any of those from the default will make you miss the parser
cache on enwiki.
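
If you want to read such a key yourself, here is a minimal sketch in Python (not MediaWiki code) that splits it into the fields listed above.  It assumes the field order of the PHP snippet with $wgUseDynamicDates enabled and no language-specific extra options, which is what the Main_Page key appears to show; the function and constant names are hypothetical.

```python
# ASSUMPTION: field order follows getPageRenderingHash() as quoted, with the
# date preference present and no extra language-specific hash options.
FIELDS = ("math", "stubthreshold", "date", "numberheadings", "language", "thumbsize")

# Anonymous-user values on enwiki, as read off the key in this thread.
ANON_DEFAULTS = {
    "math": "3",
    "stubthreshold": "0",
    "date": "default",
    "numberheadings": "1",
    "language": "en",
    "thumbsize": "4",
}


def split_pcache_key(key):
    """Split a parser cache key into page id, render key, options and trailer."""
    prefix, _, rest = key.rpartition(":")
    head, *parts = rest.split("!")
    page_id, _, render_key = head.partition("-")
    return {
        "prefix": prefix,
        "page_id": page_id,
        "render_key": render_key,
        "options": dict(zip(FIELDS, parts[: len(FIELDS)])),
        "trailer": parts[len(FIELDS):],  # e.g. ["edit=0"]
    }


def non_default_options(options):
    """Names of options whose value differs from the anonymous defaults."""
    return [name for name in FIELDS if options.get(name) != ANON_DEFAULTS[name]]


if __name__ == "__main__":
    key = "enwiki:pcache:idhash:15580374-0!3!0!default!1!en!4!edit=0"
    parsed = split_pcache_key(key)
    print(parsed["options"])
    print(non_default_options(parsed["options"]))
```

Any option reported by non_default_options() puts the user on a different key from the one anonymous readers populate, which is the cache miss described above.
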

On Sat, Jul 31, 2010 at 12:58 PM, Daniel Kinzler daniel@brightbyte.de wrote:
...
> This is a few years old, but I guess it's still relevant:
> http://brightbyte.de/page/Client-side_skins_with_XSLT
> I experimented a bit with ways to do all the per-user preference
> stuff on the client side, with XSLT.

XSLT seems a bit baroque.  If the goal is to use script to avoid cache
misses, why not just use plain old JavaScript?  A lot more people know
it, it supports progressive rendering (does XSLT?), and it's much
better supported.  In particular, your approach of serving something
other than HTML and relying on XSLT support to transform it will
seriously confuse text browsers, search engines, etc.
