In most cases, the vast majority of a wiki's traffic comes from users who
are not logged in. Every cache layer should therefore work so that the page
does not have to be rendered again by PHP. One example is MediaWiki's File
caching system:
http://www.mediawiki.org/wiki/Manual:File_cache
It's commonly known that running PHP takes a lot more CPU than serving
static content, and that is the concept behind MW's File caching system.
My situation: I'm on a shared server where they don't want me to go above
certain CPU limits (CPU seconds per hour). I'm not able to install Squid,
APC or memcached. Lately I've been having problems with CPU usage due to
traffic surges and malicious bots. I don't want to spend more money on
hosting if I don't have to, but that option is open if the server company
thinks I should upgrade. I want to be a good client and not affect other
users on the server.
Here's a problem I see with MW's File caching system: a cached page still
causes PHP files to be processed. For example, here are some actual lines
of code from one of my wiki's pages when it is loaded from the File caching
system. Wikipedia's pages also load these PHP files, thus increasing CPU
usage:
------------
<link rel="stylesheet"
href="http://mywikisite.com/w/*load.php*?debug=false&lang=en&a…
/>
<link rel="stylesheet"
href="http://mywikisite.com/w/*load.php*?debug=false&lang=en&a…
/>
<script
src="http://mywikisite.com/w/*load.php*
?debug=false&lang=en&modules=skins.vector&only=scripts&skin=vector&*"></script>
<script
src="http://mywikisite.com/w/*load.php*
?debug=false&lang=en&modules=site&only=scripts&skin=vector&*"></script>
------------
To confirm this, I looked at the static HTML file generated by the cache,
and these lines are present in the HTML code, whether it is viewed from the
browser or in an HTML editor. So load.php is being made to run at least 4
times during each page load. It might be 3 times for my site if Flagged
Revisions weren't installed, but again, Wikipedia has similar lines of code
which make multiple calls to load.php. Yesterday I had a huge traffic spike
and the server process scan confirmed that load.php was running a lot of
times. If three pages are loaded at about the same time, that means 12
calls to load.php (3 pages x 4 calls each).
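
For anyone who wants to repeat that check on their own cached pages, here is
a rough Python sketch that counts the <link> and <script> tags pointing at
load.php in an HTML string. The sample page and its query strings are
placeholders I made up to mirror the four tags quoted above (the real URLs
are longer), and count_load_php is just an illustrative helper name, not
anything that ships with MediaWiki.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LoadPhpCounter(HTMLParser):
    """Collects every <link href> and <script src> whose path ends in /load.php."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        # Only stylesheets and scripts are of interest here.
        wanted = {"link": "href", "script": "src"}.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                if urlparse(value).path.lower().endswith("/load.php"):
                    self.hits.append(value)


def count_load_php(html: str) -> int:
    """Return how many load.php requests a browser would make for this page."""
    parser = LoadPhpCounter()
    parser.feed(html)
    parser.close()
    return len(parser.hits)


# Hypothetical cached page: two stylesheets and two scripts served by
# load.php, plus one static image that should not be counted.
SAMPLE = """
<link rel="stylesheet" href="http://mywikisite.com/w/load.php?only=styles&skin=vector" />
<link rel="stylesheet" href="http://mywikisite.com/w/load.php?modules=site&only=styles" />
<script src="http://mywikisite.com/w/load.php?modules=skins.vector&only=scripts"></script>
<script src="http://mywikisite.com/w/load.php?modules=site&only=scripts"></script>
<img src="http://mywikisite.com/w/images/logo.png" />
"""

if __name__ == "__main__":
    per_page = count_load_php(SAMPLE)
    print("load.php calls per page:", per_page)
    print("load.php calls for 3 simultaneous page loads:", 3 * per_page)
```

This only counts references in the HTML. It says nothing about whether the
browser already has those responses cached, so it is an upper bound on the
PHP requests per page view rather than a measurement.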
I've also compared situations where I wasn't using any cache and where I
was using the File cache, and I didn't see any noticeable difference in the
CPU usage.
So I think MW's File caching system should be improved so that no PHP
processing is required at all for non-logged-in users. After all, the exact
same copy of the page is going to be served to every non-logged-in user, so
it makes sense to have 100% of that content static so it doesn't require
any PHP processing at all. The only time PHP should run is when content
changes. That should refresh the cache and regenerate the static content.
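
To make the idea concrete, here is a minimal Python sketch of "render only
when content changes". It is not how MediaWiki's File cache is implemented;
StaticPageCache, on_edit, serve and the render callback are all names I
invented for illustration, with render standing in for the expensive PHP
step.

```python
import tempfile
from pathlib import Path


class StaticPageCache:
    """Toy model: rendering happens on edit, page views are plain file reads."""

    def __init__(self, cache_dir, render):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.render = render      # expensive step (stands in for PHP rendering)
        self.render_calls = 0     # counts how often the expensive step ran

    def _path(self, title):
        # Hypothetical file naming; a real cache would need safer escaping.
        return self.cache_dir / (title.replace("/", "_") + ".html")

    def on_edit(self, title, text):
        """Content changed: render once and overwrite the static file."""
        self.render_calls += 1
        self._path(title).write_text(self.render(title, text), encoding="utf-8")

    def serve(self, title):
        """Anonymous page view: read the static file, no rendering at all."""
        return self._path(title).read_text(encoding="utf-8")


def demo():
    """Edit a page twice, view it many times, and report the render count."""
    with tempfile.TemporaryDirectory() as tmp:
        cache = StaticPageCache(
            tmp, render=lambda title, text: f"<h1>{title}</h1><p>{text}</p>"
        )
        cache.on_edit("Main Page", "Hello")
        for _ in range(1000):
            cache.serve("Main Page")
        cache.on_edit("Main Page", "Hello again")
        return cache.render_calls, cache.serve("Main Page")


if __name__ == "__main__":
    print(demo())
```

In this model a thousand page views cost zero renders; only the two edits
trigger the expensive step. The same would have to hold for the CSS and
JavaScript that load.php currently serves for the page to be fully static.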
In my case, I also have a mobile skin, so I've modified LocalSettings.php to
use a separate mobile cache directory if it's a mobile user. That gives me
two different sets of cache, one for desktop screens and another for mobile
users.
I know this isn't a problem for Wikipedia because they have a lot of
servers and additional great caching systems (Squid, memcached, etc.),
so everything is fast. But I'm thinking that if those calls to load.php
were cut down, it would make it possible for Wikipedia to use fewer servers
and would also make everyone else's sites run faster.
In any case, PHP processing should be used minimally, only when necessary.
If the page had no calls to PHP files, it would use less CPU and again, if
the same page is being served to non-logged in users, ideally there should
be no or very little PHP processing.
Any thoughts from the developers? Is it possible to modify the File caching
system to eliminate calls to load.php, so more of the content served is
static?
Thanks,
Dan