Client-side applications are becoming more important. There are plans for using AJAX, and request rates for machine interfaces such as query.php, action=raw and Special:Export are growing. The cost of these lightweight requests is often dominated by startup overhead. Profiling on Wikimedia currently gives 47ms for startup, including at least 8ms for extensions. This is too slow. My ideas for fixing this are as follows:
* Pervasive lazy initialisation. Remove $wgUser, $wgLang and $wgContLang, and replace them with wfGetUser(), wfGetLang() and wfGetContLang(). This brings them into line with the successful wfGetDB() interface. Those functions would simply return the variable if it already exists; if it doesn't, execution would pass to the relevant autoloaded module for initialisation (see the accessor sketch after this list). The same can be done for $wgTitle, $wgParser, $wgOut, $wgMessageCache... pretty much all the object globals. A version-independent accessor would be put in ExtensionFunctions.php, to support multi-version extensions. By deferring everything, we allow lightweight requests to get by without ever initialising a user object or loading a language file.
* Cached module registration. Each module provides a module.php in a unique directory, which sets a $module variable to an array describing its capabilities. Capabilities would include:
  * Autoloadable classes and their locations
  * Hooks
  * Special pages
  * Magic words
  * Messages
  * Parser elements and functions
The registered callback functions would all be static functions in autoloaded classes. To register modules at startup, a function would be called with an array listing the extension directories. This function would load the module.php files and merge the arrays, creating a master hashtable of capabilities and their locations. Crucially, this hashtable can be cached (see the module.php sketch after this list). This reduces average-case module initialisation from tens of lines of code per module to a single unserialize() for all modules.
The old extension setup files would be kept for backwards compatibility, but would never be loaded unless the extension explicitly does so when it obtains control via a hook.
* Bring Wikimedia's configuration cache into the committed codebase, and extend its abilities. Incorporate settings from DefaultSettings.php into the SiteConfiguration object, thus allowing them to be cached, avoiding a DefaultSettings.php load. The bulk of the default LocalSettings.php can be moved to a separate file, analogous to Wikimedia's InitialiseSettings.php. Thus in the typical case, all configuration, except programmatic modifications, will be done by a single unserialize plus a few stat calls.
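To make the lazy accessor idea concrete, here is a minimal sketch; the static holder and the factory call are placeholders, not a commitment to any particular initialisation path:

    function wfGetUser() {
        static $user = null;
        if ( $user === null ) {
            // First use triggers the class autoloader and session handling;
            // lightweight requests that never call this pay nothing.
            $user = User::newFromSession(); // placeholder factory method
        }
        return $user;
    }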
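And a rough sketch of what a declarative module.php and the cached merge might look like; the file layout, the cache location and the wfLoadModules() name are all made up for illustration:

    // extensions/MyExtension/module.php -- purely declarative, nothing executes
    $module = array(
        'classes'  => array( 'MyExtension' => dirname( __FILE__ ) . '/MyExtension.body.php' ),
        'hooks'    => array( 'ArticleSave' => array( 'MyExtension', 'onArticleSave' ) ),
        'messages' => dirname( __FILE__ ) . '/MyExtension.i18n.php',
    );

    // At startup: merge every module.php once, then reuse the cached result.
    function wfLoadModules( $dirs ) {
        $cacheFile = '/tmp/module-registry.ser';
        $registry = @unserialize( @file_get_contents( $cacheFile ) );
        if ( !is_array( $registry ) ) {
            $registry = array();
            foreach ( $dirs as $dir ) {
                $module = array();
                include( "$dir/module.php" );
                $registry = array_merge_recursive( $registry, $module );
            }
            file_put_contents( $cacheFile, serialize( $registry ) );
        }
        return $registry;
    }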
The general idea is to slash the number of lines of code loaded on startup to a tiny fraction of what it is now. Any thoughts?
-- Tim Starling
Tim Starling wrote:
The general idea is to slash the number of lines of code loaded on startup to a tiny fraction of what it is now.
Sounds like a good idea.
Tim Starling wrote:
Profiling on Wikimedia currently gives 47ms for startup, including at least 8ms for extensions. This is too slow.
Hmm... it would be a little nicer if we knew precisely what was causing that slowdown inside the startup. I know that MediaWiki has built-in profiling calls, but have you tried a PHP-level profiler like APD yet?
Tim Starling wrote:
Pervasive lazy initialisation. Remove $wgUser, $wgLang and $wgContLang, and replace them with wfGetUser(), wfGetLang() and wfGetContLang(). This brings them into line with the successful wfGetDB() interface. Those functions would simply return the variable if it already exists; if it doesn't, execution would pass to the relevant autoloaded module for initialisation. The same can be done for $wgTitle, $wgParser, $wgOut, $wgMessageCache... pretty much all the object globals. A version-independent accessor would be put in ExtensionFunctions.php, to support multi-version extensions. By deferring everything, we allow lightweight requests to get by without ever initialising a user object or loading a language file.
It's probably a step in the right direction. But maybe it would be a better idea if we unified all these global object calls into a Registry object? (Perhaps this is the phasing-out you described here: http://meta.wikimedia.org/w/index.php?title=Writing_a_new_special_page&d... )
Tim Starling wrote:
Cached module registration.
When you say "module" you mean "extension", correct? So, essentially, we're finally going to structure the extensions in a much more orderly fashion. Sounds good.
Tim Starling wrote:
[on configuration]
I don't know enough about Wikimedia configuration to pass judgment here, but that sounds like a lot of code that hasn't been versioned.
Edward Z. Yang wrote:
Tim Starling wrote:
The general idea is to slash the number of lines of code loaded on startup to a tiny fraction of what it is now.
Sounds like a good idea.
Tim Starling wrote:
Profiling on Wikimedia currently gives 47ms for startup, including at least 8ms for extensions. This is too slow.
Hmm... it would be a little nicer if we knew precisely what was causing that slowdown inside the startup. I know that MediaWiki has built-in profiling calls, but have you tried a PHP-level profiler like APD yet?
We've used xdebug, but it will only give you a very small sample size; with any more than a few requests, the files get too large to handle. Our own profiler can average over an arbitrary number of requests at quite a high sample rate. For a breakdown of Setup.php and CommonSettings.php, see
http://noc.wikimedia.org/cgi-bin/report.py?db=enwiki&sort=name&limit...
Pervasive lazy initialisation. Remove $wgUser, $wgLang and $wgContLang, and replace them with wfGetUser(), wfGetLang() and wfGetContLang(). This brings them into line with the successful wfGetDB() interface. Those functions would simply return the variable if it already exists; if it doesn't, execution would pass to the relevant autoloaded module for initialisation. The same can be done for $wgTitle, $wgParser, $wgOut, $wgMessageCache... pretty much all the object globals. A version-independent accessor would be put in ExtensionFunctions.php, to support multi-version extensions. By deferring everything, we allow lightweight requests to get by without ever initialising a user object or loading a language file.
It's probably a step in the right direction. But maybe it would be a better idea if we unified all these global object calls into a Registry object? (Perhaps this is the phasing-out you described here: http://meta.wikimedia.org/w/index.php?title=Writing_a_new_special_page&d... )
Sure, but that would require touching many more lines of code. We've found extract(unserialize(...)) to be quite fast. A configuration object would be useful if we wanted to operate on multiple wikis in the course of a single run, but that's complicated for lots of other reasons. In any case, the changes I'm proposing would be a useful step towards getting rid of configuration globals; the two aren't mutually exclusive.
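For reference, the extract(unserialize()) pattern is roughly the following; the file names and the $settings variable are made up here, and the stat calls are what let the cache rebuild itself when the source file changes:

    $cacheFile  = "$IP/cache/settings-$wgDBname.ser";
    $sourceFile = "$IP/InitialiseSettings.php";
    if ( file_exists( $cacheFile ) && filemtime( $cacheFile ) >= filemtime( $sourceFile ) ) {
        // Fast path: a single unserialize plus a couple of stat calls.
        extract( unserialize( file_get_contents( $cacheFile ) ) );
    } else {
        // Slow path: compute the settings array and refresh the cache.
        require( $sourceFile ); // assumed to define $settings
        file_put_contents( $cacheFile, serialize( $settings ) );
        extract( $settings );
    }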
Tim Starling wrote:
Cached module registration.
When you say "module" you mean "extension", correct? So, essentially, we're finally going to structure the extensions in a much more orderly fashion. Sounds good.
A module, by my meaning, can be either part of the core or an extension. It's not about orderly fashions, it's about speed, and the changes I'm talking about would potentially have application to the core as well as the extensions.
Tim Starling wrote:
[on configuration]
I don't know enough about Wikimedia configuration to pass judgment here, but that sounds like a lot of code that hasn't been versioned.
The configuration cache is only 35 lines or so. The SiteConfiguration class, which is checked in, was designed to be easily cached. Still, it would be nice to get that capability into the main codebase, for improved offsite performance and easier development.
-- Tim Starling
Tim Starling wrote:
We've used xdebug, but it will only give you a very small sample size; with any more than a few requests, the files get too large to handle.
Hmm, I've tried XDebug for profiling and I didn't like it very much. You may want to try APD.
Our own profiler can average over an arbitrary number of requests at quite a high sample rate. For a breakdown of Setup.php and CommonSettings.php, see
http://noc.wikimedia.org/cgi-bin/report.py?db=enwiki&sort=name&limit...
The report only offers a few dozen or so high-level functions. Could we break them down further, to see whether it's a dumb inefficiency rather than something systematic? (For instance, Setup.php-misc2 takes 15% of Setup.php processing. I wonder what it does?)
A function-level breakdown, even if it's only performed once, would be interesting to see. (Of course, I could run the profiling myself, but the results wouldn't be very representative.) (Someone could, theoretically speaking, build a processor that averages multiple runs too, but the basic problem of having a great deal more data to handle remains.)
Tim Starling wrote:
Sure, but that would require touching many more lines of code.
The price of abstraction. ;-) (Whether or not it's worth paying is another issue.)
We've found extract(unserialize(...)) to be quite fast. A configuration object would be useful if we wanted to operate on multiple wikis in the course of a single run, but that's complicated for lots of other reasons. In any case, the changes I'm proposing would be a useful step towards getting rid of configuration globals; the two aren't mutually exclusive.
One step at a time, I suppose. Of course, you could always unserialize the registry... A configuration object would take a lot more work, though, so we'd probably want to lay off that for now.
A module, by my meaning, can be either part of the core or an extension. It's not about orderly fashions, it's about speed, and the changes I'm talking about would potentially have application to the core as well as the extensions.
That clarified things a bit. I think that you've got the experience (and intuition) to home in on the spot that's inefficient, so I won't comment on that.
Still trying to be helpful: how do you plan on clearing these serialized caches?
Hi!
Hmm, I've tried XDebug for profiling and I didn't like it very much. You may want to try APD.
Well, if you stated reasons for your choice, it would be easier for us to think about it. Did you try xdebug2 with visualization tools?
You may find this interesting: http://dammit.lt/2006/01/18/mediawiki-graphic-profile/
This explains the method used. Windows people may choose to use WinCachegrind.
The report only offers a few dozen or so high-level functions. Could we break them down further, to see whether it's a dumb inefficiency rather than something systematic? (For instance, Setup.php-misc2 takes 15% of Setup.php processing. I wonder what it does?)
Um, press 'show more', and you'll be shown more than 50 events.
A function-level breakdown, even if it's only performed once, would be interesting to see. (Of course, I could run the profiling myself, but
I'm not sure whether we have any private information in the traces; we could probably disclose some of them from time to time for whatever systematic analysis anyone wants to do.
Cheers, Domas
Domas Mituzas wrote:
Well, if you stated reasons for your choice, it would be easier for us to think about it. Did you try xdebug2 with visualization tools?
Nope. (see below)
You may find this interesting: http://dammit.lt/2006/01/18/mediawiki-graphic-profile/
Wow, that's really cool. Extremely cool. (puts blog into RSS reader) I need to read the blogs of Wikimedia developers more often.
BTW, from the diagram, it looks like cleanup takes up the largest amount of time. Unavoidable?
Um, press 'show more', and you'll be shown more than 50 events.
Yeah, I did press show more. What I meant was that a lot of the profile points are not Setup.php related; the ones that have Setup.php prefixed only number in the dozens. Then again, I just realized that I really don't know anything about optimizing MediaWiki.
I'm not sure whether we have any private information in the traces; we could probably disclose some of them from time to time for whatever systematic analysis anyone wants to do.
If the traces provide the parameters passed to the functions, probably yes. I can't see APD-style call trees (no parameters at all) having private info, but you guys use XDebug...
Hi!
Wow, that's really cool. Extremely cool. (puts blog into RSS reader) I need to read the blogs of Wikimedia developers more often.
Yeah, once a year we do post something.
BTW, from the diagram, it looks like cleanup takes up the largest amount of time. Unavoidable?
That diagram is one year old ;-) Cleanup also includes outputting data to the skin, etc. We've fixed lots of stuff there already.
I'll probably put up some fresh PNGs soon; it would be especially cool if there were some batch way to produce them. The interesting part is that you can click on any node in the graph and get detailed information about it - who calls it (and the time spent from every caller), what it calls (and again, the time spent in each callee), etc.
Yeah, I did press show more. What I meant was that a lot of the profile points are not Setup.php related; the ones that have Setup.php prefixed only number in the dozens. Then again, I just realized that I really don't know anything about optimizing MediaWiki.
The profiling we have there is a kind of real-time one (PHP sending UDP messages to a collection daemon, hehe, http://dammit.lt/2006/01/11/profiling-web-applications/ ;-). It is designed more to give us the global picture of what's going on. Stuff like detailed analysis of every request, all calls, etc., can be done using per-request profiling.
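In case it helps picture it, the sending side is roughly like the sketch below; the collector host, port and message format here are made up, not the ones we actually use:

    // Fire-and-forget profiling sample over UDP; the collection daemon
    // aggregates these from all Apaches in near real time.
    function wfProfileSend( $event, $elapsed ) {
        $fp = @fsockopen( 'udp://collector.example.org', 3811, $errno, $errstr, 1 );
        if ( $fp ) {
            fwrite( $fp, "enwiki $event $elapsed\n" );
            fclose( $fp );
        }
    }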
On the other hand, per-request profiling may disclose funny stuff - like right now I'm looking at the fact that edit tools are taking up to 40% of edit page rendering, because of multiple calls to the charinsert plugin. Edittools are deterministic - they don't change from page to page, so re-rendering them every time is not required. It would be really nice if we could cache stuff like that, but on the other hand, it may be less than 1% of total cluster time.
That is where we add general profiling hooks and try to figure out what may need serious attention.
Right now anyone on the cluster can poke at the files in srv55:/tmp/traces/ ;-)
If the traces provide the parameters passed to the functions, probably yes. I can't see APD-style call trees (no parameters at all) having private info, but you guys use XDebug...
Our traces don't have parameters now, though it would be pretty interesting to check with parameters some day :) Hehe, I'll discuss the idea of providing public traces. A few gigabytes of them (that is... a minute's run of a single Apache server) would make an interesting data set, wouldn't it? :)
BR, Domas