Edward Z. Yang wrote:
Tim Starling wrote:
The general idea is to slash the number of lines of code loaded on startup to a tiny fraction of what it is now.
Sounds like a good idea.
Tim Starling wrote:
Profiling on Wikimedia currently gives 47ms for startup, including at least 8ms for extensions. This is too slow.
Hmm... it would be a little nicer if we knew precisely what was causing the slowdown during startup. I know that MediaWiki has built-in profiling calls, but have you tried a PHP-level profiler like APD yet?
We've used xdebug, but it will only give you a very small sample size: any more than a few requests and the files get too large to handle. Our own profiler can average over an arbitrary number of requests at quite a high sample rate. For a breakdown of Setup.php and CommonSettings.php, see
http://noc.wikimedia.org/cgi-bin/report.py?db=enwiki&sort=name&limit...
Pervasive lazy initialisation. Remove $wgUser, $wgLang and $wgContLang, and replace them with wfGetUser(), wfGetLang() and wfGetContLang(). This brings them into line with the successful wfGetDB() interface. Those functions would simply return the variable if it exists; if it doesn't, execution would pass to the relevant autoloaded module for initialisation. The same can be done for $wgTitle, $wgParser, $wgOut, $wgMessageCache... pretty much all the object globals. A version-independent accessor would be put in ExtensionFunctions.php, to support multi-version extensions. By deferring everything, we allow lightweight requests to get by without ever initialising a user object, or loading a language file.
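For illustration, a lazy accessor of this kind might look roughly like the sketch below. It is not MediaWiki code: ExampleUser and exampleGetUser() are hypothetical stand-ins for the User class and the proposed wfGetUser(), and the static-variable approach is just one assumed way to do it.

```php
<?php
// Hypothetical sketch only: ExampleUser stands in for MediaWiki's User class
// and exampleGetUser() for the proposed wfGetUser() accessor.

class ExampleUser {
    // Counts constructions so the deferred initialisation is observable.
    public static $constructed = 0;
    public $name;

    public function __construct( $name ) {
        self::$constructed++;
        $this->name = $name;
    }
}

function exampleGetUser() {
    // The object is created on first use and then reused for the rest of the request.
    static $user = null;
    if ( $user === null ) {
        $user = new ExampleUser( 'Anonymous' );
    }
    return $user;
}

// A lightweight request that never calls exampleGetUser() constructs nothing.
echo ExampleUser::$constructed, "\n";

// The first call initialises the object; the second call returns the same instance.
$first = exampleGetUser();
$second = exampleGetUser();
echo ExampleUser::$constructed, "\n";
```

In a real setup the constructor call would be the point where the autoloader pulls in the class file, so a request that never asks for the user never loads that code.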
It's probably a step in the right direction. But maybe it would be a better idea to unify all these global object calls into a Registry object? (Perhaps this is the phasing out that you described here: http://meta.wikimedia.org/w/index.php?title=Writing_a_new_special_page&d... )
Sure, but that would require touching many more lines of code. We've found extract(unserialize(...)) to be quite fast. A configuration object would be useful if we wanted to operate on multiple wikis in the course of a single run, but that's complicated for lots of other reasons. In any case, the changes I'm proposing would be a useful step towards getting rid of configuration globals; the two approaches are not mutually exclusive.
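As a rough sketch of the extract(unserialize(...)) pattern, assuming a plain file cache: every function name, setting value and the cache path below is hypothetical, and this is not the actual Wikimedia configuration cache.

```php
<?php
// Hypothetical sketch: the function names, settings and cache path are
// assumptions for illustration, not Wikimedia's configuration code.

function exampleBuildSettings() {
    // Stand-in for the expensive step of computing the per-wiki settings.
    return array(
        'wgSitename' => 'ExampleWiki',
        'wgLanguageCode' => 'en',
    );
}

function exampleLoadCachedSettings( $cacheFile ) {
    // Fast path: reuse the serialized array if a valid cache file exists.
    if ( is_readable( $cacheFile ) ) {
        $settings = unserialize( file_get_contents( $cacheFile ) );
        if ( is_array( $settings ) ) {
            return $settings;
        }
    }
    // Slow path: build the settings once and store them for later requests.
    $settings = exampleBuildSettings();
    file_put_contents( $cacheFile, serialize( $settings ), LOCK_EX );
    return $settings;
}

$exampleCacheFile = sys_get_temp_dir() . '/example-settings.ser';

// extract() creates one variable per array key in the current scope,
// so at file scope these become ordinary global variables.
extract( exampleLoadCachedSettings( $exampleCacheFile ) );
echo $wgSitename, "\n";
```

The cache file should only ever contain data the wiki wrote itself, since unserialize() is not safe on untrusted input.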
Tim Starling wrote:
Cached module registration.
When you say "module" you mean, "extension", correct? So, essentially, we're finally going to structure the extensions into a much more orderly fashion. Sounds good.
A module, by my meaning, can be either part of the core or an extension. It's not about orderly fashions, it's about speed, and the changes I'm talking about would potentially have application to the core as well as the extensions.
Tim Starling wrote:
[on configuration]
I don't know enough about Wikimedia configuration to pass judgment here, but that sounds like a lot of code that hasn't been versioned.
The configuration cache is only 35 lines or so. The SiteConfiguration class, which is checked in, was designed to be easily cached. Still, it would be nice to get that capability into the main codebase, for improved offsite performance and easier development.
-- Tim Starling