-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Tim Starling wrote:
We've used xdebug, but it will only give you a very small sample size: any more than a few requests and the files get too large to handle.
Hmm, I've tried XDebug for profiling and I didn't like it very much. You may want to try APD.
Our own profiler can average over an arbitrary number of requests at quite a high sample rate. For a breakdown of Setup.php and CommonSettings.php, see
http://noc.wikimedia.org/cgi-bin/report.py?db=enwiki&sort=name&limit...
The report only offers a few dozen or so high-level functions. Could we break them down further to see whether the cost is a dumb inefficiency rather than something systematic? For instance, Setup.php-misc2 takes 15% of Setup.php processing; I wonder what it does.
A function breakdown, even if it's only performed once, would be interesting to see. Of course, I could run the profiling myself, but the results wouldn't be very representative. Someone could, theoretically speaking, build a processor that averages multiple runs too, but the base problem of having a heck of a lot more information remains.
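To make the "processor that averages multiple runs" idea concrete, here is a rough sketch in Python (not PHP, just for brevity). It assumes each profiling run has already been parsed into a plain mapping of function name to seconds; that input format, the function names, and the numbers are all made up for illustration and do not correspond to xdebug's or APD's actual output.

```python
from collections import defaultdict


def average_profiles(runs):
    """Average per-function timings over several profiling runs.

    `runs` is a list of dicts mapping function name -> seconds spent.
    This is a hypothetical, already-parsed format, not any profiler's
    real output. A function missing from a run counts as 0 for that run.
    """
    if not runs:
        return {}
    totals = defaultdict(float)
    for run in runs:
        for name, seconds in run.items():
            totals[name] += seconds
    # Divide by the number of runs, not by the number of appearances,
    # so rarely called functions are not over-weighted.
    return {name: total / len(runs) for name, total in totals.items()}


# Invented numbers, purely to show the shape of the input.
runs = [
    {"funcA": 0.040, "funcB": 0.006},
    {"funcA": 0.050, "funcB": 0.008},
    {"funcA": 0.060},
]
averaged = average_profiles(runs)
```

This only addresses the averaging; the volume problem stays, since every run still has to be collected and parsed first.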
Tim Starling wrote:
Sure, but that would require touching many more lines of code.
The price of abstraction. ;-) (Whether or not it's worth paying is another issue.)
We've found extract(unserialize(...)) to be quite fast. A configuration object would be useful if we wanted to operate on multiple wikis in the course of a single run, but that's complicated for lots of other reasons. In any case, the changes I'm proposing would be a useful step towards getting rid of configuration globals; the two aren't mutually exclusive.
One step at a time, I suppose. Of course, you could always unserialize the registry... A configuration object would take a lot more work, though, so we'd probably want to hold off on that for now.
A module, by my meaning, can be either part of the core or an extension. It's not about orderly fashions, it's about speed, and the changes I'm talking about would potentially have application to the core as well as the extensions.
That clarified things a bit. I think that you've got the experience (and intuition) to home in on the spot that's inefficient, so I won't comment on that.
Still trying to be helpful: how do you plan on clearing these serialized caches?
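To illustrate what I'm asking about, one common option is to compare file modification times and rebuild the cache whenever the source file is newer. The sketch below is in Python, with pickle standing in for PHP's serialize(); the function name, the file layout, and the choice of mtime as the invalidation signal are all my own assumptions, not a description of what is being proposed.

```python
import os
import pickle


def load_cached(source_path, cache_path, build):
    """Return cached data, rebuilding when the source is newer than the cache.

    `build` is a caller-supplied function that parses `source_path` and
    returns a picklable value. All names here are hypothetical.
    """
    try:
        fresh = os.path.getmtime(cache_path) >= os.path.getmtime(source_path)
    except OSError:
        # A missing cache (or missing source) is treated as stale.
        fresh = False
    if fresh:
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    data = build(source_path)
    with open(cache_path, "wb") as f:
        pickle.dump(data, f)
    return data
```

The catch is that the freshness check costs two stat calls per request, which may or may not be acceptable given that the whole point is speed; an explicit "purge" step on deployment would avoid that at the cost of a manual procedure.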