Gregory Szorc wrote:
Please read my own proposal for reworking the extension interface at:
http://mail.wikipedia.org/pipermail/wikitech-l/2006-July/037035.html
Posted to this list 10 days ago.
Forgive my ignorance. I read the first few paragraphs of the post when it was originally sent and ignored the rest, just thinking it was another Wikipedia-only message. Now, having read it...
I do like your proposal for static objects being initialized as needed. There is great power in the just-in-time object::getInstance() method. However, one of my criticisms of MediaWiki's architecture has always been the over-dependence on global objects, which are in some ways like static classes using the Singleton pattern (see http://blog.case.edu/gps10/2006/07/22/why_global_variables_in_php_is_bad_pro... for why I don't like global objects). I would much rather see the Wiki class contain these "global objects" as static variables which can be accessed via a just-in-time getObject() static call on the Wiki class. This sounds like the same approach as the proposed wfGetFoo() methods (it basically is), but polluting the global symbol table with objects and functions that don't belong to classes is unnecessary when they could all belong to a master Wiki class. If you don't buy the "don't do it because you'd be polluting the symbol table" argument, do it for the sake of keeping everything organized into classes. Do wfGetFoo() functions really belong in the global namespace, or do they belong to a class representing a wiki? Hell, if you get rid of all the global functions and attach them to existing classes, that is one less file to include! </rant on global objects>
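Something like this is what I have in mind; the class and method names here are made up for illustration, not existing MediaWiki code:

<?php
// Hypothetical sketch of the "master Wiki class" idea: shared objects
// live as static members and are created on first use. The names
// (Wiki, getObject, and the cases below) are invented for illustration.
class Wiki {
    private static $objects = array();

    public static function getObject( $name ) {
        if ( !isset( self::$objects[$name] ) ) {
            // Just-in-time creation on first request
            switch ( $name ) {
                case 'user':
                    self::$objects[$name] = new User();
                    break;
                case 'out':
                    self::$objects[$name] = new OutputPage();
                    break;
                default:
                    throw new Exception( "Unknown shared object: $name" );
            }
        }
        return self::$objects[$name];
    }
}

// Callers would write Wiki::getObject( 'user' ) instead of touching a
// global $wgUser or calling a global wfGetUser() function.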
I'll answer this at the end.
If you are talking about 500us, why are there still require and require_once calls in the trunk? These both require system calls (require_once actually requires an additional one and hence is slower). I know work has been done developing the __autoload function (if you ever commit to 5.1, spl_autoload_register() is preferred), but at the level of commitment you give to performance, every require_once has to seem like a monkey on your back.
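For what it's worth, here is the kind of autoloading setup I mean, as a rough sketch (the class map and function name are invented for the example):

<?php
// Sketch only: a class-to-file map plus an autoloader, so classes are
// pulled in on first use instead of via require_once at startup.
$wgAutoloadClasses = array(
    'Title'      => 'includes/Title.php',
    'OutputPage' => 'includes/OutputPage.php',
);

function wfAutoload( $className ) {
    global $wgAutoloadClasses;
    if ( isset( $wgAutoloadClasses[$className] ) ) {
        require( $wgAutoloadClasses[$className] );
    }
}

if ( function_exists( 'spl_autoload_register' ) ) {
    // PHP 5.1+: multiple autoloaders can coexist
    spl_autoload_register( 'wfAutoload' );
} else {
    // PHP 5.0 fallback
    function __autoload( $className ) {
        wfAutoload( $className );
    }
}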
I got rid of about half of the require_once calls from Setup.php in localisation-work. The remaining ones are mostly for global functions.
Also, how do you accurately profile MediaWiki? I've used xdebug and Kcachegrind to profile scripts before, but it always bothers me because I cannot use xdebug alongside APC or eaccelerator to get results reflective of my production deployment. I know APC and eaccelerator completely change the chokepoints, but it is impossible for me to see what the new chokepoints are!
http://noc.wikimedia.org/cgi-bin/report.py
Data is generated with ProfilerSimpleUDP. Averaging over a million requests gives you excellent accuracy, thanks to the central limit theorem. However, that data may be subject to slight systematic inaccuracies due to the profiling overhead.
Can you feed MediaWiki's internal profiling output into Kcachegrind?
No.
Now, getting back to the topic of extensions. For a base extension class, I was thinking of an abstract class with numerous methods: providesSpecialPage(), providesHook(), providesWhichHooks(), providesParserTags(), etc. Let's say we establish a defined extensions root directory. MediaWiki periodically scans this directory for all files representing extensions and loads them (the scan could be triggered via cron, a special page, filemtime(), etc.). When the extensions are loaded from the directory, a map is built that records the abilities of each, and this map is serialized for quick retrieval. Whenever MediaWiki loads, it just consults the map and loads extensions just-in-time. This would require an extension manager class that initializes extensions as called for by the map. For example, when the parser sees a tag it doesn't recognize, it would call ExtensionManager::getExtensionForParserTag($foo)->parse($content); when a special page is requested, it would call ExtensionManager::executeSpecialPage($foo); and the same goes for hooks.
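A rough sketch of the shape I have in mind (all class and method names invented):

<?php
abstract class Extension {
    // Capability queries; the manager caches their results in the map
    public function providesParserTags() { return array(); }
    public function providesSpecialPages() { return array(); }
    public function providesHooks() { return array(); }

    // Entry points used when a capability is actually invoked
    public function parse( $content ) { return $content; }
    public function executeSpecialPage( $name ) {}
}

class ExtensionManager {
    private static $map; // e.g. array( 'tags' => array( 'foo' => 'FooExtension' ) )

    public static function loadMap( $cacheFile ) {
        // Map built by scanning the extensions directory, then serialized
        // so later requests skip the scan entirely
        self::$map = unserialize( file_get_contents( $cacheFile ) );
    }

    public static function getExtensionForParserTag( $tag ) {
        $class = self::$map['tags'][$tag];
        return new $class(); // instantiated just-in-time
    }

    public static function executeSpecialPage( $name ) {
        $class = self::$map['special'][$name];
        $page = new $class();
        return $page->executeSpecialPage( $name );
    }
}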
I like the idea for a map between capabilities and callbacks, etc, but I don't like the idea of a module specification file. Why should you need to provide a specification file when the same information can be obtained from methods inherited from a base extension class? As long as you cache the output of these methods, there is zero performance overhead and extensions have the added bonus of being much more structured. Yes, it would break existing functionality. But if you are already talking about making just-in-time calls to instantiate global objects like $wgTitle, $wgOut, etc, then many existing extensions will be broken anyway. Sometimes you just have to make sacrifices for the sake of progress.
You obviously also missed my post on stub globals. After I made the first post, I discovered a method for painless migration to deferred object initialisation, and I made that the topic of a second post. I share your concerns about global variables; in fact my discussion of the issue in phase3/docs/globals.txt mirrors your blog post very closely. But there doesn't seem to be any pressing need to sacrifice backwards compatibility while we pursue this goal, and a singleton object shares all the flexibility problems of global variables, so that's not a solution either.
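For reference, the stub idea looks roughly like this; a simplified sketch rather than the exact code from that post:

<?php
// The global starts life as a cheap stub; the first method call swaps
// in the real object and forwards the call, so old callers are unchanged.
class StubObject {
    private $globalName;
    private $className;

    public function __construct( $globalName, $className ) {
        $this->globalName = $globalName;
        $this->className = $className;
    }

    public function __call( $name, $args ) {
        // Replace the stub with the real object, then forward the call
        $class = $this->className;
        $real = new $class();
        $GLOBALS[$this->globalName] = $real;
        return call_user_func_array( array( $real, $name ), $args );
    }
}

// Existing code that does $wgUser->getName() keeps working unchanged:
$wgUser = new StubObject( 'wgUser', 'User' );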
You make a good point about the fact that capabilities can be provided by a member function and cached. There's still no need to sacrifice backwards compatibility, though, as far as I can see. We can keep both the ability of extensions to operate across multiple MediaWiki versions, and the ability for most old extensions to continue to work properly in new versions of MediaWiki, if we design the interface carefully enough. If backwards compatibility for old extensions proves to be too much of a performance burden, then we can drop it after a couple of releases. What I want to avoid is the requirement that extensions be updated simultaneously along with the core. That's a hassle for both site administrators and extension developers, especially since many extensions are unreleased and their versions unnumbered.
-- Tim Starling