On 9/24/09 1:41 AM, Tim Starling wrote:
Trevor Parscal wrote:
If you are really doing a JS2 rewrite/reorganization, would it be possible for some of us (especially those of us who deal almost exclusively with JavaScript these days) to get a chance to ask questions/give feedback/help in general?
I've mostly been working on analysis and planning so far. I made a few false starts with the code and so ended up planning in a more detailed way than I initially intended. I've discussed various issues with the people in #mediawiki, including our resident client-side guru Splarka.
I started off working on fixing the coding style and the most glaring errors from the JS2 branch, but I soon decided that I shouldn't be putting so much effort into that when a lot of the code would have to be deleted or rewritten from scratch.
I did a survey of script loaders in other applications, to get an idea of what features would be desirable. My observations came down to the following:
- The namespacing in Google's jsapi is very nice, with everything
being a member of a global "google" object. We would do well to emulate it (a rough sketch of the pattern follows this list), but migrating all JS to such a scheme is beyond the scope of the current project.
- You need to deal with CSS as well as JS. All the script loaders I
looked at did that, except ours. We have a lot of CSS objects that need concatenation, and possibly minification.
- JS loading can be deferred until near the </body> or until the
DOMContentLoaded event. This means that empty-cache requests will render faster. WordPress places emphasis on this.
- Dependency tracking is useful. The idea is to request a given
module, and all dependencies of that module, such as other scripts, will automatically be loaded first.
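
As a rough illustration of the kind of namespacing meant in the first point (the "mw" name and its members are purely hypothetical here, not a proposal):

// One global object owns everything; modules hang off it as members
window.mw = window.mw || {};

mw.example = {
    greet: function ( name ) {
        alert( 'Hello, ' + name );
    }
};

// Callers always go through the single global, so nothing else leaks
// into the global namespace
mw.example.greet( 'world' );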
I then looked more closely at the current state of script loading in MediaWiki. I made the following observations:
- Most linked objects (styles and scripts) on a typical page view come
from the Skin. If the goal is performance enhancement, then working on the skins and OutputPage has to be a priority.
- The "class" abstraction as implemented in JS2 has very little value
to PHP callers. It's just as easy to use filenames. It could be made more useful with features such as dependency tracking, better concatenation and CSS support. But it seems to me that the most useful abstraction for PHP code would be for client-side modules to be multi-file, potentially with supporting PHP code for each module.
- Central registration of all client-side resources in a global
variable would be onerous and should be avoided.
- Dynamic requests such as [[MediaWiki:Handheld.css]] have a large
impact on site performance and need to be optimised. I'm planning a new interface, similar to action=raw, allowing these objects to be concatenated.
The following design documents are in my user space on mediawiki.org:
http://www.mediawiki.org/wiki/User:Tim_Starling/CSS_and_JS_caller_survey_(r56220)
- A survey of MW functions that add CSS and JS, especially the
terribly confusing situation in Skin and OutputPage
http://www.mediawiki.org/wiki/User:Tim_Starling/JS_load_order_issues_(r56220)
- A breakdown of JS files by the issues that might be had in moving
them to the footer or DOMContentLoaded. I favour a conservative approach, with wikibits.js and the site and user JS staying in the
<head>.
http://www.mediawiki.org/wiki/User:Tim_Starling/Proposed_modularisation_of_client-side_resources
- A proposed reorganisation of core scripts (Skin and OutputPage)
according to the MW modules they are most associated with.
The object model I'm leaning towards on the PHP side is:
- A client-side resource manager (CSRM) class. This would be
responsible for maintaining a list of client-side resources that have been requested and need to be sent to the skin. It would also handle caching, distribution of incoming dynamic requests, dependencies, minification, etc. This is quite a complex job and might need to be split up somewhat.
- A hierarchy of client-side module classes. A module object would
contain a list of files, dependencies and concatenation hints. Objects would be instantiated by parent classes such as skins and special pages, and added to the CSRM. Classes could be registered globally, and then used to generate dynamic CSS and JS, such as the user preference stylesheet.
- The module base class would be non-abstract and featureful, with a
constructor that accepts an array-based description. This allows simple creation of modules by classes with no interest in dynamic script generation.
- A new script loader entry point would provide an interface to
registered modules.
There are some design decisions I still have to make, which are tricky due to performance tradeoffs:
- With concatenation, there is the question of which files to combine
and which to leave separate. I would like to have a "combine" parameter which is a string, and files with the same combine parameter will be combined.
- Like WordPress, we could store minified and concatenated files in a
public cache and then link to that cache directly in the HTML.
- The cache invalidation scheme is tricky; there's not really an ideal
system. A combination of cache-breaking parameters (like Michael's design) and short expiry times is probably the way to go. Using cache-breaking parameters alone doesn't work because there is referring HTML cached on both the server and client side, and regenerating that HTML periodically would be much more expensive than regenerating the scripts.
Here are my notes:
- Concatenation
  - Performance problems:
    - Changing inclusions. When inclusions change, the whole contents
      have to be sent again.
      - BUT people don't change skins very often
      - So combine=all=skin should save time for most
    - Expiry times have to be synchronised. Take the minimum expiry of
      all, and force a freshness check for all.
    - Makes the task of squid cache purging more difficult
    - Defeats browser concurrency
  - Performance advantages:
    - For dynamic requests:
      - Avoids MW startup time.
      - Avoids DoSing small servers with concurrent requests.
    - For all requests:
      - Reduces squid CPU
      - Removes a few RTTs for non-pipelining clients
      - Improves gzip compression ratio
- Combine to static file idea:
  - Pros:
    - Fast to stream out, on all systems
    - Doesn't break HughesNet
  - Cons:
    - Requires splitting the request into static and dynamic
    - Need webserver config to add Expires header and gzip
With some help from Splarka, I've determined that it would be possible to merge the requests for [[MediaWiki:Common.css]], [[MediaWiki:Skinname.css]], [[MediaWiki:Handheld.css]] and [[MediaWiki:Print.css]], using @media blocks for the last two, for a significant performance win in almost all cases.
Once the architectural issues have been fixed, the stylistic issues in both ancient JS and the merged code will have to be dealt with, for example:
- Poorly-named functions, classes, files, etc. There's a need for
proper namespacing and consistency in naming style.
- Poorly-written comments
- Unnecessary use of the global namespace. The jQuery style is nice,
with local functions inside an anonymous closure:

( function () {
    // setup stays local to the closure rather than becoming a global
    function setup() { /* ... */ }
    addOnloadHook( setup );
} )();
- Unsafe construction of HTML. This is ubiquitous in the mwEmbed
directory and there will be a huge potential for XSS as soon as user input is added. HTML construction with innerHTML can be replaced by document.createElement() or its jQuery equivalent (see the sketch after this list).
- The identity crisis. The whole js2 concept encourages code which is
poorly integrated with the rest of MediaWiki, and which is written without proper study of the existing code or thought to refactoring. It's like SkinTemplate except with a more pretentious name. I'd like to get rid of all instances of "js2", to move its scripts into other directories, and to remove the global variables which turn it on and off. Also the references to MetavidWiki and the mv prefixes should be fixed.
- Lack of modularisation. The proposed registration system makes it
possible to have extensions which are almost entirely client-side code. A module like libClipEdit could be moved to its own extension. I see no problem with extensions depending on other extensions; the SMW extensions already do this.
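
To illustrate the innerHTML point above (listItem and userName are made-up names, not code taken from mwEmbed) - assuming userName holds untrusted input, the first form lets any markup in it execute, while the DOM-based forms treat it as plain text:

// Unsafe: userName is interpolated straight into markup, so any HTML or
// script it contains ends up in the page
listItem.innerHTML = '<a href="/wiki/User:' + userName + '">' + userName + '</a>';

// Safer: build the node and set its text, letting the browser escape it
var link = document.createElement( 'a' );
link.href = '/wiki/User:' + encodeURIComponent( userName );
link.appendChild( document.createTextNode( userName ) );
listItem.appendChild( link );

// Or the jQuery equivalent
$( '<a>' )
    .attr( 'href', '/wiki/User:' + encodeURIComponent( userName ) )
    .text( userName )
    .appendTo( listItem );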
A few ideas for cool future features also occur to me. Once we have a system set up for generating and caching client-side resources, why not:
- Allow the user to choose a colour scheme for their wiki and
automatically generate stylesheets with the appropriate colours.
- Include images in the system. Use GD to automatically generate and
cache images with the appropriate anti-aliased background colour.
- Automatically create CSS sprites?
-- Tim Starling
It's great to see that this is being paid attention to. I would agree with you that the current implementation of JS2 is not what I see as ideal either.
The use of "class" loading seems a little strange to me as well. I mean, there's not really such a thing as a class in JavaScript, nor does the class loader load only a specific JavaScript object or function, so it's really more of a file loader. If we drop the .js from the file names in a system where some resources are MediaWiki messages whose names also end in .js, that's a purely aesthetic maneuver - I'm fine either way, but let's not call it something it's not. It's a file loader.
The dependency thing is an interesting problem, but I think it could be handled more elegantly than having to define meta-information. Just an idea for a solution...
1. Other than jQuery and a MediaWiki jQuery plugin, scripts can be loaded on the client in any order.
2. Each script after that adds code to a queuing system provided by the MediaWiki plugin.
3. Code is identified by a name and may include an optional list of the names of any dependencies.
4. When document.ready happens, the queuing system generates an order of execution based on the given dependencies.
5. Even after document.ready, the queuing system can continue its work whenever a script is added - such that if "bar", which depends on "foo", is registered before document.ready, and then sometime well after document.ready "foo" is run using the queuing system, "bar" will be executed directly afterwards because its dependency has finally been met.
// Hypothetical code...

// Example of points 3 and 4 ($.run is provided by the MediaWiki jQuery plugin)
$.run( 'foo', function() { /* foo code */ } );
$.run( 'bar', ['foo'], function() { /* bar code */ } );
// document.ready happens: foo is executed, then bar is executed

// Example of point 5
$.run( 'bar', ['foo'], function() { /* bar code */ } );
// document.ready happens .. time passes
$.run( 'foo', function() { /* foo code */ } );
// bar is executed now
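
To make the queuing idea concrete, here is a rough sketch of how such a $.run plugin might work internally - just an illustration of points 1-5 with made-up internal names, not a proposed implementation:

( function ( $ ) {
    var ranNames = {}; // names whose callbacks have already been executed
    var pending = [];  // { name, deps, callback } entries still waiting

    function depsMet( deps ) {
        for ( var i = 0; i < deps.length; i++ ) {
            if ( !ranNames[ deps[ i ] ] ) {
                return false;
            }
        }
        return true;
    }

    // Execute every pending entry whose dependencies are satisfied,
    // repeating until no further progress can be made
    function flush() {
        var progressed = true;
        while ( progressed ) {
            progressed = false;
            for ( var i = 0; i < pending.length; i++ ) {
                if ( depsMet( pending[ i ].deps ) ) {
                    var entry = pending.splice( i, 1 )[ 0 ];
                    entry.callback();
                    ranNames[ entry.name ] = true;
                    progressed = true;
                    break;
                }
            }
        }
    }

    $.run = function ( name, deps, callback ) {
        // Allow the two-argument form: $.run( 'foo', fn )
        if ( typeof deps === 'function' ) {
            callback = deps;
            deps = [];
        }
        pending.push( { name: name, deps: deps, callback: callback } );
        // Before document.ready this only queues; after document.ready jQuery
        // runs the handler straight away, so newly met dependencies fire immediately
        $( document ).ready( flush );
    };
} )( jQuery );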
I think there is a clever way to merge a solution for dynamic script loading into this as well... But essentially this solves most problems already.
Ideally dynamic script loading would never be needed, as it introduces additional latency to user interaction, and no amount of spinner graphics will ever replace faster interaction. Lazy script loading, however, is awesome, and should be considered in these design changes. For lazy loading, we could tell $wgOut whether a script should be included immediately or after document.ready - in the latter case, a bit of JavaScript could be added to the page listing which files to load, to be acted upon once the document is ready.
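
A hypothetical sketch of what that could look like (the wgDeferredScripts name, the paths and the loader are all made up for illustration) - the PHP side emits a list of deferred script URLs, and a small helper appends them once the document is ready:

// Emitted into the page by the PHP side (hypothetical variable name and paths)
var wgDeferredScripts = [
    '/w/extensions/Example/Example.js',
    '/w/skins/common/example.js'
];

// Acted upon after document.ready
jQuery( document ).ready( function () {
    for ( var i = 0; i < wgDeferredScripts.length; i++ ) {
        var script = document.createElement( 'script' );
        script.type = 'text/javascript';
        script.src = wgDeferredScripts[ i ];
        document.body.appendChild( script );
    }
} );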
Let's also try to pay attention to the issue of i18n for dynamic UI elements. So far I've been defining a long list of messages to include in a JSON object in my PHP code, then using them in my JavaScript code. Michael has some magic going on in his script loader that injects message values based on their presence in the .js file (I'm not totally clear on the details there). Once again, I would like to see messages required for use in JavaScript be defined in JavaScript - so something like what Michael is doing seems ideal...
// Code in the .js file
loadMessages( ['foo', 'bar'] );

// Code in the JavaScript sent to the client, after the magic transformations made by the PHP code
loadMessages( { 'foo': 'Foo', 'bar': 'Bar' } );
This allows us to define the messages we want loaded in the JavaScript space without making additional (and high-latency) calls to the server just to get some text. Even in the case of dynamic script loading, the messages of the incoming script just get added to the collection on the client. I think this is similar if not identical to what Michael's code does.
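
A rough sketch of the client-side half of that idea (the function names and fallback behaviour are my own assumptions, not Michael's actual code) - loadMessages() just merges name/value pairs into a message store, and a lookup helper reads from it:

// Hypothetical client-side message store
var mwMessages = {};

// In the raw .js file this is called with an array of names, which only
// tells the server what to inject; after the PHP transformation it is
// called with an object of name/value pairs, which get merged in here.
function loadMessages( messages ) {
    if ( !( messages instanceof Array ) ) {
        for ( var name in messages ) {
            mwMessages[ name ] = messages[ name ];
        }
    }
}

// Look up a message loaded earlier, falling back to the bare key
function getMsg( name ) {
    return mwMessages[ name ] !== undefined ? mwMessages[ name ] : '<' + name + '>';
}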
Bottom line: meta-information about things that go on in JavaScript land being defined and dealt with in PHP land is not a good thing, and it should be avoided. The good news is that there are all sorts of clever ways to avoid it.
I'm still digesting some of the other topics being brought up - there are so many good points - I'm sure I will have more input soon...
- Trevor