Hola,
I just created https://bugzilla.wikimedia.org/show_bug.cgi?id=20768 ("Branch 1.16") and Brion was quick to respond that some issues with js2 and the new-upload stuff need to be ironed out; valid concerns, of course.
I proposed to make bug 20768 a tracking bug, so that it is visible which issues are, or could be, considered blockers for something we can make a 1.16 out of.
Let the dependency tagging begin. Users of MediaWiki trunk are encouraged to report each and every issue, so that what is known can also be resolved (eventually).
I'm calling on all volunteer coders to keep an eye on this issue and to help out fixing issues that are mentioned here.
Cheers!
Siebrand
On Tue, Sep 22, 2009 at 6:38 PM, Siebrand Mazeland s.mazeland@xs4all.nl wrote:
Hola,
I just created https://bugzilla.wikimedia.org/show_bug.cgi?id=20768 ("Branch 1.16") and Brion was quick to respond that some issues with js2 and the new-upload stuff need to be ironed out; valid concerns, of course.
I still want to fiddle a bit with the upload stuff, particularly in the UI but possibly also in the backend, which will probably break backwards compatibility for extensions yet again. I hope to have this done as soon as possible. In what timeframe should we be thinking for a feature freeze?
Regards, Bryan
On Tue, Sep 22, 2009 at 12:49 PM, Bryan Tong Minh bryan.tongminh@gmail.com wrote:
I still want to fiddle a bit with the upload stuff, particularly in the UI but possibly also in the backend, which will probably break backwards compatibility for extensions yet again. I hope to have this done as soon as possible. In what timeframe should we be thinking for a feature freeze?
Regards, Bryan
It might be worth going ahead and going into "slush" mode. Avoid new features and sweeping changes when possible, focus primarily on bugfixing & cleanup.
-Chad
Siebrand Mazeland wrote:
Hola,
I just created https://bugzilla.wikimedia.org/show_bug.cgi?id=20768 ("Branch 1.16") and Brion was quick to respond that some issues with js2 and the new-upload stuff need to be ironed out; valid concerns, of course.
I proposed to make bug 20768 a tracking bug, so that it is visible which issues are, or could be, considered blockers for something we can make a 1.16 out of.
Let the dependency tagging begin. Users of MediaWiki trunk are encouraged to report each and every issue, so that what is known can also be resolved (eventually).
I'm calling on all volunteer coders to keep an eye on this issue and to help out fixing issues that are mentioned here.
I've been working on a rewrite of the script loader and a reorganisation of the JS2 stuff. I'd like to delay 1.16 until that's in and tested. Brion has said that he doesn't want Michael Dale's branch merge reverted, so as far as I can see, a schedule delay is the only other way to maintain an appropriate quality.
-- Tim Starling
On 9/22/09 6:19 PM, Tim Starling wrote:
I've been working on a rewrite of the script loader and a reorganisation of the JS2 stuff. I'd like to delay 1.16 until that's in and tested. Brion has said that he doesn't want Michael Dale's branch merge reverted, so as far as I can see, a schedule delay is the only other way to maintain an appropriate quality.
Indeed -- the point of the merge was to get the new capabilities in so they'll actually get tested and cleaned up instead of rotting in a branch forever unused. :)
-- brion
On 9/22/09 6:19 PM, Tim Starling wrote:
I've been working on a rewrite of the script loader and a reorganisation of the JS2 stuff. I'd like to delay 1.16 until that's in and tested. Brion has said that he doesn't want Michael Dale's branch merge reverted, so as far as I can see, a schedule delay is the only other way to maintain an appropriate quality.
-- Tim Starling
If you are really doing a JS2 rewrite/reorganization, would it be possible for some of us (especially those of us who deal almost exclusively with JavaScript these days) to get a chance to ask questions/give feedback/help in general?
I think a rewrite/reorganization could be really awesome if done right, and getting it right will be easier if we can get some interested parties informed and consulted.
I know that Michael Dale's work was more or less done outside of the general MediaWiki branch for the majority of its development, and afaik it has been a work in progress for some time, so I feel that such a golden opportunity has never really come up before.
Aside from my own desire to be involved at some level, it seems fitting to have some sort of discussion at times like these so we can make sure we are making the best decisions about software before it's deployed - as making changes to deployed software often seems to be much more difficult.
Perhaps there's a MediaWiki page, or a time on IRC, or even just continuing on this list...?
My first question is: "What are you changing and how, and what are you moving and where?"
- Trevor
I would add that I am of course open to reorganization and would happily discuss why any given decision was made ... be it trade offs with other ways of doing things or lack of time to do it differently / better.
I should also add that not all the legacy support and Metavid-based code has been factored out (for example, for a while I supported the form-based upload, but now that the upload API is in place I should remove that old code). Other things like timed text support are barely supported because of lack of time. But I would want to keep the skeleton of timed text in there so that once we do get around to adding timed text for video we have a basis to move forward from.
I suggest, for a timely release, that you strip the js2 folder and make a note that the configuration variable cannot be turned on in this release, and help me identify any issues that need to be addressed for inclusion in the next release.
And finally, the basic direction and feature set was proposed on this list quite some time ago and ~some~ feedback was given at the time.
I would also echo Trevor's call for more discussion with affected parties if you're proposing significant changes.
peace, --michael
Trevor Parscal wrote:
If you are really doing a JS2 rewrite/reorganization, would it be possible for some of us (especially those of us who deal almost exclusively with JavaScript these days) to get a chance to ask questions/give feedback/help in general?
I've mostly been working on analysis and planning so far. I made a few false starts with the code and so ended up planning in a more detailed way than I initially intended. I've discussed various issues with the people in #mediawiki, including our resident client-side guru Splarka.
I started off working on fixing the coding style and the most glaring errors from the JS2 branch, but I soon decided that I shouldn't be putting so much effort into that when a lot of the code would have to be deleted or rewritten from scratch.
I did a survey of script loaders in other applications, to get an idea of what features would be desirable. My observations came down to the following:
* The namespacing in Google's jsapi is very nice, with everything being a member of a global "google" object. We would do well to emulate it, but migrating all JS to such a scheme is beyond the scope of the current project.
* You need to deal with CSS as well as JS. All the script loaders I looked at did that, except ours. We have a lot of CSS objects that need concatenation, and possibly minification.
* JS loading can be deferred until near the </body> or until the DOMContentLoaded event. This means that empty-cache requests will render faster. Wordpress places emphasis on this.
* Dependency tracking is useful. The idea is to request a given module, and all dependencies of that module, such as other scripts, will automatically be loaded first.
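For illustration, a minimal sketch of what such a dependency-aware loader could look like on the client side (the module names, the dependency map and the loadScript helper are made up here, not an existing MediaWiki API; a real loader would also deduplicate requests):

// Hypothetical dependency map: module -> modules that must load first.
var dependencies = {
    'ui.dialog': [ 'ui.core' ],
    'mw.upload': [ 'ui.dialog', 'mw.api' ]
};

function loadScript( name, callback ) {
    // One script URL per module; old IE would need onreadystatechange instead of onload.
    var s = document.createElement( 'script' );
    s.src = 'scriptloader.php?class=' + encodeURIComponent( name );
    s.onload = callback;
    document.getElementsByTagName( 'head' )[ 0 ].appendChild( s );
}

function loadModule( name, callback ) {
    var deps = dependencies[ name ] || [];
    var pending = deps.length;
    if ( pending === 0 ) {
        loadScript( name, callback );
        return;
    }
    for ( var i = 0; i < deps.length; i++ ) {
        loadModule( deps[ i ], function () {
            if ( --pending === 0 ) {
                loadScript( name, callback );
            }
        } );
    }
}

// Requesting the top-level module pulls in ui.core, ui.dialog and mw.api first.
loadModule( 'mw.upload', function () { /* everything is ready */ } );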
I then looked more closely at the current state of script loading in MediaWiki. I made the following observations:
* Most linked objects (styles and scripts) on a typical page view come from the Skin. If the goal is performance enhancement, then working on the skins and OutputPage has to be a priority.
* The "class" abstraction as implemented in JS2 has very little value to PHP callers. It's just as easy to use filenames. It could be made more useful with features such as dependency tracking, better concatenation and CSS support. But it seems to me that the most useful abstraction for PHP code would be for client-side modules to be multi-file, potentially with supporting PHP code for each module.
* Central registration of all client-side resources in a global variable would be onerous and should be avoided.
* Dynamic requests such as [[MediaWiki:Handheld.css]] have a large impact on site performance and need to be optimised. I'm planning a new interface, similar to action=raw, allowing these objects to be concatenated.
The following design documents are in my user space on mediawiki.org:
http://www.mediawiki.org/wiki/User:Tim_Starling/CSS_and_JS_caller_survey_(r56220) - A survey of MW functions that add CSS and JS, especially the terribly confusing situation in Skin and OutputPage
http://www.mediawiki.org/wiki/User:Tim_Starling/JS_load_order_issues_(r56220) - A breakdown of JS files by the issues that might be had in moving them to the footer or DOMContentLoaded. I favour a conservative approach, with wikibits.js and the site and user JS staying in the <head>.
http://www.mediawiki.org/wiki/User:Tim_Starling/Proposed_modularisation_of_client-side_resources - A proposed reorganisation of core scripts (Skin and OutputPage) according to the MW modules they are most associated with.
The object model I'm leaning towards on the PHP side is:
* A client-side resource manager (CSRM) class. This would be responsible for maintaining a list of client-side resources that have been requested and need to be sent to the skin. It would also handle caching, distribution of incoming dynamic requests, dependencies, minification, etc. This is quite a complex job and might need to be split up somewhat.
* A hierarchy of client-side module classes. A module object would contain a list of files, dependencies and concatenation hints. Objects would be instantiated by parent classes such as skins and special pages, and added to the CSRM. Classes could be registered globally, and then used to generate dynamic CSS and JS, such as the user preference stylesheet.
* The module base class would be non-abstract and featureful, with a constructor that accepts an array-based description. This allows simple creation of modules by classes with no interest in dynamic script generation.
* A new script loader entry point would provide an interface to registered modules.
There are some design decisions I still have to make, which are tricky due to performance tradeoffs:
* With concatenation, there is the question of which files to combine and which to leave separate. I would like to have a "combine" parameter which is a string, and files with the same combine parameter will be combined.
* Like Wordpress, we could store minified and concatenated files in a public cache and then link to that cache directly in the HTML.
* The cache invalidation scheme is tricky; there's not really an ideal system. A combination of cache-breaking parameters (like Michael's design) and short expiry times is probably the way to go. Using cache-breaking parameters alone doesn't work because there is referring HTML cached on both the server and client side, and regenerating that HTML periodically would be much more expensive than regenerating the scripts.
Here are my notes:
* Concatenation
  * Performance problems:
    * Changing inclusions. When inclusions change, the whole contents has to be sent again.
      * BUT people don't change skins very often.
      * So combine=all=skin should save time for most.
    * Expiry times have to be synchronised. Take the minimum expiry of all, and force a freshness check for all.
    * Makes the task of squid cache purging more difficult.
    * Defeats browser concurrency.
  * Performance advantages:
    * For dynamic requests:
      * Avoids MW startup time.
      * Avoids DoSing small servers with concurrent requests.
    * For all requests:
      * Reduces squid CPU.
      * Removes a few RTTs for non-pipelining clients.
      * Improves gzip compression ratio.
* Combine-to-static-file idea:
  * Pros:
    * Fast to stream out, on all systems.
    * Doesn't break HughesNet.
  * Cons:
    * Requires splitting the request into static and dynamic.
    * Needs webserver config to add the Expires header and gzip.
With some help from Splarka, I've determined that it would be possible to merge the requests for [[MediaWiki:Common.css]], [[MediaWiki:Skinname.css]], [[MediaWiki:Handheld.css]] and [[MediaWiki:Print.css]], using @media blocks for the last two, for a significant performance win in almost all cases.
Once the architectural issues have been fixed, the stylistic issues in both ancient JS and the merged code will have to be dealt with, for example:
* Poorly-named functions, classes, files, etc. There's a need for proper namespacing and consistency in naming style.
* Poorly-written comments
* Unnecessary use of the global namespace. The jQuery style is nice, with local functions inside an anonymous closure:
( function () {
    function setup() {
        // ...
    }
    addOnloadHook( setup );
} )();
* Unsafe construction of HTML. This is ubiquitous in the mwEmbed directory and there will be a huge potential for XSS, as soon as user input is added. HTML construction with innerHTML can be replaced by document.createElement() or its jQuery equivalent.
* The identity crisis. The whole js2 concept encourages code which is poorly integrated with the rest of MediaWiki, and which is written without proper study of the existing code or thought to refactoring. It's like SkinTemplate except with a more pretentious name. I'd like to get rid of all instances of "js2", to move its scripts into other directories, and to remove the global variables which turn it on and off. Also the references to MetavidWiki and the mv prefixes should be fixed.
* Lack of modularisation. The proposed registration system makes it possible to have extensions which are almost entirely client-side code. A module like libClipEdit could be moved to its own extension. I see no problem with extensions depending on other extensions, the SMW extensions do this with no problems.
A few ideas for cool future features also occur to me. Once we have a system set up for generating and caching client-side resources, why not:
* Allow the user to choose a colour scheme for their wiki and automatically generate stylesheets with the appropriate colours.
* Include images in the system. Use GD to automatically generate and cache images with the appropriate anti-aliased background colour.
* Automatically create CSS sprites?
-- Tim Starling
Tim Starling <tstarling <at> wikimedia.org> writes:
- Unnecessary use of the global namespace. The jQuery style is nice, with local functions inside an anonymous closure:
( function () { function setup() { /* ... */ } addOnloadHook( setup ); } )();
This would make it impossible to overwrite the function locally on a wiki, which is done sometimes, either because it conflicts with some local script, or for better localization (such as changing the sorting algorithm in the sortable-table script to handle non-ASCII characters decently). Rather, you should use a global MediaWiki object; that works just as well for keeping the global namespace clean, and it leaves the functions accessible.
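A minimal sketch of that pattern (the object name and the sorting function are only illustrative):

// Hang functions off one global object instead of hiding them in a closure:
// the global namespace stays clean, but a wiki's site JS can still replace them.
window.mediaWiki = window.mediaWiki || {};

mediaWiki.sortTables = function () {
    // default, ASCII-only sorting ...
};

addOnloadHook( function () {
    // always call through the namespace, so later overrides take effect
    mediaWiki.sortTables();
} );

// [[MediaWiki:Common.js]] could then do:
// mediaWiki.sortTables = function () { /* locale-aware sorting */ };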
Also take into account, in the JavaScript redesign, wiki-side JavaScript extensions.
[[MediaWiki:Common.js]] uses importScript to load [[MediaWiki:Wikiminiatlas.js]], [[MediaWiki:niceGalleries.js]] and [[MediaWiki:buttonForRFA.js]], which then loads [[MediaWiki:buttonForRFA/lang.js]]... plus the several Gadgets the user may have enabled.
On Wikimedia Commons I load 38 scripts located in the MediaWiki namespace (plus gen=js). I'm pretty sure that loading all of them when they aren't in the cache slows things down much more than the organization of the core MediaWiki JavaScript does.
Transcluding those files in the same request would help a lot (either by automatically detecting calls to importScript or with a new syntax).
Finally, a dependency you may not have taken into account is that some CSS from the shared repository should be usable by host wikis when viewing the pages.
Possibly-OFF-TOPIC-here
I see that ImageMagick can combine images into a single one.
A single image means a single hit to Apache, so it only has to spawn once.
On the client side, a single image can draw multiple elements with some ninja CSS stuff (background-position?).
For such a thing to be possible in MediaWiki skins, what changes are needed?
This is "minify", but for graphics.
It's possibly an idea for the future, a future full of divs and CSS3 happiness.
Tei wrote:
Possibly-OFF-TOPIC-here
I see that ImageMagick can combine images into a single one.
A single image means a single hit to Apache, so it only has to spawn once.
On the client side, a single image can draw multiple elements with some ninja CSS stuff (background-position?).
For such a thing to be possible in MediaWiki skins, what changes are needed?
This is "minify", but for graphics.
It's possibly an idea for the future, a future full of divs and CSS3 happiness.
I don't think it fits our normal image usage in pages. It could be tried for the images used by the skins, although I would worry about support for that CSS in legacy browsers.
Tei wrote:
Possibly-OFF-TOPIC-here
I see that ImageMagick can combine images into a single one.
A single image means a single hit to Apache, so it only has to spawn once.
On the client side, a single image can draw multiple elements with some ninja CSS stuff (background-position?).
People have taken to calling that the "CSS sprite" technique; I mentioned it as a possibility in my original post.
http://www.alistapart.com/articles/sprites
I always thought the defining characteristic of a sprite was that it moved around the screen, not that it was copied from a grid, but there you have it.
-- Tim Starling
On Thu, Sep 24, 2009 at 4:41 AM, Tim Starling tstarling@wikimedia.org wrote:
* Removes a few RTTs for non-pipelining clients
Do you mean to imply that there's such a thing as a pipelining client on the real web? (Okay, okay, Opera.) This concern seems like it outweighs all the others put together pretty handily -- especially for script files that aren't at the end, which block page loading.
- Automatically create CSS sprites?
That would be neat, but perhaps a bit tricky.
On Thu, Sep 24, 2009 at 9:13 AM, Platonides Platonides@gmail.com wrote:
Also take into account, in the JavaScript redesign, wiki-side JavaScript extensions.
[[MediaWiki:Common.js]] uses importScript to load [[MediaWiki:Wikiminiatlas.js]], [[MediaWiki:niceGalleries.js]] and [[MediaWiki:buttonForRFA.js]], which then loads [[MediaWiki:buttonForRFA/lang.js]]... plus the several Gadgets the user may have enabled.
On Wikimedia Commons I load 38 scripts located in the MediaWiki namespace (plus gen=js). I'm pretty sure that loading all of them when they aren't in the cache slows things down much more than the organization of the core MediaWiki JavaScript does.
Hmm, yeah. This scheme needs to support combining admin-added JavaScript, unless we can convince everyone to just put everything in Common.js. Maybe we could support some sort of transclusion mechanism for JS files -- like rather than serving JS pages raw, MW first substitutes templates (but nothing else)?
On Thu, Sep 24, 2009 at 10:00 AM, Tei oscar.vives@gmail.com wrote:
I see that ImageMagick can combine images into a single one.
A single image means a single hit to Apache, so it only has to spawn once.
On the client side, a single image can draw multiple elements with some ninja CSS stuff (background-position?).
For such a thing to be possible in MediaWiki skins, what changes are needed?
This is image spriting, which Tim mentioned as a possibility. It's not a big issue for us right now because we use so few images, and images don't block page parsing or rendering, but it might be worth considering eventually.
On Thu, Sep 24, 2009 at 10:13 AM, Platonides Platonides@gmail.com wrote:
I don't think it fits our normal image usage in pages. It could be tried for the images used by the skins, although I would worry about support for that CSS in legacy browsers.
Image spriting is very well-studied and works in all browsers of import. It's used by all the fancy high-performance sites, like Google:
http://www.google.com/images/nav_logo7.png
It would be nice if we didn't have to go to such lengths to hack around the fact that HTTP pipelining is broken, wouldn't it?
Aryeh Gregor wrote:
- Automatically create CSS sprites?
That would be neat, but perhaps a bit tricky.
Just trying to think how it'd work.
Given a CSS selector, and an image, should be able to construct a stylesheet which sets the background property of the css rules and an single image.
(#toolbar-copy, toolbar-copy.png) (#toolbar-copy:hover, toolbar-copy-hover.png)
And the generated stylesheet would get concatenated with other stylesheets.
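A rough sketch of the CSS-generating half of that idea (assuming the individual images have already been stacked into one vertical strip, e.g. by ImageMagick, that all icons share a height, and with all function and file names purely illustrative):

// Emit background rules for icons stacked vertically, in order, in one sprite.
function buildSpriteCss( selectors, spriteUrl, iconHeight ) {
    var css = '';
    for ( var i = 0; i < selectors.length; i++ ) {
        css += selectors[ i ] + ' { background: url(' + spriteUrl +
            ') no-repeat 0 -' + ( i * iconHeight ) + 'px; }\n';
    }
    return css;
}

// (#toolbar-copy, toolbar-copy.png) and (#toolbar-copy:hover, toolbar-copy-hover.png)
// become two offsets into a combined toolbar-sprite.png:
var css = buildSpriteCss( [ '#toolbar-copy', '#toolbar-copy:hover' ], 'toolbar-sprite.png', 22 );
// ...and this text then gets concatenated with the other stylesheets.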
Jared
On 9/24/09 9:31 AM, Jared Williams wrote:
- Automatically create CSS sprites?
That would be neat, but perhaps a bit tricky.
Just trying to think how it'd work.
Given a CSS selector, and an image, should be able to construct a stylesheet which sets the background property of the css rules and an single image.
(#toolbar-copy, toolbar-copy.png) (#toolbar-copy:hover, toolbar-copy-hover.png)
And the generated stylesheet would get concatenated with other stylesheets.
I work with CSS sprites all the time, and have seen some "automated" methods to go from individual images to sprites, but it's not nearly as good an idea as it sounds.
I will go into depth, but if you don't really care (totally understandable), just understand this:
*Automation of sprite creation and implementation is not an efficient use of time in most cases.*
First, to use sprites, you have to be in a situation where CSS "background-position" properties are not already being used. Take for instance the CSS we use to place little icons next to links that point to external URLs: essentially we set a background image to be positioned "right center" and then move the text out of the way with "padding-right:18px". If you were to sprite this image, you could perhaps use a vertical sprite (images tiled vertically only), but then when the user adjusts the size of the text in their browser they start seeing multiple images on the right. You could add more space between the images so that the text could get pretty big before you start seeing the other icons, but how much space is enough? What limit on text-size adjustment should we declare? Does the extra space between the icons introduce a significant amount of additional data? (Maybe not much with PNG compression techniques, but it does add something.) In many other cases the background position in both X and Y is already being used, so sprites are not a possibility at all.
To use sprites like Google does, you would need to change the HTML output to accommodate the technique. For instance you could insert a fixed-size "float:right" div as an icon at the end of the link, but then the elegant way that we apply styles to such links (rules like "a[href^=http://]") is useless... We would have to make changes to the output of the parser for purely aesthetic reasons (evil), or perform client-side DOM manipulations (requiring JavaScript to be enabled just to see the external link icon - also evil) --- this is getting messy.
My point is not that sprites are bad, it's that they aren't always an option, and they take a lot of careful design of CSS, HTML and image resources to get working properly. Automating them as is starting to be proposed here means inventing some sort of instruction set that a computer can read and assemble sprites from, but the problem is usually so complex that such a language would take much more time to invent, create parsers for, test and maintain than just doing the sprites by hand.
Automating sprite creation is still a great idea, but it needs to be done in more isolated and predictable cases like generating toolbar icons. This case is more friendly to automation because it deals with fixed-height and fixed-width images that are always displayed in the browser at the same size no matter what. These files are currently stored separately, so merging them into a single file and automatically generating CSS code that defines the offsets to put them to use would be great! However, even this case has its issues. It makes the toolbar code more complex, because we have to support sprite-based images as well as non-sprite-based images (so that users can still customize the toolbar), and we have to handle the naming of the selectors in the generated CSS in some way that won't cause confusion or namespace collisions.
Finally, the PNG or GIF files that are created by things like ImageMagick are larger (in bytes) than images compressed by hand (using image manipulation software). Even pngcrush or similar utilities fail to outperform manual image compression. The reason is that images can be reduced in size, but doing so reduces their "quality" (fewer colors in the palette make the image look more grainy, aggressive JPEG compression makes the image look more blocky). When performing image compression manually, you use your eyes and the image processing in your brain to decide where the line should be drawn between quality and optimization - the automated solutions I've used seem to either draw this line arbitrarily or err on the side of quality at the cost of optimal compression.
So - not only does the CSS and HTML need close attention when working with sprites, but the image optimization process does as well.
Again, I like sprites a lot! But in reality, they are an optimization technique that needs careful attention and can cause problems if done improperly.
- Trevor
Trevor Parscal wrote:
Again, I like sprites a lot! But in reality, they are an optimization technique that needs careful attention and can cause problems if done improperly.
Providing CSS sprite support would (I guess) just be a service for modules/extensions to use, as part of the proposed client-side resource manager(?). So just as MediaWiki or an extension can put in a request for some stylesheet or JavaScript to be linked, it could also request images, possibly via CSS sprites.
So I don't see how it should cause a problem.
Jared
On 9/24/09 1:40 PM, Jared Williams wrote:
Providing CSS sprite support would (I guess) just be a service for modules/extensions to use, as part of the proposed client-side resource manager(?). So just as MediaWiki or an extension can put in a request for some stylesheet or JavaScript to be linked, it could also request images, possibly via CSS sprites.
So I don't see how it should cause a problem.
Jared
So you are saying that you believe a generic set of sprite-generation utilities is going to be able to completely overcome the issues I identified and be a better use of time (to design, develop and use) than just creating and using sprites manually?
- Trevor
Trevor Parscal wrote:
So you are saying that you believe a generic set of sprite-generation utilities is going to be able to completely overcome the issues I identified and be a better use of time (to design, develop and use) than just creating and using sprites manually?
- Trevor
I wouldn't say there are issues with CSS sprites, but there are limitations which you have to be aware of before deciding to use them, and which therefore do not need overcoming.
In the context of providing toolbar imagery for UIs like a WYSIWYG editor, or for playing video or audio, or for simple image editing, they can remove a lot of round-tripping.
Jared
On 9/24/09 2:34 PM, Jared Williams wrote:
I wouldn't say there are issues with CSS sprites, but there are limitations which you have to be aware of before deciding to use them, and which therefore do not need overcoming.
In the context of providing toolbar imagery for UIs like a WYSIWYG editor, or for playing video or audio, or for simple image editing, they can remove a lot of round-tripping.
Jared
Sounds like we agree. The issues weren't issues with sprites, they were issues with automating them.
- Trevor
Aryeh Gregor wrote:
On Thu, Sep 24, 2009 at 4:41 AM, Tim Starling tstarling@wikimedia.org wrote:
* Removes a few RTTs for non-pipelining clients
Do you mean to imply that there's such a thing as a pipelining client on the real web? (Okay, okay, Opera.) This concern seems like it outweighs all the others put together pretty handily -- especially for script files that aren't at the end, which block page loading.
It's not really as simple as that. The major browsers use concurrency as a substitute for pipelining. Instead of queueing up multiple requests in a single TCP connection and then waiting, they queue up multiple requests in multiple connections and then wait. The effect is very similar in terms of RTTs.
By concatenating, you eliminate concurrency in the browser. The effect of this could actually be to make the initial page view slower, despite the increased TCP window size at the end of the concatenated request. The net performance impact would depend on all sorts of factors, but you can see that the concurrent case would be faster when the RTT is very long, the number of objects is large, the number of connections is equally large, and the unmerged object size is slightly smaller than the initial TCP window.
In a default install, it's not harmful to concatenate the [[MediaWiki:*.css]] pages regardless of network distance, because the pages are so small that even the merged object will fit in the initial TCP window.
There is a potential reduction in RTT count due to concatenation, that's why I included that item on the list. But it's client-dependent and might not exist at all in the most common case. That's why I'm focusing on other benefits of concatenation to justify why I'm doing it.
-- Tim Starling
On Thu, Sep 24, 2009 at 12:49 PM, Tim Starling tstarling@wikimedia.org wrote:
It's not really as simple as that. The major browsers use concurrency as a substitute for pipelining. Instead of queueing up multiple requests in a single TCP connection and then waiting, they queue up multiple requests in multiple connections and then wait. The effect is very similar in terms of RTTs.
Except that even on a page with 30 or 40 includes, the number of concurrent requests will typically be something like 4 or 8, so RTT becomes a huge issue if you have lots of includes. Not to mention that most browsers before very recently won't do concurrency at all for scripts -- script loads block parsing, so no new requests start when a script is still loading or executing. If you're talking about cutting four includes down to one, then maybe the benefit would be insignificant or even negative, but if you're talking about cutting 30 includes down to ten, AFAIK the benefit just from RTT should swamp all other considerations. This is why Yahoo!'s #1 rule for good front-end performance is "Minimize HTTP Requests":
http://developer.yahoo.com/performance/rules.html
you can see that the concurrent case would be faster when the RTT is very long, the number of objects is large, the number of connections is equally large
This last point is the major failure here. If browsers really requested everything in parallel, then we wouldn't need any of these hacks -- not combining, not spriting. But they don't, they request very few things in parallel.
There is a potential reduction in RTT count due to concatenation, that's why I included that item on the list. But it's client-dependent and might not exist at all in the most common case.
AFAIK this is not true in practice.
~some comments inline~
Tim Starling wrote:
[snip]
I started off working on fixing the coding style and the most glaring errors from the JS2 branch, but I soon decided that I shouldn't be putting so much effort into that when a lot of the code would have to be deleted or rewritten from scratch.
I agree there are some core components that should be separated out and refactored, and some core pieces that you're probably focused on do need to be removed & rewritten, as they have aged quite a bit (parts of mv_embed.js were created in SoC '06)... I did not focus on the ~best~ core loader that could have been created; I have just built on what I already had available, which has "worked" reasonably well for the application set that I was targeting. It's been an iterative process which I feel is moving in the right direction, as I will outline below.
Obviously more input is helpful, and I am open to implementing most of the changes you describe where they make sense. But exclusion and dismissal may not be as helpful... unless that is your targeted end, in which case just say so ;)
It's normal for a 3rd-party observer to say the whole system should be scrapped and rewritten. Of course, when starting from scratch it is much easier to design an ideal system and what it should/could be.
I did a survey of script loaders in other applications, to get an idea of what features would be desirable. My observations came down to the following:
- The namespacing in Google's jsapi is very nice, with everything
being a member of a global "google" object. We would do well to emulate it, but migrating all JS to such a scheme is beyond the scope of the current project.
You somewhat contradict this approach by recommending against "class" abstraction below... i.e. how will you cleanly load components and dependencies if not by a given name?
I agree we should move things into a global object, i.e. $j, and all our components/features should extend that object (like jQuery plugins). That is the direction we are already going in.
Dependency loading is not really beyond the scope... we are already supporting that. If you check out the mv_jqueryBindings function in mv_embed.js, here we have loader calls integrated into the jQuery binding. This integrates loading the high-level application interfaces into their interface call.
The idea is to move more and more of the structure of the application into that system. So right now mwLoad is a global function, but it should be refactored into the jQuery space and be called via $j.load();
- You need to deal with CSS as well as JS. All the script loaders I
looked at did that, except ours. We have a lot of CSS objects that need concatenation, and possibly minification.
Brion did not set that as a high priority when I inquired about it, but of course we should add in style grouping as well. It's not that I said we should exclude that from our script loader; it's just a matter of setting priority, and I agree it is high priority.
- JS loading can be deferred until near the </body> or until the
DOMContentLoaded event. This means that empty-cache requests will render faster. Wordpress places emphasis on this.
True. I agree that we should put the script includes at the bottom. Also, all non-core js2 scripts are already loaded via the DOMContentLoaded ready event. Ideally we should only provide "loaders" and maybe some small bit of configuration for the client-side applications they provide, as briefly described here: http://www.mediawiki.org/wiki/JS2_Overview#How_to_structure_your_JavaScript_...
- Dependency tracking is useful. The idea is to request a given
module, and all dependencies of that module, such as other scripts, will automatically be loaded first.
As mentioned above, we do some dependency tracking via binding jQuery helpers that do that setup internally at a per-application-interface level. We could add that convention directly into the script-loader function if desired, so that at a per-class level we include dependencies. For example, mwLoad('ui.dialog') would know to load ui.core, etc.
I then looked more closely at the current state of script loading in MediaWiki. I made the following observations:
- Most linked objects (styles and scripts) on a typical page view come
from the Skin. If the goal is performance enhancement, then working on the skins and OutputPage has to be a priority.
Agreed. The script loading was more urgent for my application task set, but for the common case of per-page-view performance, CSS grouping has bigger wins.
- The "class" abstraction as implemented in JS2 has very little value
to PHP callers. It's just as easy to use filenames.
The idea with "class" abstraction is that you don't know what script set you have available at any given time. Maybe one script included ui.resizable and ui.move and now your script depends on ui.resizable and ui.move and ui.drag... your loader call will only include ui.drag (since the other are already defined).
This avoids re-parse and re-including the same javascript file as part of a separate group request or src include. Alternatively you can check against including the same script when your just using raw src but a bit trickery when using scriptloader and call define checks and class/file convention is compatible with XHR getting a javascript file and evaluating the result. (which is the way some frameworks include javascript that to ensure a consistent onLoaded callback)...
Which brings us to another point about class/file bindings: they let us test the typeof of the variable a script should define and then issue a callback once we definitely know that the script is loaded and ready.
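Roughly the pattern being described, as a simplified sketch (not the actual mv_embed code; the class name in the usage comment is only illustrative):

// After adding a <script> tag for a class, poll until the global it should
// define exists, then fire the callback.  A real version would also time out.
function waitForClass( className, callback ) {
    if ( typeof window[ className ] !== 'undefined' ) {
        callback();
    } else {
        setTimeout( function () {
            waitForClass( className, callback );
        }, 25 );
    }
}

// e.g. only run player setup once a hypothetical mvPlayList class is defined:
// waitForClass( 'mvPlayList', function () { /* safe to use mvPlayList here */ } );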
The trade-off for grouping distinct class-set requests is cacheability for return visits vs. script reuse vs. fastest display time for an un-cached visit vs. server resource cost. Also, perhaps some scripts can always be grouped while other components are rarely included individually, but that changes with application development. Combining scripts is not too costly relative to the round-trip time... and we could pre-minify.
Its "optimal" to avoid the script-loader all together and just have a single small core updated file with short expire that sets the version number of each script. Then everything else could have a high expire since its tagged by version number. That would be "optimal" but a slower first load experience. And we still have to cache and package localizations per language.
I have not done a definitive evaluation of the trade offs and am open to more thoughts on that front.
It could be made more useful with features such as dependency tracking, better concatenation and CSS support. But it seems to me that the most useful abstraction for PHP code would be for client-side modules to be multi-file, potentially with supporting PHP code for each module.
We want to move away from PHP code dependencies for each JavaScript module. JavaScript should just directly hit a single exposure point of the MediaWiki API. If we have PHP code generating bits and pieces of JavaScript everywhere, it quickly gets complicated, is difficult to maintain, is much more resource-intensive, and requires a whole new framework to work right.
PHP's integration with the JavaScript should be minimal: PHP should supply configuration and package in localized messages.
- Central registration of all client-side resources in a global
variable would be onerous and should be avoided.
You can always add to the registered global. This works well by having PHP read the JavaScript file directly to ascertain the global list. That way your JavaScript works stand-alone as well as integrated with a script loader that provides localization and configuration.
- Dynamic requests such as [[MediaWiki:Handheld.css]] have a large
impact on site performance and need to be optimised. I'm planning a new interface, similar to action=raw, allowing these objects to be concatenated.
Sounds good ;) The present script loader does this for JavaScript: it takes the most recent revision number of the included pages and ties the grouped version to that. I think it has to be integrated into the page output so you can have a long expiry time.
The following design documents are in my user space on mediawiki.org:
http://www.mediawiki.org/wiki/User:Tim_Starling/CSS_and_JS_caller_survey_(r56220)
- A survey of MW functions that add CSS and JS, especially the
terribly confusing situation in Skin and OutputPage
I did a small commit r56746 to try and start to clean that up... but it is a mess.
http://www.mediawiki.org/wiki/User:Tim_Starling/JS_load_order_issues_(r56220)
- A breakdown of JS files by the issues that might be had in moving
them to the footer or DOMContentLoaded. I favour a conservative approach, with wikibits.js and the site and user JS staying in the
<head>.
A separate, somewhat related effort should be to deprecate all non-jQuery-style helpers. A lot of the functions in wikibits.js, for example, could use jQuery functions or be refactored into a few lines of jQuery, which may make it unnecessary to have those global function abstractions to begin with. I am in favor of moving things to the bottom of the page. Likewise, all new JavaScript should be compatible with being run at DOMContentLoaded time.
http://www.mediawiki.org/wiki/User:Tim_Starling/Proposed_modularisation_of_client-side_resources
- A proposed reorganisation of core scripts (Skin and OutputPage)
according to the MW modules they are most associated with.
The object model I'm leaning towards on the PHP side is:
- A client-side resource manager (CSRM) class. This would be
responsible for maintaining a list of client-side resources that have been requested and need to be sent to the skin. It would also handle caching, distribution of incoming dynamic requests, dependencies, minification, etc. This is quite a complex job and might need to be split up somewhat.
That sounds cleaner than the present OutputPage and Skin.php and the associated script-loader grafting. Having a cleaner system would be nice... but it will probably break skins and other stuff... or else keep old API mappings in OutputPage and Skin, or change almost every extension and break every 3rd-party skin out there?
You could probably have something "working" fairly quickly; the trick is compatibility with the broken old system. It is a core issue, and people working on other projects have added on the functionality needed to "get it working" with existing stuff... If you want to clean it up, I don't think anyone will protest as long as it does not take away features or require major reworking of other code.
- A hierarchy of client-side module classes. A module object would
contain a list of files, dependencies and concatenation hints. Objects would be instantiated by parent classes such as skins and special pages, and added to the CSRM. Classes could be registered globally, and then used to generate dynamic CSS and JS, such as the user preference stylesheet.
The main problem with defining all the objects and hierarchy relationships in PHP is that it won't work stand-alone. An ideal system retains the flexibility to work with the script loader or without it. Ultimately your JavaScript code will dictate what class is required, when and where. If you have to go back to PHP to define this all the time, that won't be fun.
Additionally, how do you describe call chains that happen purely in JS? Say you do a search to insert an image, then you decide you want to look for video, and now we load a video clip. The server can't map out that the client needs the native handler to be packaged with the JavaScript instead of the Cortado video handler; we have to run the detection client-side and then get the code. The server could know that if you request the Cortado handler you also need the parent video object, but it seems cleaner to map out that dependency in JavaScript instead of on the PHP side. And say you then want to run the code without the script loader: it won't work at all.
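For example (a simplified sketch; the detection test, the mwLoad call signature and the handler class names are only illustrative):

// Only the browser knows whether it can play Ogg natively, so the choice of
// which player code to fetch has to happen client-side, after page load.
function loadVideoHandler( callback ) {
    var v = document.createElement( 'video' );
    var canNative = v.canPlayType &&
        v.canPlayType( 'video/ogg; codecs="theora, vorbis"' ) !== '';
    // A JS-side dependency map makes sure the parent embedVideo class comes
    // along with whichever handler we pick.
    mwLoad( canNative ? 'nativeEmbed' : 'cortadoEmbed', callback );
}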
- The module base class would be non-abstract and featureful, with a
constructor that accepts an array-based description. This allows simple creation of modules by classes with no interest in dynamic script generation.
What are you planning on including in this array besides the path to the JavaScript file? Again, it will suck for the JavaScript author to go back into PHP and define all the dependencies instead of just listing them as needed in the JS. Furthermore, how will this work with scripts in the MediaWiki namespace? How will they define the classes and dependencies they need, if not in the JavaScript?
I think the PHP should read the JavaScript for this information, as is presently done with the script loader.
- A new script loader entry point would provide an interface to
registered modules.
The script loader is already defined as part of the JavaScript loader, so the name of the entry point does not matter so much as the calling conventions.
There are some design decisions I still have to make, which are tricky due to performance tradeoffs:
- With concatenation, there is the question of which files to combine
and which to leave separate. I would like to have a "combine" parameter which is a string, and files with the same combine parameter will be combined.
Right... see the discussion above. I think in practice ad-hoc grouping via post-page-load JavaScript interface requests will naturally group and cache common requests together, by nature of consistent JavaScript application flow, so I don't think the concatenation "hit" will be that substantial. JavaScript grouped at the page-loading level will of course want to try and avoid grouping something that will later be included by itself on a separate page.
- Like Wordpress, we could store minified and concatenated files in a
public cache and then link to that cache directly in the HTML.
That seems perfectly reasonable... Is the idea that this will help small sites that don't have things behind a squid proxy? Although small sites seem to work okay with MediaWiki pages being served via PHP reading cached files.
- The cache invalidation scheme is tricky, there's not really an ideal
system. A combination of cache-breaking parameters (like Michael's design) and short expiry times is probably the way to go. Using cache-breaking parameters alone doesn't work because there is referring HTML cached on both the server and client side, and regenerating that HTML periodically would be much more expensive than regenerating the scripts.
An option is to write out a bit of dynamic JavaScript to a single low-expiry, statically cached core script that sets the versions for everything that could be included. But that does not work well with live hacks (hence the checking of the file-modified date)... If version updates are generally highly correlated with localization updates anyway, I don't see too much of a problem with old JavaScript persisting until a page is purged and rendered with the new interface.
I don't see the benefit in hurting our cache hit rate to support ~new javascript~ with ~old html~.
New JavaScript could depend on new HTML, no (like an added configuration variable, or a new div element)? You could add that level of complexity to the CSRM concept... or just tie the JavaScript to a given HTML page. (This reuses the cached JavaScript if the JavaScript has not been updated, at the cost of re-rendering the HTML, as is done with other updates.)
Here are my notes:
- Concatenation
- Performance problems:
- Changing inclusions. When inclusions change, whole contents has to be sent again.
    * BUT people don't change skins very often.
    * So combine=all=skin should save time for most
- Expiry times have to be synchronised. Take the minimum expiry of all, and force freshness check for all.
- Makes the task of squid cache purging more difficult
- Defeats browser concurrency
Performance advantages:
- For dynamic requests:
- Avoids MW startup time.
- Avoids DoSing small servers with concurrent requests.
- For all requests:
- Reduces squid CPU
- Removes a few RTTs for non-pipelining clients
- Improves gzip compression ratio
Combine to static file idea:
- Pros:
- Fast to stream out, on all systems
- Doesn't break HughesNet
- Cons:
- Requires splitting the request into static and dynamic
- Need webserver config to add Expires header and gzip
We could support both if we build the logic into the js as is done with the present system. The present script-loader works both ways by feeding the loader info from the javascript files (although it does not send the client to cached group requests if the script-loader is off). But a simple addition of a maintenance script could output the combined script sets into a public dir based on the loader set definitions from the js.
With some help from Splarka, I've determined that it would be possible to merge the requests for [[MediaWiki:Common.css]], [[MediaWiki:Skinname.css]], [[MediaWiki:Handheld.css]] and [[MediaWiki:Print.css]], using @media blocks for the last two, for a significant performance win in almost all cases.
sounds good.
Once the architectural issues have been fixed, the stylistic issues in both ancient JS and the merged code will have to be dealt with, for example:
- Poorly-named functions, classes, files, etc. There's a need for
proper namespacing and consistency in naming style.
Yeah, there is a bit of an identity crisis based on the inherited code, but variable renaming is not too hard. Also there is a transition under way from the old style to a more jQuery style.
- Poorly-written comments
True. (No defense there.) (Except to say that I am dyslexic.)
- Unnecessary use of the global namespace. The jQuery style is nice,
with local functions inside an anonymous closure:
( function () {
    function setup() { ... }
    addOnloadHook( setup );
} )();
Right, as mentioned above I am moving in that direction; see mv_jqueryBindings(), and even read the comment right above it: "@@ eventually we should refactor mwCode over to jQuery style plugins and mv_embed.js will just handle dependency mapping and loading."
- Unsafe construction of HTML. This is ubiquitous in the mwEmbed
directory and there will be a huge potential for XSS, as soon as user input is added. HTML construction with innerHTML can be replaced by document.createElement() or its jQuery equivalent.
I build a lot of html as static strings because it's faster than generating every element with function calls. If you can inject arbitrary content into some javascript string then I imagine you can do so with createElement as well. You don't gain much by escaping already-defined javascript. If you can do something to get some value into someone else's JavaScript instance then you might as well call your evilJs directly. Perhaps I am understanding this wrong? Could you illustrate how that would be exploited in one case but not the other?
- The identity crisis. The whole js2 concept encourages code which is
poorly integrated with the rest of MediaWiki, and which is written without proper study of the existing code or thought to refactoring. It's like SkinTemplate except with a more pretentious name. I'd like to get rid of all instances of "js2", to move its scripts into other directories, and to remove the global variables which turn it on and off. Also the references to MetavidWiki and the mv prefixes should be fixed.
Yes, being "stand alone" is a primary "feature" of the concept... The whole mwEmbed system can "stand alone", which will enable us to easily share interface components with other CMSes or platforms. This enables us to share things like the add-media-wizard with a blog that wants to insert an asset from Commons or a set of free-licensed repositories. It enables 3rd parties to remotely embed video clips and do mash-ups with the timed text and mediaWiki api calls. Or just use the firefogg encoder as a stand-alone application, and/or use any edit tools we integrate for image / audio / video manipulation.
You can compare it to the Google api thing you mentioned earlier on... it's very convenient to do a single load call and get everything you need from the Google application interfaces. The api is one level of supporting external integrations. An application-level interface for external applications is another level that holds interesting possibilities in my mind, but it is a fundamentally new direction for mediaWiki.
- Lack of modularisation. The proposed registration system makes it
possible to have extensions which are almost entirely client-side code. A module like libClipEdit could be moved to its own extension. I see no problem with extensions depending on other extensions, the SMW extensions do this with no problems.
I am not entirely against extension-based modularization, and we definitely need to support it for extensions that depend on php code.
But it's nice to be able to pull any part of the application from any point. For example, in the add-media-wizard I will want to pull in the wikiEditor to support formatting in the description of the imported asset. It sucks to have to check if a component is available all the time.
Imagine the sequencer, which depends on pretty much everything in the mwEmbed directory. For it to resolve all its dependencies across a half dozen extensions and "versions of extensions" in different locations will not be fun.
And of course we will have to build a separate packaging system for the application to work as a stand-alone tool.
It would also make it nearly impossible to test any component stand alone, since it would be dependent on the mediaWiki framework to get up and running. Testing components stand alone has been very valuable.
A single client-side code repository can help ensure consistency of included modules, i.e. we won't have multiple versions of jquery, jquery ui, or any other reusable component that is used across multiple interfaces conflicting in our loading system. (Presently we have a lot of copies of jquery and its plugins in extensions, for example.)
If this is the ultimate blocker in your mind I could restructure things as scattered across extensions. It's not entirely painful to refactor that way, since everything is loaded via js script-loader helpers, but the above-mentioned issues would be a bummer.
I would prefer that we have a concept of the javascript components/folders within the mwEmbed folder being "client-side modules", as distinct from php code, so they do not need to be tied to a php code extension. Moving directories around won't inherently improve "modularity". Perhaps we need a way to just include portions of the javascript set?... We can always strip folders in releases. Perhaps it should be moved to a separate directory and only parts of it copied over at deployment time?
A few ideas for cool future features also occur to me. Once we have a system set up for generating and caching client-side resources, why not:
- Allow the user to choose a colour scheme for their wiki and
automatically generate stylesheets with the appropriate colours.
- Include images in the system. Use GD to automatically generate and
cache images with the appropriate anti-aliased background colour.
- Automatically create CSS sprites?
Don't forget about localization packing which was a primary motivation for the script-loader to begin with ;)
peace, --michael
- The namespacing in Google's jsapi is very nice, with everything
being a member of a global "google" object. We would do well to emulate it, but migrating all JS to such a scheme is beyond the scope of the current project.
You somewhat contradict this approach by recommending against "class" abstraction below... i.e. how will you cleanly load components and dependencies if not by a given name?
By module name. Each module can contain multiple files. I don't see any problem with allowing anonymous modules, as long as the caller is happy with the fact that such modules can't be used in dependencies or loaded on demand on the client side.
I agree we should move things into a global object ie: $j and all our components / features should extend that object. (like jquery plugins). That is the direction we are already going.
I think it would be better if jQuery was called window.jQuery and MediaWiki was called window.mw. Then we could share the jQuery instance with JS code that's not aware of MediaWiki, and we wouldn't need to worry about namespace conflicts between third-party jQuery plugins and MediaWiki.
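A minimal sketch of that split (the mw.loadScript helper and URL are invented for illustration, not an existing function):

// MediaWiki code hangs off its own global object...
window.mw = window.mw || {};

mw.loadScript = function ( url ) {
    var script = document.createElement( 'script' );
    script.type = 'text/javascript';
    script.src = url;
    document.getElementsByTagName( 'head' )[0].appendChild( script );
};

// ...while window.jQuery stays plain jQuery, shared with non-MediaWiki code,
// so third-party plugins and MediaWiki never fight over the same names.
jQuery( document ).ready( function () {
    mw.loadScript( '/example/extra.js' ); // illustrative URL
} );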
Dependency loading is not really beyond the scope... we are already supporting that. If you check out the mv_jqueryBindings function in mv_embed.js ... here we have loader calls integrated into the jquery binding. This integrates loading the high level application interfaces into their interface call.
Your so-called dependency functions (e.g. doLoadDepMode) just seemed to be a batch load feature, there was no actual dependency handling. Every caller was required to list the dependencies for the classes it was loading.
The idea is to move more and more of the structure of the application into that system. So right now mwLoad is a global function, but it should be refactored into the jquery space and be called via $j.load();
That would work well until jQuery introduced its own script-loader plugin with the same name and some extension needed to use it.
[...]
We could add that convention directly into the script-loader function if desired so that on a per class level we include dependencies. Like mwLoad('ui.dialog') would know to load ui.core etc.
Yes, that is what real dependency handling would do.
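For instance (the map format and resolver below are a sketch of the idea, not the current script-loader code):

// Per-module dependency map.
var mwScriptDeps = {
    'ui.dialog': [ 'ui.core' ],
    'ui.core': []
};

// Expand one requested class into the full load list, dependencies first.
function mwResolveDeps( name, resolved, seen ) {
    resolved = resolved || [];
    seen = seen || {};
    if ( seen[ name ] ) {
        return resolved;
    }
    seen[ name ] = true;
    var deps = mwScriptDeps[ name ] || [];
    for ( var i = 0; i < deps.length; i++ ) {
        mwResolveDeps( deps[ i ], resolved, seen );
    }
    resolved.push( name );
    return resolved;
}

// mwResolveDeps( 'ui.dialog' ) gives [ 'ui.core', 'ui.dialog' ]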
- The "class" abstraction as implemented in JS2 has very little value
to PHP callers. It's just as easy to use filenames.
The idea with "class" abstraction is that you don't know what script set you have available at any given time. Maybe one script included ui.resizable and ui.move, and now your script depends on ui.resizable and ui.move and ui.drag... your loader call will only include ui.drag (since the others are already defined).
I think you're missing the point. I'm saying it doesn't provide enough features. I want to add more, not take away some.
You can remove duplicates by filename.
[...]
We want to move away from php code dependencies for each javascript module. Javascript should just directly hit a single exposure point of the mediawiki api. If we have php code generating bits and pieces of javascript everywhere it quickly gets complicated, is difficult to maintain, much more resource intensive, and requires a whole new framework to work right.
Php's integration with the javascript should be minimal. php should supply configuration, and package in localized msgs.
I don't think it will be too complicated or resource intensive. JS generation in PHP is very flexible and you admit that there is a role for it. I don't think there's a problem with adding a few more features on the PHP side.
If necessary, we can split it back out to a non-MediaWiki standalone mode by generating some static JS.
What is your reason for saying this? Have you worked on some other framework where integration of PHP and JavaScript has caused problems?
- Central registration of all client-side resources in a global
variable would be onerous and should be avoided.
You can always add to the registered global. This works well by having the php read the javascript file directly to ascertain the global list. That way your javascript works stand alone as well as integrated with a script-loader that provides localization and configuration.
There's a significant CPU cost to loading and parsing JS files on every PHP request. I want to remove that behaviour. Instead, we can list client-side files in PHP. Then from the PHP list, we can generate static JS files in order to recover the standalone functionality.
[...]
That sounds cleaner than the present OutputPage and Skin.php and the associated script-loader grafting. Having a cleaner system would be nice... but it will probably break skins and other stuff... or will we keep old OutputPage and Skin api mappings, or change almost every extension and break every 3rd-party skin out there?
I think I'll probably break most third-party skins, if they have PHP code. We break them with just about every major release so there won't be much surprise there.
On this point, I think we need:
- Easier management of non-PHP skins (i.e. CSS and images only)
- Automated CSS generation (per original post)
- Easier ways to modify the document structure, with less PHP involved. XSLT?
- An interface in PHP that we can live with, so we don't feel obliged to keep breaking it.
I should be able to retain compatibility with non-skin extensions, and I won't break interfaces unnecessarily. But we're committed to an incremental development process, rather than a sequence of rewrites, and that means that some interfaces will get old and die within the 1.X.0 sequence.
[...]
- Like Wordpress, we could store minified and concatenated files in a
public cache and then link to that cache directly in the HTML.
That seems perfectly reasonable... Is the idea that this will help small sites that don't have things behind a squid proxy?
Yes, and it also benefits Wikimedia.
Although small sites seem to work okay with mediaWiki pages being served via php reading cached files.
Have you looked at the profiling? On the Wikimedia app servers, even the simplest MW request takes 23ms, and gen=js takes 46ms. A static file like wikibits.js takes around 0.5ms. And that's with APC. You say MW on small sites is OK, I think it's slow and resource-intensive.
That's not to say I'm sold on the idea of a static file cache, it brings its own problems, which I listed.
- The cache invalidation scheme is tricky, there's not really an ideal
system. A combination of cache-breaking parameters (like Michael's design) and short expiry times is probably the way to go. Using cache-breaking parameters alone doesn't work because there is referring HTML cached on both the server and client side, and regenerating that HTML periodically would be much more expensive than regenerating the scripts.
An option is to write out a bit of dynamic javascript to a single low-expiry, statically cached core script that sets the versions for everything that could be included. But that does not work well with live hacks (hence the checking of the file-modified date)... If version updates are generally highly correlated with localization updates anyway, I don't see too much problem with old javascript persisting until a page is purged and rendered with the new interface.
I don't see the benefit in hurting our cache rate to support ~new javascript~ with ~old html~.
The performance impact of refreshing a common file once every hour or two is not large. Your code sets the expiry time to a year, and changes the urid parameter regularly, which sounds great until you accidentally cache some buggy JS into squid and you have no way to reconstruct the URID parameters and thus purge the object. Then you'd be stuck with the choice of either waiting a month for all the referring HTML to expire, or clearing the entire squid cache.
If there's a need for the versions of the HTML and JS to match, that should be handled rigorously, with old versions retained at the origin server, instead of relying on squid to keep a record of every object it's served.
[...]
- Unsafe construction of HTML. This is ubiquitous in the mwEmbed
directory and there will be a huge potential for XSS, as soon as user input is added. HTML construction with innerHTML can be replaced by document.createElement() or its jQuery equivalent.
I build a lot of html as static strings because it's faster than generating every element with function calls. If you can inject arbitrary content into some javascript string then I imagine you can do so with createElement as well. You don't gain much by escaping already-defined javascript. If you can do something to get some value into someone else's JavaScript instance then you might as well call your evilJs directly. Perhaps I am understanding this wrong? Could you illustrate how that would be exploited in one case but not the other?
Say if MediaWiki emits an input box with a properly escaped attribute derived from user input
<input type="text" id="filename" value="&lt;iframe src=&quot;http://example.com/&quot;/&gt;"/>
Then consider JS code such as:
dialog.innerHTML = "<div>" + document.getElementById( 'filename' ).value + "</div>";
This unescapes the value attribute, and puts the contents straight into HTML. The iframe will be created. This is a security vulnerability.
The alternative style used by jQuery UI is:
$j( '<div/>' )
    .text( $j( '#filename' )[0].value )
    .appendTo( dialog );
Or equivalently in plain DOM:
var div = document.createElement( 'div' );
div.appendChild( document.createTextNode( document.getElementById( 'filename' ).value ) );
dialog.appendChild( div );
The single text node contains the same literal text that was in the input box, no iframe element is created. You could think of it as implicit escaping, since if you ask for HTML back:
alert( dialog.innerHTML )
The browser will show you properly escaped HTML:
<div>&lt;iframe src="http://example.com/"/&gt;</div>
In OggHandler, I found that it was necessary to use innerHTML in some cases, because there were bugs involved with creating a Java applet and then changing its attributes. But I made sure that all the HTML I created was properly escaped, so that there was no possibility of arbitrary HTML being created, either from trusted or untrusted input.
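For the cases where innerHTML really is needed, that just means running every value through a small escaping helper before concatenating; a sketch (the helper name is illustrative):

// Escape a string for safe inclusion in HTML text or attribute values.
function mwEscapeHtml( s ) {
    return String( s )
        .replace( /&/g, '&amp;' )
        .replace( /</g, '&lt;' )
        .replace( />/g, '&gt;' )
        .replace( /"/g, '&quot;' )
        .replace( /'/g, '&#039;' );
}

// e.g. el.innerHTML = '<div title="' + mwEscapeHtml( userTitle ) + '"></div>';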
It's best to escape even trusted input, for two reasons:
* Correctness. Even trusted input can contain quotation marks.
* Ease of review. Reviewers should not have to determine which of your inputs are trusted and which are untrusted in order to verify the safety of the code.
There's more on ease of review and other security issues in my article on the subject:
http://www.mediawiki.org/wiki/Security_for_developers
Security takes precedence over performance. There are better ways to improve performance than to open up your code to systematic exploit by malicious parties.
-- Tim Starling
On Fri, Sep 25, 2009 at 1:19 AM, Tim Starling tstarling@wikimedia.org wrote:
On this point, I think we need:
- Easier management of non-PHP skins (i.e. CSS and images only)
- Automated CSS generation (per original post)
- Easier ways to modify the document structure, with less PHP
involved. XSLT?
- An interface in PHP that we can live with, so we don't feel obliged
to keep breaking it.
XSLT is a non-starter unless we want fatal errors (or at least the skin completely breaking) on pages where we emit malformed XML. And there always have been some of those, and probably always will be. Probably even more significantly, XSLT is a programming language and a rather obscure one. If we're going to make MediaWiki skins so hard to make, we may as well stick with just requiring that they be in PHP.
The standard way to handle skinning in web apps, AFAICT, is to chop the interface up into templates, and stitch them together at runtime. Then skinners can modify the templates one by one, and on upgrade they only have to merge changes for the templates they've changed. Which is still a huge pain for even moderate customizations, as I can attest from personal experience. But it has the advantage that skinners only need to modify HTML and CSS, not PHP or XSLT or whatnot.
As it happens, most of the essential differences between skins can be reproduced using only CSS, if you know enough CSS. I once personally wrote, in about an hour, some CSS that made Monobook look almost pixel-for-pixel identical to Modern, with no HTML changes. The only problem is I didn't bother fixing IE, so it wasn't committable. I don't think almost any reskin should need to change the HTML at all, except maybe to add classes and such (which can be done in core). It should only be necessary if you really want to change how the interface behaves somehow (like having extra buttons), rather than just how it looks.
Aryeh Gregor wrote:
On Fri, Sep 25, 2009 at 1:19 AM, Tim Starling tstarling@wikimedia.org wrote:
On this point, I think we need:
- Easier management of non-PHP skins (i.e. CSS and images only)
- Automated CSS generation (per original post)
- Easier ways to modify the document structure, with less PHP
involved. XSLT?
- An interface in PHP that we can live with, so we don't feel obliged
to keep breaking it.
XSLT is a non-starter unless we want fatal errors (or at least the skin completely breaking) on pages where we emit malformed XML. And there always have been some of those, and probably always will be. Probably even more significantly, XSLT is a programming language and a rather obscure one. If we're going to make MediaWiki skins so hard to make, we may as well stick with just requiring that they be in PHP.
I think it makes sense to provide some way to modify the DOM after the base skin is finished making HTML. Some things can be done with CSS, but you don't want to be making heavy use of #id:after{content:"..."} to add in some advertising or analytics HTML. And some modifications are quite arcane, like reordering boxes by switching them from ordinary floats to carefully constructed absolute positioning.
You can do DOM manipulation in PHP, I just thought that using a more restricted language might help avoid some of the migration issues that keep coming up.
The standard way to handle skinning in web apps, AFAICT, is to chop the interface up into templates, and stitch them together at runtime. Then skinners can modify the templates one by one, and on upgrade they only have to merge changes for the templates they've changed. Which is still a huge pain for even moderate customizations, as I can attest from personal experience. But it has the advantage that skinners only need to modify HTML and CSS, not PHP or XSLT or whatnot.
The template engine libraries are slow, and PHP with embedded HTML (like MonoBook) leads to code which is scary from a security perspective due to the difficulty of reviewing the many echo statements. And it doesn't solve the problem, because you end up with migration issues when you need to add more items to the output or change the existing items in some fundamental way.
I mentioned the fact that Wordpress accelerates loading by moving scripts to the bottom of the page; I didn't mention that it only works for properly maintained skins, since many Wordpress skins don't call the correct footer function.
-- Tim Starling
On Fri, Sep 25, 2009 at 11:41 AM, Tim Starling tstarling@wikimedia.org wrote:
I think it makes sense to provide some way to modify the DOM after the base skin is finished making HTML. Some things can be done with CSS, but you don't want to be making heavy use of #id:after{content:"..."} to add in some advertising or analytics HTML.
Adding content is no problem. Just provide a bunch of places where arbitrary HTML can be injected by configuration. The particular cases of Analytics and ads should be cross-skin anyway, and currently you'd be best off doing them using hooks (that's how I do Analytics on my wiki). What are use-cases for *skins* being able to alter the HTML output, at anywhere near the level of precision provided by XSLT?
And some modifications are quite arcane, like reordering boxes by switching them from ordinary floats to carefully constructed absolute positioning.
That's true, yes. Later versions of CSS look like they'll provide saner ways to do things, but we're a ways off from being able to use any of those yet. (The advanced positioning stuff in CSS3 isn't even close to finished AFAIK, let alone widely implemented.)
The template engine libraries are slow, and PHP with embedded HTML (like MonoBook) leads to code which is scary from a security perspective due to the difficulty of reviewing the many echo statements. And it doesn't solve the problem, because you end up with migration issues when you need to add more items to the output or change the existing items in some fundamental way.
I don't think there's any way to entirely avoid migration issues. You'd have migration issues with XSLT too, the same way we have JavaScript that breaks when we add a wrapper div or reorder some things. The best you can do is localize the damage, so things only break if they changed that exact bit of HTML.
On Fri, 25 Sep 2009 09:48:04 -0400, Aryeh Gregor wrote:
On Fri, Sep 25, 2009 at 1:19 AM, Tim Starling tstarling@wikimedia.org wrote:
On this point, I think we need:
- Easier management of non-PHP skins (i.e. CSS and images only)
- Automated CSS generation (per original post)
- Easier ways to modify the document structure, with less PHP involved. XSLT?
- An interface in PHP that we can live with, so we don't feel obliged to
keep breaking it.
XSLT is a non-starter unless we want fatal errors (or at least the skin completely breaking) on pages where we emit malformed XML. And there always have been some of those, and probably always will be. Probably even more significantly, XSLT is a programming language and a rather obscure one. If we're going to make MediaWiki skins so hard to make, we may as well stick with just requiring that they be in PHP.
I'm not sure that's entirely accurate. XSLT works on DOM trees, so malformed XML shouldn't really apply. Of course, the standard command line processors create this tree with a standard parser, usually an XML parser. But in PHP, creating the DOM with a parser and transforming it with XSLT are handled separately.
On Fri, Sep 25, 2009 at 3:46 PM, Steve Sanbeg ssanbeg@ask.com wrote:
I'm not sure that's entirely accurate. XSLT works on DOM trees, so malformed XML shouldn't really apply. Of course, the standard command line processors create this tree with a standard parser, usually an XML parser. But in PHP, creating the DOM with a parser and transforming it with XSLT are handled separately.
Interesting. In that case, theoretically, you could use an HTML5 parser, which is guaranteed to *always* produce a DOM even on random garbage input (much like wikitext!). Now, who's up for writing an HTML5 parser in PHP whose performance is acceptable? I thought not. :P
Anyway, my other points (e.g., may as well use PHP instead if you want that much power) still hold.
-----Original Message----- From: wikitech-l-bounces@lists.wikimedia.org [mailto:wikitech-l-bounces@lists.wikimedia.org] On Behalf Of Aryeh Gregor Sent: 25 September 2009 23:01 To: Wikimedia developers Subject: Re: [Wikitech-l] JS2 design (was Re: Working towards branching MediaWiki 1.16)
On Fri, Sep 25, 2009 at 3:46 PM, Steve Sanbeg ssanbeg@ask.com
wrote:
I'm not sure that's entirely accurate. XSLT works on DOM trees, so malformed XML shouldn't really apply. Of course, the standard command line processors create this tree with a standard parser, usually an XML parser. But in PHP, creating the DOM with a parser and transforming it with XSLT are handled separately.
Interesting. In that case, theoretically, you could use an HTML5 parser, which is guaranteed to *always* produce a DOM even on random garbage input (much like wikitext!). Now, who's up for writing an HTML5 parser in PHP whose performance is acceptable? I thought not. :P
libxml2, and therefore PHP, has a tag-soup HTML 4 parser.
DOMDocument::loadHTML()
http://xmlsoft.org/html/libxml-HTMLparser.html
Jared
thanks for the constructive response :) ... comments inline
Tim Starling wrote:
I agree we should move things into a global object ie: $j and all our components / features should extend that object. (like jquery plugins). That is the direction we are already going.
I think it would be better if jQuery was called window.jQuery and MediaWiki was called window.mw. Then we could share the jQuery instance with JS code that's not aware of MediaWiki, and we wouldn't need to worry about namespace conflicts between third-party jQuery plugins and MediaWiki.
Right, but there are benefits to connecting into the jQuery plugin system that would not be as clean to wrap into our window.mw object. For example, $('#textbox').wikiEditor() is using jQuery selectors for the target, and maybe other jQuery plugin conventions like the $ alias inside a (function( $ ){ ... })( jQuery ); wrapper.
Although if you're not designing your tool as a jQuery plugin then yeah ;) ... but I think most of the tools should be designed as jQuery plug-ins.
Dependency loading is not really beyond the scope... we are already supporting that. If you check out the mv_jqueryBindings function in mv_embed.js ... here we have loader calls integrated into the jquery binding. This integrates loading the high level application interfaces into their interface call.
Your so-called dependency functions (e.g. doLoadDepMode) just seemed to be a batch load feature, there was no actual dependency handling. Every caller was required to list the dependencies for the classes it was loading.
I was referring to defining the dependencies in the module call... i.e. $j('target').addMediaWiz( config ), and having the addMediaWiz module map out the dependencies in the javascript. doLoadDepMode just lets you get around an IE bug: when inserting scripts via the dom you have no guarantee one script will execute in the order inserted. If you're concatenating your scripts, doLoadDepMode would not be needed, as order will be preserved in the concatenated file.
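(For reference, the usual workaround for that ordering bug is to chain the inserts, waiting for each script to finish before adding the next; a sketch of that pattern, not the actual doLoadDepMode code:)

function mwLoadInOrder( urls, done ) {
    if ( !urls.length ) {
        if ( done ) {
            done();
        }
        return;
    }
    var script = document.createElement( 'script' );
    script.src = urls[ 0 ];
    // onreadystatechange covers older IE, onload covers everything else.
    script.onload = script.onreadystatechange = function () {
        if ( !this.readyState || this.readyState === 'loaded' || this.readyState === 'complete' ) {
            script.onload = script.onreadystatechange = null;
            mwLoadInOrder( urls.slice( 1 ), done );
        }
    };
    document.getElementsByTagName( 'head' )[0].appendChild( script );
}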
I like mapping out the dependencies in javascript at that module level, since it makes it easier to do custom things like read the passed-in configuration and decide which dependencies you need to fulfill. If not, you have to define many dependency sets in php, or have a much more detailed model of your javascript inside php.
But I do understand that it will eventually result in lots of extra javascript module definitions that the given installation may not want. So perhaps we generate that module definition via php configuration ... or we define the set of javascript files to include that define the various module loaders we want with a given configuration.
This is sort of the approach taken with the wikiEditor, which has a few thin javascript files that make calls to add modules (like add-sidebar) to a core component (wikiEditor). That way the feature set can be controlled by the php configuration while retaining runtime flexibility for dependency mapping.
The idea is to move more and more of the structure of the application into that system. So right now mwLoad is a global function, but it should be refactored into the jquery space and be called via $j.load();
That would work well until jQuery introduced its own script-loader plugin with the same name and some extension needed to use it.
That is part of the idea of centrally hosting reusable client-side components, so we control the jquery version and plugin set. So a new version won't "come along" until it's been tested and integrated.
If the function does mediawiki-specific script-loader load stuff then yeah, it should be called mwLoad or whatnot. If some other plugin or native jquery piece comes along we can just have our plugin override it and/or store the native one as a parent (if it's of use)... if that ever happens...
We could add that convention directly into the script-loader function if desired so that on a per class level we include dependencies. Like mwLoad('ui.dialog') would know to load ui.core etc.
Yes, that is what real dependency handling would do.
Thinking about this more... I think it's a bad idea to exclusively put the dependency mapping in php. It will be difficult to avoid re-including the same things in client-side loading chains. Say you have your suggest search system: once the user starts typing we load jquery.suggest, which knows that it needs jquery ui via dependency mapping stored in php. It sends both ui and suggest to the client. Now the user, in the same page instance, decides instead to edit a section. The editTool script-loader gets called; its dependencies also include jquery.ui. How will the dependency-loader script-server know that the client already has the jquery.ui dependency from the suggest tool?
In the end you need these dependencies mapped out in the JS so that the client can intelligently request the script set it needs. In that same example, if the dependencies were mapped out in js we could avoid re-including jquery.ui.
Alternatively we can just put a crapload of js at the bottom of the page to ensure php knows what could possibly be used for every possible interface interaction chain of events... But the idea is that it will be better for page display performance not to try and predict all of that... so it's better to store dependency mapping in javascript. I could give a few more examples if that would be helpful.
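Concretely, the kind of client-side bookkeeping I mean (the class names and helper are made up for the example):

// Record of what has already been delivered to this page instance.
var mwLoadedClasses = { 'jquery.ui': true, 'jquery.suggest': true };

// Filter a wanted list down to what is actually missing before asking the
// script-loader, so the editTool request below would only fetch editTool.
function mwMissingClasses( wanted ) {
    var missing = [];
    for ( var i = 0; i < wanted.length; i++ ) {
        if ( !mwLoadedClasses[ wanted[ i ] ] ) {
            mwLoadedClasses[ wanted[ i ] ] = true;
            missing.push( wanted[ i ] );
        }
    }
    return missing;
}

// mwMissingClasses( [ 'jquery.ui', 'editTool' ] ) gives [ 'editTool' ]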
- The "class" abstraction as implemented in JS2 has very little value
to PHP callers. It's just as easy to use filenames.
The idea with "class" abstraction is that you don't know what script set you have available at any given time. Maybe one script included ui.resizable and ui.move and now your script depends on ui.resizable and ui.move and ui.drag... your loader call will only include ui.drag (since the other are already defined).
I think you're missing the point. I'm saying it doesn't provide enough features. I want to add more, not take away some. You can remove duplicates by filename.
See the above example for why it will be difficult to remove duplicates by file name if you're including dependency mappings that are not visible to the js in your script includes.
[...]
We want to move away from php code dependencies for each javascript module. Javascript should just directly hit a single exposure point of the mediawiki api. If we have php code generating bits and pieces of javascript everywhere it quickly gets complicated, is difficult to maintain, much more resource intensive, and requires a whole new framework to work right.
Php's integration with the javascript should be minimal. php should supply configuration, and package in localized msgs.
I don't think it will be too complicated or resource intensive. JS generation in PHP is very flexible and you admit that there is a role for it. I don't think there's a problem with adding a few more features on the PHP side.
If necessary, we can split it back out to a non-MediaWiki standalone mode by generating some static JS.
The nice thing about the way it's working right now is that you can just turn off the script-loader and the system continues to work... you can build a page that includes the js and it "works".
Having an export mode, scripts doing transformations, dependency management output sounds complicated. I can imagine it ~sort of~ working... but it seems much easier to go the other way around.
What is your reason for saying this? Have you worked on some other framework where integration of PHP and JavaScript has caused problems?
I am referring more to the php-javascript remoting type systems that seem to try to capture one language's functionality inside a separate language. There is inevitably leakage, and its complexity is rarely less than that of a simpler, clean separation of systems. (Not saying that you're suggesting we go to that extreme, i.e. defining most javascript classes and methods in php.)
... but trying to map dependencies in that space is a step in that direction, and will get complicated for application interactions that go beyond the initial page display without adding more complexity on the php side.
There's a significant CPU cost to loading and parsing JS files on every PHP request. I want to remove that behaviour. Instead, we can list client-side files in PHP. Then from the PHP list, we can generate static JS files in order to recover the standalone functionality.
As mentioned above, I think it would be easier to make the "export" thing work the other way around. I.e. instead of running a script to "export" the static javascript, we code our javascript in a way that works stand alone to begin with, and we "export" the information we want into the php.
I agree that the present system of parsing the top of the javascript file on every script-loader generation request is un-optimized. (The idea is that those script-loader generation calls happen rarely, but even still it should be cached at any number of levels, i.e. checking the file modification timestamp, writing out a php or serialized file, or storing it in any of the other cache levels we have available: memcache, database, etc.)
[snip]
Have you looked at the profiling? On the Wikimedia app servers, even the simplest MW request takes 23ms, and gen=js takes 46ms. A static file like wikibits.js takes around 0.5ms. And that's with APC. You say MW on small sites is OK, I think it's slow and resource-intensive.
That's not to say I'm sold on the idea of a static file cache, it brings its own problems, which I listed.
Yeah... but almost all script-loader requests will be cached. It does not need to check the DB or anything; it's just a key-file lookup (since script-loader requests pass a request key, either it's there in the cache or it's not). It should be on par with the simplest MW request, which is substantially shorter than the round-trip time for getting each script individually, not to mention gzipping, which can't otherwise be easily enabled for 3rd-party installations.
[...] The performance impact of refreshing a common file once every hour or two is not large. Your code sets the expiry time to a year, and changes the urid parameter regularly, which sounds great until you accidentally cache some buggy JS into squid and you have no way to reconstruct the URID parameters and thus purge the object. Then you'd be stuck with the choice of either waiting a month for all the referring HTML to expire, or clearing the entire squid cache.
...right... we would want to avoid lots of live hacks. But I think we want to avoid lots of live hacks anyway. A serious javascript bug would only affect the pages that were generated in those hours that the bug was present, not the 30 days that you're characterizing as the lag time of page generation.
Do you have stats on that?... It's surprising to me that pages are regenerated that rarely... How do central notice campaigns work?
[...]
Security takes precedence over performance. There are better ways to improve performance than to open up your code to systematic exploit by malicious parties.
ic.... I guess I rarely run into displaying things that are not A) your own input or B) running through the mediaWiki api... But your general point is valid. In theory this could come up (even via api calls).
But... I think it would be just as easy, if not easier, to check for "escape( val )" as for ".text( val )", which would be at the end of a long chain of jquery calls. Or you could set variable values post DOM insertion via .val() or .text(), also avoiding non-native dom construction.
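For example, reusing the #filename / dialog case from above (the class names are just for the example):

// Static shell first, with no user data in the string...
dialog.innerHTML = '<div class="mwe-label"></div>' +
    '<input type="text" class="mwe-copy" />';

// ...then fill the user-supplied value in afterwards, so it is treated as
// text / an input value rather than parsed as HTML.
$j( dialog ).find( '.mwe-label' ).text( $j( '#filename' ).val() );
$j( dialog ).find( '.mwe-copy' ).val( $j( '#filename' ).val() );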
I guess it really comes down to readability. I find tabbed html a bit more readable than a long chain of jquery elements. But it may be that people find the latter more readable... If so I can start heading in that direction. Performance-wise, I attached a quick test.. it seems pretty fast on my machine with a recent firefox build.. but older browsers / machines might be slower... at any rate we should aim for both speed and readability, and "security review" ;)
--michael
2009/9/26 Michael Dale mdale@wikimedia.org:
Performance wise I attached a quick test.. seems pretty fast on my machine with a recent firefox build .. but older browsers / machines might be slower...at any rate we should read for both for speed and readability and "security review" ;)
This mailing list scrubs attachments.
Roan Kattouw (Catrope)
On Fri, Sep 25, 2009 at 9:55 PM, Michael Dale mdale@wikimedia.org wrote:
...right... we would want to avoid lots of live hacks. But I think we want to avoid lots of live hacks anyway. A serious javascript bug would only affect the pages that where generated in thous hours that it was a bug was present not the 30 days that your characterizing the lag time of page generation.
Do you have stats on that?... its surprising to me that pages are re-generated that rarely... How do central notice campaigns work?
They insert the notice client-side using JavaScript. The HTML served is thus always the same.
On 9/27/09 4:15 AM, Aryeh Gregor wrote:
On Fri, Sep 25, 2009 at 9:55 PM, Michael Dale mdale@wikimedia.org wrote:
...right... we would want to avoid lots of live hacks. But I think we want to avoid lots of live hacks anyway. A serious javascript bug would only affect the pages that where generated in thous hours that it was a bug was present not the 30 days that your characterizing the lag time of page generation.
Do you have stats on that?... its surprising to me that pages are re-generated that rarely... How do central notice campaigns work?
They insert the notice client-side using JavaScript. The HTML served is thus always the same.
Yeah, it's kind of tricky to do right; but if you can keep the loader consistent and compatible, and have predictable expirations on the JS, such things can work pretty reliably.
-- brion
Michael Dale wrote:
That is part of the idea of centrally hosting reusable client-side components so we control the jquery version and plugin set. So a new version won't "come along" until its been tested and integrated.
You can't host every client-side component in the world in a subdirectory of the MediaWiki core. Not everyone has commit access to it. Nobody can hope to properly test every MediaWiki extension.
Most extension developers write an extension for a particular site, and distribute their code as-is for the benefit of other users. They have no interest in integration with the core. If they find some jQuery plugin on the web that defines an interface that conflicts with MediaWiki, say jQuery.load() but with different parameters, they're not going to be impressed when you tell them that to make it work with MediaWiki, they need to rewrite the plugin and get it tested and integrated.
Different modules should have separate namespaces. This is a key property of large, maintainable systems of code.
The nice thing about the way its working right now is you can just turn off the script-loader and the system continues to work ... you can build a page that includes the js and it "works"
The current system kind of works. It's not efficient or scalable and it doesn't have many features.
Having an export mode, scripts doing transformations, dependency management output sounds complicated. I can imagine it ~sort of~ working... but it seems much easier to go the other way around.
Sometimes complexity is necessary in the course of achieving other goals, such as performance, features, and ease of use for extension developers.
I agree that the present system of parsing top of the javascipt file on every script-loader generation request is un-optimized. (the idea is those script-loader generations calls happen rarely but even still it should be cached at any number of levels. (ie checking the filemodifcation timestamp, witting out a php or serialized file .. or storing it in any of the other cache levels we have available, memcahce, database, etc )
Actually it parses the whole of the JavaScript file, not the top, and it does it on every request that invokes WebStart.php, not just on mwScriptLoader.php requests. I'm talking about jsAutoloadLocalClasses.php if that's not clear.
Have you looked at the profiling? On the Wikimedia app servers, even the simplest MW request takes 23ms, and gen=js takes 46ms. A static file like wikibits.js takes around 0.5ms. And that's with APC. You say MW on small sites is OK, I think it's slow and resource-intensive.
That's not to say I'm sold on the idea of a static file cache, it brings its own problems, which I listed.
yea... but almost all script-loader request will be cached. it does not need to check the DB or anything its just a key-file lookup (since script-loader request pass a request key either its there in cache or its not ...it should be on par with the simplest MW request. Which is substantially shorter then around trip time for getting each script individually, not to mention gziping which can't otherwise be easily enabled for 3rd party installations.
I don't think that comparison can be made so lightly. For the server operator, CPU time is much more expensive than time spent waiting for the network. And I'm not proposing that the client fetches each script individually, I'm proposing that scripts be concatenated and stored in a cache file which is then referenced directly in the HTML.
I'm aware of the gzip issue, I mentioned it in my original post.
...right... we would want to avoid lots of live hacks. But I think we want to avoid lots of live hacks anyway. A serious javascript bug would only affect the pages that where generated in thous hours that it was a bug was present not the 30 days that your characterizing the lag time of page generation.
Bugs don't only come from live hacks. Most bugs come to the site from the developers who wrote the code in the first place, via subversion.
Do you have stats on that?... its surprising to me that pages are re-generated that rarely... How do central notice campaigns work?
$wgSquidMaxage is set to 31 days (2678400 seconds) for all wikis except wikimediafoundation.org. It's necessary to have a very long expiry time in order to fill the caches and achieve a high hit rate, because Wikimedia's access pattern is very broad, with the "long tail" dominating the request rate.
The CentralNotice extension was created to overcome this problem and display short-lived messages. Aryeh described how it works.
-- Tim Starling
Tim Starling wrote:
Michael Dale wrote:
That is part of the idea of centrally hosting reusable client-side components so we control the jquery version and plugin set. So a new version won't "come along" until its been tested and integrated.
You can't host every client-side component in the world in a subdirectory of the MediaWiki core. Not everyone has commit access to it. Nobody can hope to properly test every MediaWiki extension.
Most extension developers write an extension for a particular site, and distribute their code as-is for the benefit of other users. They have no interest in integration with the core. If they find some jQuery plugin on the web that defines an interface that conflicts with MediaWiki, say jQuery.load() but with different parameters, they're not going to be impressed when you tell them that to make it work with MediaWiki, they need to rewrite the plugin and get it tested and integrated.
Different modules should have separate namespaces. This is a key property of large, maintainable systems of code.
Right... I agree the client-side code needs to be more deployably modular.
If designing a given component as a jQuery plug-in, then I think it makes sense to put it in the jQuery namespace... otherwise you won't be able to reference jQuery things in a predictable way. Alternatively you
I agree that the present system of parsing top of the javascipt file on every script-loader generation request is un-optimized. (the idea is those script-loader generations calls happen rarely but even still it should be cached at any number of levels. (ie checking the filemodifcation timestamp, witting out a php or serialized file .. or storing it in any of the other cache levels we have available, memcahce, database, etc )
Actually it parses the whole of the JavaScript file, not the top, and it does it on every request that invokes WebStart.php, not just on mwScriptLoader.php requests. I'm talking about jsAutoloadLocalClasses.php if that's not clear.
Ah right... previously I had it in php. I wanted to avoid listing it twice, but obviously that's a pretty costly way to do that. This will make more sense to put in php if we start splitting up components into the extension folders and generating the path list dynamically for a given feature set.
Have you looked at the profiling? On the Wikimedia app servers, even the simplest MW request takes 23ms, and gen=js takes 46ms. A static file like wikibits.js takes around 0.5ms. And that's with APC. You say MW on small sites is OK, I think it's slow and resource-intensive.
That's not to say I'm sold on the idea of a static file cache, it brings its own problems, which I listed.
yea... but almost all script-loader request will be cached. it does not need to check the DB or anything its just a key-file lookup (since script-loader request pass a request key either its there in cache or its not ...it should be on par with the simplest MW request. Which is substantially shorter then around trip time for getting each script individually, not to mention gziping which can't otherwise be easily enabled for 3rd party installations.
I don't think that that comparison can be made so lightly. For the server operator, CPU time is much more expensive than time spent waiting for the network. And I'm not proposing that the client fetches each script individually, I'm proposing that scripts be concatentated and stored in a cache file which is then referenced directly in the HTML.
I understand. We could even check gzipping support at page output time and point to the gzipped cached versions (analogous to making direct links to the /script-cache folder of the present script-loader setup).
My main question is how will this work for dynamic groups of scripts set post page load that are dictated by user interaction or client state?
It's not as easy to set up static combined output files to point to when you don't know what set of scripts you will be requesting ahead of time.
$wgSquidMaxage is set to 31 days (2678400 seconds) for all wikis except wikimediafoundation.org. It's necessary to have a very long expiry time in order to fill the caches and achieve a high hit rate, because Wikimedia's access pattern is very broad, with the "long tail" dominating the request rate.
Okay... so to preserve a high cache level you could then have a single static file that lists the versions of js with a low expiry, and the rest with a high expiry? Or maybe it's so cheap to serve static files that it does not matter and we just leave everything with a low expiry?
--michael
~ d'oh ~ Disregard the previous; a bad keystroke sent it rather than saving to draft.
Tim Starling wrote:
Michael Dale wrote:
That is part of the idea of centrally hosting reusable client-side components so we control the jquery version and plugin set. So a new version won't "come along" until its been tested and integrated.
You can't host every client-side component in the world in a subdirectory of the MediaWiki core. Not everyone has commit access to it. Nobody can hope to properly test every MediaWiki extension.
Most extension developers write an extension for a particular site, and distribute their code as-is for the benefit of other users. They have no interest in integration with the core. If they find some jQuery plugin on the web that defines an interface that conflicts with MediaWiki, say jQuery.load() but with different parameters, they're not going to be impressed when you tell them that to make it work with MediaWiki, they need to rewrite the plugin and get it tested and integrated.
Different modules should have separate namespaces. This is a key property of large, maintainable systems of code.
Right... I agree the client-side code needs to be more deployably modular. It's just tricky to manage all those relationships in php, but it appears it will be necessary to do so...
If designing a given component as a jQuery plug-in, then I think it makes sense to put it in the jQuery namespace... otherwise you won't be able to reference jQuery things in a local and no-conflict-compatible way. Unless we create a mw wrapper of some sort, but I don't know how necessary that is atm... I guess it would be slightly cleaner.
I agree that the present system of parsing top of the javascipt file on every script-loader generation request is un-optimized. (the idea is those script-loader generations calls happen rarely but even still it should be cached at any number of levels. (ie checking the filemodifcation timestamp, witting out a php or serialized file .. or storing it in any of the other cache levels we have available, memcahce, database, etc )
Actually it parses the whole of the JavaScript file, not the top, and it does it on every request that invokes WebStart.php, not just on mwScriptLoader.php requests. I'm talking about jsAutoloadLocalClasses.php if that's not clear.
Ah right... previously I had it in php. I wanted to avoid listing it twice, but obviously that's a pretty costly way to do that. This will make more sense to put in php if we start splitting up components into the extension folders and generating the path list dynamically for a given feature set.
Have you looked at the profiling? On the Wikimedia app servers, even the simplest MW request takes 23ms, and gen=js takes 46ms. A static file like wikibits.js takes around 0.5ms. And that's with APC. You say MW on small sites is OK, I think it's slow and resource-intensive.
That's not to say I'm sold on the idea of a static file cache, it brings its own problems, which I listed.
yea... but almost all script-loader request will be cached. it does not need to check the DB or anything its just a key-file lookup (since script-loader request pass a request key either its there in cache or its not ...it should be on par with the simplest MW request. Which is substantially shorter then around trip time for getting each script individually, not to mention gziping which can't otherwise be easily enabled for 3rd party installations.
I don't think that that comparison can be made so lightly. For the server operator, CPU time is much more expensive than time spent waiting for the network. And I'm not proposing that the client fetches each script individually, I'm proposing that scripts be concatentated and stored in a cache file which is then referenced directly in the HTML.
I understand. (It's analogous to making direct links to the /script-cache folder instead of requesting the files through the script-loader entry point.)
My main question is how will this work for dynamic groups of scripts set post page load that are dictated by user interaction or client state?
Do we just ignore this possibility and grab any necessary module components based on pre-defined module sets in php that get passed down to javascript?
It's not as easy to set up static combined output files to point to when you don't know what set of scripts you will be requesting...
Hmm... if we had a predictable key format we could do a request for the static file. If we get a 404 then we do a dynamic request to generate the static file?.. Subsequent interactions would hit that static file? That seems ugly though.
$wgSquidMaxage is set to 31 days (2678400 seconds) for all wikis except wikimediafoundation.org. It's necessary to have a very long expiry time in order to fill the caches and achieve a high hit rate, because Wikimedia's access pattern is very broad, with the "long tail" dominating the request rate.
Okay... so to preserve a high cache level you could have a single static file that lists the versions of js with a low expiry, and the rest with a high expiry? Or maybe it's so cheap to serve static files that it does not matter and we just leave everything with a low expiry?
--michael
Side note: multiple versions of jQuery can live happily on the same page. jQuery handles isolation and noConflict so well that it can work on the same page as incompatible versions of itself (which isn't the case for basically any other js library, 90% of which add stuff to prototypes).
I like to use a variable like `jQuery13`, basically saving the jQuery variable in an alternate variable identified by major version number. Should an upgrade come along it's a little easier to migrate code.
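E.g. (assumes the 1.3 copy is the most recently loaded one; the plugin name and selector are just illustrative):

// Hand the $ and jQuery globals back to whatever owned them before,
// keeping a version-specific reference for code written against 1.3.
var jQuery13 = jQuery.noConflict( true );

// Plugins written for that version bind to the saved copy...
jQuery13.fn.oldPlugin = function () {
    return this; // placeholder body
};

// ...while a newer jQuery can later own window.jQuery / $ without clashing.
jQuery13( '#example' ).oldPlugin();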
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
Tim Starling wrote:
Michael Dale wrote:
That is part of the idea of centrally hosting reusable client-side components so we control the jquery version and plugin set. So a new version won't "come along" until its been tested and integrated.
You can't host every client-side component in the world in a subdirectory of the MediaWiki core. Not everyone has commit access to it. Nobody can hope to properly test every MediaWiki extension.
Most extension developers write an extension for a particular site, and distribute their code as-is for the benefit of other users. They have no interest in integration with the core. If they find some jQuery plugin on the web that defines an interface that conflicts with MediaWiki, say jQuery.load() but with different parameters, they're not going to be impressed when you tell them that to make it work with MediaWiki, they need to rewrite the plugin and get it tested and integrated.
Different modules should have separate namespaces. This is a key property of large, maintainable systems of code. ... -- Tim Starling
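A purely illustrative sketch of the separate-namespaces idea mentioned above (the object names are invented, not an existing MediaWiki convention):

// Everything a module defines hangs off a single global object,
// so independently written modules cannot clobber each other's names.
window.mw = window.mw || {};
mw.exampleUploader = {
    setup: function () { /* ... */ },
    start: function () { /* ... */ }
};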
On 9/24/09 1:41 AM, Tim Starling wrote:
Trevor Parscal wrote:
If you are really doing a JS2 rewrite/reorganization, would it be possible for some of us (especially those of us who deal almost exclusively with JavaScript these days) to get a chance to ask questions/give feedback/help in general?
I've mostly been working on analysis and planning so far. I made a few false starts with the code and so ended up planning in a more detailed way than I initially intended. I've discussed various issues with the people in #mediawiki, including our resident client-side guru Splarka.
I started off working on fixing the coding style and the most glaring errors from the JS2 branch, but I soon decided that I shouldn't be putting so much effort into that when a lot of the code would have to be deleted or rewritten from scratch.
I did a survey of script loaders in other applications, to get an idea of what features would be desirable. My observations came down to the following:
- The namespacing in Google's jsapi is very nice, with everything
being a member of a global "google" object. We would do well to emulate it, but migrating all JS to such a scheme is beyond the scope of the current project.
- You need to deal with CSS as well as JS. All the script loaders I
looked at did that, except ours. We have a lot of CSS objects that need concatenation, and possibly minification.
- JS loading can be deferred until near the </body> or until the
DOMContentLoaded event. This means that empty-cache requests will render faster. Wordpress places emphasis on this.
- Dependency tracking is useful. The idea is to request a given
module, and all dependencies of that module, such as other scripts, will automatically be loaded first.
I then looked more closely at the current state of script loading in MediaWiki. I made the following observations:
- Most linked objects (styles and scripts) on a typical page view come
from the Skin. If the goal is performance enhancement, then working on the skins and OutputPage has to be a priority.
- The "class" abstraction as implemented in JS2 has very little value
to PHP callers. It's just as easy to use filenames. It could be made more useful with features such as dependency tracking, better concatenation and CSS support. But it seems to me that the most useful abstraction for PHP code would be for client-side modules to be multi-file, potentially with supporting PHP code for each module.
- Central registration of all client-side resources in a global
variable would be onerous and should be avoided.
- Dynamic requests such as [[MediaWiki:Handheld.css]] have a large
impact on site performance and need to be optimised. I'm planning a new interface, similar to action=raw, allowing these objects to be concatenated.
The following design documents are in my user space on mediawiki.org:
http://www.mediawiki.org/wiki/User:Tim_Starling/CSS_and_JS_caller_survey_(r56220)
- A survey of MW functions that add CSS and JS, especially the
terribly confusing situation in Skin and OutputPage
http://www.mediawiki.org/wiki/User:Tim_Starling/JS_load_order_issues_(r56220)
- A breakdown of JS files by the issues that might be had in moving
them to the footer or DOMContentLoaded. I favour a conservative approach, with wikibits.js and the site and user JS staying in the
<head>.
http://www.mediawiki.org/wiki/User:Tim_Starling/Proposed_modularisation_of_client-side_resources
- A proposed reorganisation of core scripts (Skin and OutputPage)
according to the MW modules they are most associated with.
The object model I'm leaning towards on the PHP side is:
- A client-side resource manager (CSRM) class. This would be
responsible for maintaining a list of client-side resources that have been requested and need to be sent to the skin. It would also handle caching, distribution of incoming dynamic requests, dependencies, minification, etc. This is quite a complex job and might need to be split up somewhat.
- A hierarchy of client-side module classes. A module object would
contain a list of files, dependencies and concatenation hints. Objects would be instantiated by parent classes such as skins and special pages, and added to the CSRM. Classes could be registered globally, and then used to generate dynamic CSS and JS, such as the user preference stylesheet.
- The module base class would be non-abstract and featureful, with a
constructor that accepts an array-based description. This allows simple creation of modules by classes with no interest in dynamic script generation.
- A new script loader entry point would provide an interface to
registered modules.
There are some design decisions I still have to make, which are tricky due to performance tradeoffs:
- With concatenation, there is the question of which files to combine
and which to leave separate. I would like to have a "combine" parameter which is a string, and files with the same combine parameter will be combined.
- Like Wordpress, we could store minified and concatenated files in a
public cache and then link to that cache directly in the HTML.
- The cache invalidation scheme is tricky; there's not really an ideal
system. A combination of cache-breaking parameters (like Michael's design) and short expiry times is probably the way to go. Using cache-breaking parameters alone doesn't work because there is referring HTML cached on both the server and client side, and regenerating that HTML periodically would be much more expensive than regenerating the scripts.
Here are my notes:
- Concatenation
- Performance problems:
- Changing inclusions. When inclusions change, the whole contents have
to be sent again.
  * BUT people don't change skins very often.
  * So combine=all=skin should save time for most.
  * Expiry times have to be synchronised. Take the minimum expiry of
    all, and force a freshness check for all.
  * Makes the task of squid cache purging more difficult.
  * Defeats browser concurrency.
- Performance advantages:
- For dynamic requests:
- Avoids MW startup time.
- Avoids DoSing small servers with concurrent requests.
- For all requests:
- Reduces squid CPU
- Removes a few RTTs for non-pipelining clients
- Improves gzip compression ratio
- Combine to static file idea:
- Pros:
- Fast to stream out, on all systems
- Doesn't break HughesNet
- Cons:
- Requires splitting the request into static and dynamic
- Need webserver config to add Expires header and gzip
With some help from Splarka, I've determined that it would be possible to merge the requests for [[MediaWiki:Common.css]], [[MediaWiki:Skinname.css]], [[MediaWiki:Handheld.css]] and [[MediaWiki:Print.css]], using @media blocks for the last two, for a significant performance win in almost all cases.
Once the architectural issues have been fixed, the stylistic issues in both ancient JS and the merged code will have to be dealt with, for example:
- Poorly-named functions, classes, files, etc. There's a need for
proper namespacing and consistency in naming style.
- Poorly-written comments
- Unnecessary use of the global namespace. The jQuery style is nice,
with local functions inside an anonymous closure:
( function () { function setup() { ... } addOnloadHook( setup ); } )();
- Unsafe construction of HTML. This is ubiquitous in the mwEmbed
directory and there will be a huge potential for XSS as soon as user input is added. HTML construction with innerHTML can be replaced by document.createElement() or its jQuery equivalent (see the short sketch after this list).
- The identity crisis. The whole js2 concept encourages code which is
poorly integrated with the rest of MediaWiki, and which is written without proper study of the existing code or thought to refactoring. It's like SkinTemplate except with a more pretentious name. I'd like to get rid of all instances of "js2", to move its scripts into other directories, and to remove the global variables which turn it on and off. Also the references to MetavidWiki and the mv prefixes should be fixed.
- Lack of modularisation. The proposed registration system makes it
possible to have extensions which are almost entirely client-side code. A module like libClipEdit could be moved to its own extension. I see no problem with extensions depending on other extensions; the SMW extensions already do this.
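On the innerHTML point above, a minimal before/after sketch (the variable names are hypothetical; $j is assumed to be a noConflict jQuery alias):

// Unsafe: user-supplied text is interpolated straight into markup.
el.innerHTML = '<a href="' + url + '">' + title + '</a>';

// Safer: build the node and let the DOM / jQuery handle the escaping.
$j( '<a></a>' )
    .attr( 'href', url )
    .text( title )
    .appendTo( el );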
A few ideas for cool future features also occur to me. Once we have a system set up for generating and caching client-side resources, why not:
- Allow the user to choose a colour scheme for their wiki and
automatically generate stylesheets with the appropriate colours.
- Include images in the system. Use GD to automatically generate and
cache images with the appropriate anti-aliased background colour.
- Automatically create CSS sprites?
-- Tim Starling
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
It's great to see that this is being paid attention to. I would agree with you that the current implementation of JS2 is not what I see as ideal either.
The use of "class" loading seems a little strange to me as well - I mean, there's not really such thing as a class in JavaScript, nor does the class loaded only load a specific JavaScript object or function, so it's really more of a file loader - if we drop the .js from the file names in a system where some resources are MediaWiki messages who's names also end in .js, thats a purely aesthetic maneuver - I'm find either way, but let's not call it something it's not. It's a file loader.
The dependency thing is an interesting problem, but I think it could be handled more elegantly than having to define meta-information. Just an idea for a solution...
1. Other than jQuery and a MediaWiki jQuery plugin, scripts can be loaded on the client in any order.
2. Each script after that adds code to a queuing system provided by the MediaWiki plugin.
3. Code is identified by a name and may include an optional list of the names of any dependencies.
4. When document ready happens, the queuing system generates an order for execution based on the given dependencies.
5. Even after document ready, the queuing system can continue its work whenever a script is added - such that if "bar", which depends on "foo", is registered before document.ready, and then sometime well after document.ready "foo" is run using the queuing system, "bar" will be executed directly after because its dependency has finally been met.
// Hypothetical code...
// Example of points 3 and 4 ($.run is provided by the MediaWiki jQuery plugin)
$.run( 'foo', function() { /* foo code */ } );
$.run( 'bar', ['foo'], function() { /* bar code */ } );
// document ready happens: foo is executed, then bar is executed
// Example of point 5
$.run( 'bar', ['foo'], function() { /* bar code */ } );
// document.ready happens .. time passes
$.run( 'foo', function() { /* foo code */ } );
// bar is executed now
I think there is a clever way to merge a solution for dynamic script loading into this as well... But essentially this solves most problems already.
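To make that concrete, here is a rough, unoptimised sketch of how such a queue might be implemented - none of this is existing MediaWiki or jQuery API, it just follows the hypothetical $.run example above:

( function ( $ ) {
    var readyFired = false;
    var done = {};     // names whose code has already been run
    var waiting = [];  // { name, deps, fn } entries not yet runnable

    function depsMet( deps ) {
        for ( var i = 0; i < deps.length; i++ ) {
            if ( !done[ deps[i] ] ) {
                return false;
            }
        }
        return true;
    }

    function flush() {
        if ( !readyFired ) {
            return; // nothing runs before document ready (point 4)
        }
        var progressed = true;
        while ( progressed ) {
            progressed = false;
            for ( var i = 0; i < waiting.length; i++ ) {
                if ( depsMet( waiting[i].deps ) ) {
                    var entry = waiting.splice( i, 1 )[0];
                    entry.fn();
                    done[ entry.name ] = true;
                    progressed = true;
                    break; // restart the scan; new dependencies may now be met
                }
            }
        }
    }

    // $.run( name, fn ) or $.run( name, deps, fn )
    $.run = function ( name, deps, fn ) {
        if ( fn === undefined ) { // two-argument form: no dependencies
            fn = deps;
            deps = [];
        }
        waiting.push( { name: name, deps: deps, fn: fn } );
        flush(); // after ready, runs immediately if dependencies are met (point 5)
    };

    $( document ).ready( function () {
        readyFired = true;
        flush();
    } );
} )( jQuery );

A real implementation would also want cycle detection and error handling around the queued functions, but the bookkeeping itself is small.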
Ideally dynamic script loading would never be needed, as it introduces additional latency to user interaction, and no amount of spinner graphics will ever replace faster interaction. Lazy script loading, however, is awesome and should be considered in these design changes. For lazy loading, we could tell $wgOut that a script being included is either to be included immediately, or after document.ready - in which case a bit-o-JavaScript could be added to the page listing which files to load, which could be acted upon after the document is ready.
Let's also try and pay attention to the issue of i18n for dynamic UI elements. So far I've been defining a long list of messages to include in a JSON object in my PHP code, then using them in my JavaScript code. Michael has some magic going on in his script loader that does some injection of message values based on their presence in the js file (not totally clear on the details there). Once again, I think I would like to see messages required for use in JavaScript be defined in JavaScript - so something like what Michael is doing seems ideal...
// Code in .js file
loadMessages( ['foo', 'bar'] );
// Code in JavaScript sent to client after magic transformations made by PHP code
loadMessages( { 'foo': 'Foo', 'bar': 'Bar' } );
Thus allowing us to define messages we want loaded in the JavaScript space without making additional (and high-latency) calls to the server just to get some text. Even in the case of dynamic script loading, the messages of the incoming script just get added to the collection on the client. I think this is similar if not identical to what Michael's code does.
Bottom line: meta-info about things that go on in JavaScript land being defined and dealt with in PHP land is not a good thing, and it should be avoided. The good news is, there are all sorts of clever ways to avoid it.
I'm still digesting some of the other topics being brought up - there are so many good points - I'm sure I will have more input soon...
- Trevor
Here's what I'm taking out of this thread:
* Platonides mentions the case of power-users with tens of scripts loaded via gadgets or user JS with importScript().
* Tisza asks that core onload hooks and other functions be overridable by user JS.
* Trevor and Michael both mention i18n as an important consideration which I have not discussed.
* Michael wants certain components in the js2 directory to be usable as standalone client-side libraries, which operate without MediaWiki or any other server-side application.
-- Tim Starling
I've got another one, not from the thread of course: I'd like addOnloadHook to be replaced by jQuery's ready, which does a much better job of handling load events.
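For comparison, roughly (assuming $j is the noConflict jQuery alias):

// wikibits style: runs on window load, i.e. after images etc. have loaded
addOnloadHook( function () {
    /* setup code */
} );

// jQuery style: runs at DOMContentLoaded, usually much earlier
$j( document ).ready( function () {
    /* setup code */
} );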
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
Tim Starling wrote:
Here's what I'm taking out of this thread:
- Platonides mentions the case of power-users with tens of scripts loaded via
gadgets or user JS with importScript().
- Tisza asks that core onload hooks and other functions be overridable by user JS.
- Trevor and Michael both mention i18n as an important consideration which I
have not discussed.
- Michael wants certain components in the js2 directory to be usable as
standalone client-side libraries, which operate without MediaWiki or any other server-side application.
-- Tim Starling
We have js2AddOnloadHook, which gives you jQuery in noConflict mode as the $j variable. The idea behind using a different name is to separate jQuery-based code from the older non-jQuery-based code... but if we take a more iterative approach we could replace the addOnloadHook function.
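For example (everything other than js2AddOnloadHook and $j here is purely illustrative):

js2AddOnloadHook( function () {
    $j( '#bodyContent' ).addClass( 'js2-enhanced' ); // hypothetical class name
} );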
--michael
Daniel Friesen wrote:
I got another, not from the thread of course. I'd like addOnloadHook to be replaced by jQuery's ready which does a much better job of handling load events.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
Tim Starling wrote:
Here's what I'm taking out of this thread:
- Platonides mentions the case of power-users with tens of scripts loaded via
gadgets or user JS with importScript().
- Tisza asks that core onload hooks and other functions be overridable by user JS.
- Trevor and Michael both mention i18n as an important consideration which I
have not discussed.
- Michael wants certain components in the js2 directory to be usable as
standalone client-side libraries, which operate without MediaWiki or any other server-side application.
-- Tim Starling
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Mon, Sep 28, 2009 at 1:21 AM, Tim Starling tstarling@wikimedia.org wrote:
- Platonides mentions the case of power-users with tens of scripts loaded via
gadgets or user JS with importScript().
Also remember the possibility that sysops will want to include these scripts (conditionally or unconditionally) from MediaWiki:Common.js or such. Look at the top of http://en.wikipedia.org/wiki/MediaWiki:Common.js, which imports specific scripts only on edit/preview/upload; only on watchlist view; only for sysops; only for IE6; and possibly others. It also imports Wikiminiatlas unconditionally, it seems. I don't see offhand how sysop-created server-side conditional includes could be handled, but it's worth considering at least unconditional includes, since sysops might want to split code across multiple pages for ease of editing.
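That kind of conditional loading looks roughly like this (the conditions and subpage names are illustrative, not the exact ones used on en.wikipedia):

// Only load the edit helpers when editing, previewing or uploading
if ( wgAction == 'edit' || wgAction == 'submit' || wgPageName == 'Special:Upload' ) {
    importScript( 'MediaWiki:Common.js/edit.js' );
}
// Only load the watchlist helpers on the watchlist
if ( wgCanonicalSpecialPageName == 'Watchlist' ) {
    importScript( 'MediaWiki:Common.js/watchlist.js' );
}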
Aryeh Gregor wrote:
Also remember the possibility that sysops will want to include these scripts (conditionally or unconditionally) from MediaWiki:Common.js or such. Look at the top of http://en.wikipedia.org/wiki/MediaWiki:Common.js, which imports specific scripts only on edit/preview/upload; only on watchlist view; only for sysops; only for IE6; and possibly others. It also imports Wikiminiatlas unconditionally, it seems. I don't see offhand how sysop-created server-side conditional includes could be handled, but it's worth considering at least unconditional includes, since sysops might want to split code across multiple pages for ease of editing.
This highlights the complexity of managing all JavaScript dependencies on the server side... If possible the script-loader should dynamically handle these requests. For Wikimedia it's behind a Squid proxy so it should not be too bad. For small wikis we could set up a dedicated entry point that first checks the file cache key before loading all of WebStart.php, parsing JavaScript classes and all the other costly MediaWiki web engine stuff.
Has anyone done any scalability studies comparing a minimal PHP @readfile script vs. Apache serving the file directly? Obviously Apache will serve the file a lot faster, but a question I have is: at what file size does it saturate disk reads as opposed to CPU?
--michael
On Wed, Sep 30, 2009 at 3:32 PM, Michael Dale mdale@wikimedia.org wrote:
Has anyone done any scalability studies comparing a minimal PHP @readfile script vs. Apache serving the file directly? Obviously Apache will serve the file a lot faster, but a question I have is: at what file size does it saturate disk reads as opposed to CPU?
It will never be disk-bound unless the site is tiny and/or has too little RAM. The files can be expected to remain in the page cache perpetually as long as there's a constant stream of requests coming in. If the site is tiny, performance isn't a big issue (at least not for the site operators). If the server has so little free RAM that a file that's being read every few minutes and is under a megabyte in size is consistently evicted from the cache, then you have bigger problems to worry about.
... that makes sense... (on the side, I was looking into a fall-back Ogg video serving solution that would hit the disk issue) ... but in this context you're right... it's about saturating the network port.
Since network ports are generally pretty fast, a test on my laptop may be helpful (running PHP 5.2.6-3ubuntu4.2 & Apache/2.2.11 on a 2 GHz Intel Centrino):
Let's take a big script-loader request running from "memory", say the Firefogg advanced encoder JavaScript set (from the trunk... I made the small modifications Tim suggested, i.e. don't parse the JavaScript file to get the class list):

#ab -n 1000 -c 100 "http://localhost/wiki_trunk/js2/mwEmbed/jsScriptLoader.php?urid=18&class..."

The result is:
Concurrency Level:      100
Time taken for tests:   1.134 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      64019000 bytes
HTML transferred:       63787000 bytes
Requests per second:    881.54 [#/sec] (mean)
Time per request:       113.437 [ms] (mean)
Time per request:       1.134 [ms] (mean, across all concurrent requests)
Transfer rate:          55112.78 [Kbytes/sec] received
So we are hitting nearly 900 requests per second on my two-year-old laptop. Now if we take the static minified combined file, which is 239906 bytes instead of 64019 (the script-loader output is gzipped, hence smaller), we should of course get much higher RPS going direct to Apache:
#ab -n 1000 -c 100 http://localhost/static_combined.js

Concurrency Level:      100
Time taken for tests:   0.604 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      240385812 bytes
HTML transferred:       240073188 bytes
Requests per second:    1655.18 [#/sec] (mean)
Time per request:       60.416 [ms] (mean)
Time per request:       0.604 [ms] (mean, across all concurrent requests)
Transfer rate:          388556.37 [Kbytes/sec] received
Here we get nearly 400 MB/s and around 2x the requests per second...
At the cost of about half the requests per second, you can send the content to people at about a third of the size (i.e. faster). Of course none of this applies to the Wikimedia setup, where these would all be Squid proxy hits.
I hope this shows that we don't necessarily "have to" point clients to static files, and that PHP pre-processing of the cache is not quite as costly as Tim outlined (if we set up an entry point that first checks the disk cache before loading in all of the MediaWiki PHP).
Additionally, most MediaWiki installs out there are probably not serving up thousands of requests per second (and those that are probably have proxies set up), so the gzipping PHP proxying of JS requests is worthwhile.
--michael
Aryeh Gregor wrote:
On Wed, Sep 30, 2009 at 3:32 PM, Michael Dale mdale@wikimedia.org wrote:
Has anyone done any scalability studies comparing a minimal PHP @readfile script vs. Apache serving the file directly? Obviously Apache will serve the file a lot faster, but a question I have is: at what file size does it saturate disk reads as opposed to CPU?
It will never be disk-bound unless the site is tiny and/or has too little RAM. The files can be expected to remain in the page cache perpetually as long as there's a constant stream of requests coming in. If the site is tiny, performance isn't a big issue (at least not for the site operators). If the server has so little free RAM that a file that's being read every few minutes and is under a megabyte in size is consistently evicted from the cache, then you have bigger problems to worry about.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l