Wikitech-l September 2010

wikitech-l@lists.wikimedia.org

102 participants
93 discussions

by Hugo Vincent

Hi everyone, I recently set up a MediaWiki (http://server.bluewatersys.com/w90n740/) and I need to extract the content from it and convert it into LaTeX syntax for printed documentation. I have googled for a suitable OSS solution but nothing was apparent. I would prefer a script written in Python, but any recommendations would be very welcome. Do you know of anything suitable? Kind Regards, Hugo Vincent, Bluewater Systems.
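
To make the request concrete, here is a minimal sketch of the kind of conversion being asked about. It is not an existing tool: the function name wikitext_to_latex is hypothetical, it handles only level-2 and level-3 headings, bold and italic, and it does not escape LaTeX special characters.

```python
import re


def wikitext_to_latex(text: str) -> str:
    """Convert a small subset of MediaWiki markup to LaTeX (illustrative only)."""
    # Headings: handle "=== x ===" before "== x ==" so the longer marker wins.
    text = re.sub(r"^===[ \t]*(.+?)[ \t]*===[ \t]*$", r"\\subsection{\1}", text, flags=re.M)
    text = re.sub(r"^==[ \t]*(.+?)[ \t]*==[ \t]*$", r"\\section{\1}", text, flags=re.M)
    # Inline styles: handle bold (''') before italic ('') for the same reason.
    text = re.sub(r"'''(.+?)'''", r"\\textbf{\1}", text)
    text = re.sub(r"''(.+?)''", r"\\emph{\1}", text)
    return text
```

A real converter would also need to cover templates, tables, links and images, which is why a parser-based approach is usually preferable to regular expressions.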

11 years, 10 months

ResourceLoader, now in trunk!

by Trevor Parscal

MediaWiki Developers,

Over the past couple of months, Roan Kattouw and I (Trevor Parscal) have been working on a JavaScript and CSS delivery system called ResourceLoader. We're really excited about this technology, and hope others will be too.

This system has been proving itself to be able to seriously improve front-end performance. Just for starters, we're talking about taking the Vector skin from 35 requests @ 30kB gzipped to 1 request @ 9.4kB gzipped (see http://www.mediawiki.org/wiki/ResourceLoader/Benchmarks ).

We are looking to make this the standard way to deliver JavaScript, CSS, and small images in MediaWiki and on Wikimedia projects, and we're seeking your comments and help.

== Background ==

The goals of the project were to improve front-end performance, reduce the complexity of developing JavaScript libraries and user interfaces, and get the ball rolling on a rewrite/refactoring of all JavaScript and CSS code in MediaWiki.

What's wrong with things as they are now?

* Too many individual requests are being made. All JavaScript, CSS and image resources are being loaded individually, which causes poor performance on the cluster, and users experience the site as being slow.
* We are wasting too much bandwidth. We are sending JavaScript and CSS resources with large amounts of unneeded whitespace and comments.
* We are purging our caches too much. Many user interface changes require purging page caches to take effect, and many assets are unnecessarily being purged from client machines due to the use of a single style version for all assets.
* We are sending people code they don't even use. Lots of JavaScript is being sent to clients whose browsers will crash when it arrives (BlackBerry comes to mind), will not use it at all yet still parse it unnecessarily (older versions of many browsers; parsing is slow on older browsers, especially IE 6), or will not fully utilize it (UsabilityInitiative's plugins.combined.min.js, for instance).
* Internationalization in JavaScript is a mess. Developers are using many different ways -- most of which are not ideal -- to get their translated messages to the client.
* Right-to-left support in CSS is awkward. Stylesheets for right-to-left must be either hand-coded in a separate stylesheet, generated each time a change is made by running CSSJanus, or supplemented by an extra stylesheet that contains a series of overrides.
* There's more! These and other issues were captured in our requirements gathering process (see http://www.mediawiki.org/wiki/ResourceLoader/Requirements ).

What does ResourceLoader do to solve this?

* Combines resources together. Multiple scripts, styles and messages can be delivered in a single request, either at initial page load or dynamically; in both cases dependencies are resolved automatically.
* Allows minification of JavaScript and CSS.
* Dramatically reduces the number of requests for small images. Small images linked to from CSS code can be automatically inlined as data URLs (when the developer marks them with a special comment). This is done as the file is served, without requiring the developer to perform such steps manually.
* Allows deployment of changes to all pages for all users within minutes, without purging any HTML. ResourceLoader provides a short-expiry startup script which decides whether or not to continue loading more JavaScript, and if so has a complete manifest of all scripts and styles on the server and their most recent versions. Also, this startup script will be able to be inlined using ESI (see http://en.wikipedia.org/wiki/Edge_Side_Includes ) when using Squid or Varnish, reducing requests and improving performance even further.
* Provides a standard way to deliver translated messages to the client, bundling them together with the code that uses them.
* Performs automatic left-to-right/right-to-left flipping for CSS files. In most cases the developer won't have to do anything before deploying.
* Does all kinds of other cool tricks, which should soon make everyone's lives better.

What do you want from me?

* Help by porting existing code! While ResourceLoader and traditional methods of adding scripts to MediaWiki output can co-exist, the performance gains of ResourceLoader are directly related to the amount of software utilizing it. There's some more stuff in core that needs to be tweaked to utilize the ResourceLoader system, such as user scripts and site CSS. We also need extensions to start using it, especially those we are deploying on Wikimedia sites or thinking about deploying soon. Only basic documentation exists on how to port extensions, but much more will be written very shortly, and we (Roan and I) will be leading by example by porting the UsabilityInitiative extensions ourselves. If you need help, we're usually on IRC. (See http://www.mediawiki.org/wiki/ResourceLoader/Getting_Started )
* Help writing new code! While wikibits.js is now also known as the "mediawiki.legacy.wikibits" module, the functionality that it and basically all other existing MediaWiki JavaScript code provide is being deprecated in favor of new modules which take advantage of jQuery and can be written using a lot less code, while eliminating the current dependence on a large number of globally accessible variables and functions (see http://www.mediawiki.org/wiki/ResourceLoader/JavaScript_Deprecations ).
* Some patience and understanding... Please... While we are integrating into trunk, things might break unexpectedly. We're diligently tracking down issues and resolving them as fast as we can, but help in this regard is much needed and really appreciated. But most of all, we're sorry if something gets screwed up, and we're trying our best to make this integration smooth.
* Enthusiasm!

Documentation is coming online as fast as we can write it. There's a very detailed design specification document at http://www.mediawiki.org/wiki/ResourceLoader/Design_Specification and more information in general at http://www.mediawiki.org/wiki/ResourceLoader , where we will be adding more and more documentation as time goes on. If you can help with documentation, please feel free to edit boldly - just try not to modify the design specification unless you are also modifying the software :)

While this project has been bootstrapped by Roan and myself in a branch, we're really excited about bringing it to trunk and hope the community can start taking advantage of the new features right away.

Tracking bug for things that ResourceLoader will fix: http://bugzilla.wikimedia.org/show_bug.cgi?id=24415
Bugzilla Component: https://bugzilla.wikimedia.org/buglist.cgi?query_format=advanced&bug_status…

- Trevor (and Roan, who's committing the merge to SVN right now)
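
The data URL inlining described above can be sketched roughly as follows. This is an illustration in Python, not ResourceLoader's implementation: the marker comment text, the function name inline_marked_images, the in-memory files mapping, and the fixed image/png MIME type are all assumptions made for the example.

```python
import base64
import re

# Hypothetical marker; the post only says images are marked "with a special comment".
EMBED_MARKER = "/* @embed */"


def inline_marked_images(css: str, files: dict[str, bytes]) -> str:
    """Replace url(...) references that follow the marker with base64 data URLs."""
    pattern = re.compile(
        re.escape(EMBED_MARKER) + r"\s*(?P<prefix>[^;{}]*?)url\((?P<path>[^)]+)\)"
    )

    def to_data_url(match: re.Match) -> str:
        path = match.group("path").strip("'\" ")
        encoded = base64.b64encode(files[path]).decode("ascii")
        # MIME type is hard-coded for brevity; a real version would detect it.
        return f"{match.group('prefix')}url(data:image/png;base64,{encoded})"

    # Only marked declarations are rewritten; unmarked url(...) values are untouched.
    return pattern.sub(to_data_url, css)
```

Because the image bytes travel inside the stylesheet, the browser makes no separate request for them, which is where the reduction in request count comes from.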

13 years, 5 months

Backups

by Eugene

Hi everyone, Are there any updates regarding Wikipedia's backup systems? I've created a bug to track this at https://bugzilla.wikimedia.org/show_bug.cgi?id=18255. Cheers, Eugene

13 years, 5 months

ResourceLoader Debug Mode

by Trevor Parscal

There seems to be some confusion about how ResourceLoader works, which has been leading people to make commits like r73196 and report bugs like #25362. I would like to offer some clarification.

ResourceLoader, if you aren't already aware, is a new system in MediaWiki 1.17 which allows developers to bundle collections of *resources* (like JavaScript and CSS files, or localized messages) into *modules*. Modules may represent any number of scripts, styles and messages, which are read from the file system, the database, or generated by software.

When a request is made for one or more modules, each resource is packaged together and sent back to the client as a response. The way in which these requests and responses are performed depends on whether debug is on or off.

When debug mode is off:
* Modules are requested in batches
* Resources are combined into modules
* Modules are combined into a response
* The response is minified

When debug mode is on:
* Modules are requested individually
* Resources are combined into modules

I think it's debatable whether debug=true mode goes far enough, since it still combines resources into modules, and I am open to contributions that can make debug=true mode even more debugging friendly by delivering the resources to the client as unchanged as possible.

I also think it's debatable whether debug=false mode goes far enough, since things like Google Closure Compiler have been proven to reduce the size of JavaScript resources even further, so I am also open to contributions which can make debug=false even more production friendly by improving front-end performance.

The commits and bugs that I'm contending here are ones which aim to dilute the optimized nature of debug=false mode, when debug=true mode is really what they should be using or improving. These kinds of changes and suggestions result in software that is optimized neither for debugging nor for production, making the front-end performance of the site in production slower without making it any easier to debug than it would have been by using debug=true.

If you are a developer working on your localhost, you probably want to code with...

    $wgResourceLoaderDebug = true;

... and then test that things work in debug=false mode before committing your code. This will result in more requests but less processing, which will be much faster when developing on localhost.

I hope this helps clarify this situation.

- Trevor
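
The two packaging modes listed above can be modelled with a small Python sketch. This is a toy model, not ResourceLoader code: build_responses and minify are hypothetical names, and the "minifier" only strips whitespace around lines.

```python
def minify(source: str) -> str:
    """Toy minifier: drops blank lines and whitespace around each line."""
    return "".join(line.strip() for line in source.splitlines() if line.strip())


def build_responses(modules: dict[str, list[str]], debug: bool) -> list[str]:
    """Return the response bodies a client would receive for the given modules."""
    # In both modes, resources are combined into their module.
    combined = {name: "\n".join(resources) for name, resources in modules.items()}
    if debug:
        # Debug on: each module is requested individually and left unminified.
        return list(combined.values())
    # Debug off: modules are batched into one response, which is then minified.
    return [minify("\n".join(combined.values()))]
```

With two modules, debug=True yields two readable responses and debug=False yields a single compact one, which matches the trade-off described: more requests but less processing in debug mode.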

13 years, 6 months

using parserTests code for selenium test framework

by Dan Nessett

I have been tasked to evaluate whether we can use the parserTests db code for the selenium framework. I just looked it over and have serious reservations. I would appreciate any comments on the following analysis.

The environment for selenium tests is different from that for parserTests. It is envisioned that multiple concurrent tests could run using the same MW code base. Consequently, each test run must:

+ Use a db that, if written to, will not destroy other test wiki information.
+ Switch in a new images and math directory so any writes do not interfere with other tests.
+ Maintain the integrity of the cache.

Note that tests would *never* run on a production wiki (it may be possible to do so if they do no writes, but safety considerations suggest they should always run on test data, not production data). In fact, production wikis should always retain the setting $wgEnableSelenium = false, to ensure selenium tests are disabled.

Given this background, consider the following (and feel free to comment on it):

parserTests temporary table code: A fixed set of tables is specified in the code. parserTests creates temporary tables with the same names, but using a different static prefix. These tables are used for the parserTests run.

Problems using this approach for selenium tests:

+ Selenium tests on extensions may require use of extension-specific tables, the names of which cannot be enumerated in the code.
+ Concurrent test runs of parserTests are not supported, since the temporary tables have fixed names and therefore concurrent writes to them by parallel test runs would cause interference.
+ Clean up from aborted runs requires dropping fossil tables. But, if a previous run tested an extension with extension-specific tables, there is no way for a test of some other functionality to figure out which tables to drop.

For these reasons, I don't think we can reuse the parserTests code. However, I am open to arguments to the contrary.

-- Dan Nessett
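
To illustrate the requirements listed above (no interference between concurrent runs, and clean-up that does not depend on a hard-coded table list), here is one possible approach sketched in Python. It is not taken from parserTests or the selenium framework: it uses an in-memory SQLite database as a stand-in, and the per-run prefix scheme and all function names are assumptions.

```python
import sqlite3
import uuid


def new_run_prefix() -> str:
    """Return a table prefix that is unique to one test run."""
    return f"test_{uuid.uuid4().hex[:8]}_"


def tables_for_run(conn: sqlite3.Connection, prefix: str) -> list[str]:
    """List every table created under a run's prefix, extension tables included."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return sorted(name for (name,) in rows if name.startswith(prefix))


def drop_run_tables(conn: sqlite3.Connection, prefix: str) -> None:
    """Clean up a finished or aborted run by discovering its tables at drop time."""
    for name in tables_for_run(conn, prefix):
        conn.execute(f'DROP TABLE "{name}"')
```

Because the tables are discovered by prefix rather than listed in the code, tables created by an extension under the same prefix are dropped as well, and two runs with different prefixes never write to the same table.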

13 years, 6 months

Selenium Framework - test run configuration data

by Dan Nessett

Back in June the Selenium Framework had a local configuration file called LocalSeleniumSettings.php. This was eliminated by Tim Starling in a 6/24 commit with the comment that it was an insecure concept. In that commit, new globals were added that controlled test runs.

Last Friday, mah ripped out the globals and put the configuration information into the execute method of RunSeleniumTests.php with the comment "@todo Add an alternative where settings are read from an INI file."

So, it seems we have dueling developers with contrary ideas about the best way to configure selenium framework tests. Should configuration data be exposed as globals or hidden in a local configuration file? Either approach works, but going back and forth makes development of functionality for the Framework difficult. I am working on code, not yet submitted as a patch, that now requires reworking because the way configuration data is referenced has changed.

We need a decision on which of the two approaches to use.

-- Dan Nessett
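
For comparison, the INI-file alternative mentioned in the @todo comment could look roughly like the sketch below. It is written in Python for illustration only; the section name, the setting names (host, port, browser) and their defaults are hypothetical and are not taken from the framework.

```python
import configparser

# Hypothetical settings file contents; none of these names come from the framework.
SAMPLE_INI = """
[selenium]
host = localhost
port = 4444
browser = firefox
"""


def load_selenium_settings(ini_text: str) -> dict:
    """Parse test-run settings from INI text, falling back to defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["selenium"]
    return {
        "host": section.get("host", fallback="localhost"),
        "port": section.getint("port", fallback=4444),
        "browser": section.get("browser", fallback="firefox"),
    }
```

Whichever storage is chosen, routing all access through one loader like this would let test code stay the same if the decision changes again.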

13 years, 6 months

Attention MediaWiki Hackers!

by Chad

(sending to main tech lists, crossposted to Tech Blog[0], feel free to forward anywhere else you'd like)

Greetings MediaWiki hackers!

I am pleased to announce the upcoming MediaWiki Hack-A-Ton in Washington, DC. As you are all aware, every year in April our good friends at Wikimedia Deutschland host the annual "MediaWiki Developers Meetup" in Berlin. At that event, the program is focused on demonstrations, workshops and small group discussions. To complement this, we're planning the DC meetup to be focused solely on hacking, bugfixing and getting down and dirty with the code.

We're scheduling this for October 22nd-24th in Washington, DC. Some of the details haven't been ironed out yet, but will be announced over the coming days as they are. So clear your calendars, and keep your eyes on MediaWiki.org[1] and the mailing lists for more information.

Some travel assistance may be available for those coming a long way. I've also been told there will be swag of some sort for attendees :)

-Chad

[0] http://techblog.wikimedia.org/2010/09/hack-a-ton-dc/
[1] http://www.mediawiki.org/wiki/Hack-A-Ton_DC

13 years, 6 months

ResourceLoader timestamp format

by Trevor Parscal

Early on in the requirements stage of ResourceLoader development we decided to use ISO8601 as the format for representing timestamps in URLs. This was chosen for its legibility, conformance to a standard and ease of generation. However, this was somewhat of an oversight, since the timestamp "1970-01-01T00:00:00Z" gets URL encoded to be "1970-01-01T00%3A00%3A00Z", which leaves something to be desired. Also, generating this format in JavaScript requires sending an extra 220 bytes (minified and compressed).

So, before we seal the deal on using 8601, I would like to collect some ideas about alternatives which would ideally...

* Be legible in a URL
* Conform to a well-defined/well-known standard
* Be easy to generate from a unix timestamp in both PHP and JavaScript

Proposals wanted.

- Trevor
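
The encoding problem is easy to reproduce. The Python sketch below shows the extended ISO 8601 form picking up percent-escapes, and, as one candidate only (not a decision made in this thread), the ISO 8601 basic form, which contains no colons or hyphens. The function names are made up for the example.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def iso8601_extended(unix_ts: int) -> str:
    """Format a unix timestamp as extended ISO 8601, e.g. 1970-01-01T00:00:00Z."""
    return datetime.fromtimestamp(unix_ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def iso8601_basic(unix_ts: int) -> str:
    """Format a unix timestamp as basic ISO 8601, e.g. 19700101T000000Z."""
    return datetime.fromtimestamp(unix_ts, tz=timezone.utc).strftime("%Y%m%dT%H%M%SZ")


# The colons in the extended form are percent-encoded; the basic form is unchanged.
print(quote(iso8601_extended(0)))  # 1970-01-01T00%3A00%3A00Z
print(quote(iso8601_basic(0)))     # 19700101T000000Z
```

The basic form trades some legibility for a URL-safe string, so whether it meets the first criterion in the list is a judgment call.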

13 years, 6 months

Code review for the next little while

by Rob Lanphier

Hi everyone,

You all probably have noticed more people getting involved in code review (and of course saw Brion's mail). This is partly in anticipation of Tim being afk, and partly because we're long overdue for distributing the load. Here's who we have available for code review, and what they'll be focused on:

* Brion - general review, see his mail from earlier this week
* Chad - general review
* Roan - ResourceLoader, API, CentralNotice, UploadWizard
* Trevor - general review, mostly front-end
* Tim - general review
* Mark - general review as available

I'll let each of them elaborate on their areas of focus.

I imagine we will want to give the code review pages on mediawiki.org some love in the coming days and weeks, starting here: http://www.mediawiki.org/wiki/Code_review ...and: http://www.mediawiki.org/wiki/Requests_for_review

We have a number of related pages that potentially need to be merged, reorganized, or deleted. I'll plug away at this, and I'll appreciate any help on this (be bold; we'll revert if we don't like).

Rob

13 years, 6 months

Drafting the upcoming engineering overview

by Rob Lanphier

Hi everyone, As you probably know, we're trying to get into the habit of providing a monthly overview of all WMF-sponsored engineering activity. The September update was posted to the techblog here: http://techblog.wikimedia.org/2010/09/wmf-engineering/ For October, we'd like to draft this in public so as to get the information out a little sooner, and to give you all the opportunity to help out. Here's where we're drafting this: http://www.mediawiki.org/wiki/WMF_Engineering_Overview_October_2010 Here's a very simple way you can help. If you see something on the list that you're interested in, but don't see the status for yet, ping one of us, then be bold and add what you learn to the appropriate wiki page. If you do know the status, by all means add it. Another useful thing to do: you'll notice that many of the project pages that the status post links to are pretty sparse. Same rules apply there. We'd love to get help keeping this up to date. Thanks! Rob

13 years, 6 months
