On Thu, Oct 21, 2010 at 5:18 PM, Neil Kandalgaonkar <neilk@wikimedia.org> wrote:
> The main insight here is that branching is a bad way for a website to manage change. We do not have an install base that's out there in the world, like shrink-wrapped software, where we issue patches on CD. For a website, we control the entire install base.[1]
Of course MediaWiki is a product for third-party use, too. :)
> Doing things the Flickr way entirely would require:
> 1 - A "feature flag" system, for "branching in code". The point is to start developing a new feature with it being turned off by default for most environments and without succumbing to branching and merging misery. In other words, day one of a new feature looks like this:
>     if ( $wgFeature['MyNewThing'] ) {
>         /* ... new code ... */
>     } else {
>         /* ... old code ... */
>     }
Many features in MediaWiki have been developed in *exactly* this way, either hidden behind a configuration switch or encapsulated within an extension which simply isn't enabled until it's ready.
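To make the configuration-switch pattern concrete, here is a minimal sketch. It assumes the hypothetical `$wgFeature` array from the quoted example; `wfFeatureEnabled()` and `renderThing()` are made-up helpers for illustration, not actual MediaWiki functions.

```php
<?php
// Hypothetical defaults: every new feature starts switched off.
$wgFeature = [
    'MyNewThing' => false,
];

// Hypothetical helper (not a real MediaWiki function): unknown or
// unset flags are treated as disabled.
function wfFeatureEnabled( array $features, string $name ): bool {
    return !empty( $features[$name] );
}

// Stand-in for code that has both an old and a new implementation.
function renderThing( array $features ): string {
    if ( wfFeatureEnabled( $features, 'MyNewThing' ) ) {
        return 'new code path';
    }
    return 'old code path';
}

// A single wiki's local settings can opt in without a branch.
// With the + operator, keys from the left-hand array win.
$testWikiFeatures = [ 'MyNewThing' => true ] + $wgFeature;

echo renderThing( $wgFeature ), "\n";        // old code path
echo renderThing( $testWikiFeatures ), "\n"; // new code path
```

Treating a missing flag as "off" means half-finished code stays dormant anywhere that has not explicitly opted in.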
Where this really falls down is where you're refactoring a big subsystem; in some cases we can keep the entire "new" system separate, and move things over bit by bit -- and sometimes we've done exactly that -- but it can be difficult if there are a lot of dependencies that need to be touched because interfaces are changing. (Think of ResourceLoader and its predecessors as an example here; lots of little things had to change just to get it in... but there's still code that uses a lot of old systems just fine and can be cleaned up bit by bit.)
It falls down more moderately when you're simply "fixing" or "enhancing" code, and don't realize that you just introduced some breakage.
> 2 - Every developer with commit access is thinking about deployment onto a cluster of machines all the time. Committing to the repository means you are asserting this will work in production. (This is the hard part for us, I think, but maybe not insurmountable.)
That's exactly what people are supposed to think when committing to MediaWiki trunk, ever since we switched to continuous integration with a quarterly release cycle a few years ago. Breaking trunk is certainly not something you're EVER supposed to do on purpose... but it still happens by accident.
> 3 - One can deploy with a single button press (and there is a system recording what changes were deployed and why, for ops' convenience).
In the olden days we had exactly that:
svn up && scap
The addition of the deployment branch made it a two-step process -- first you run a single SVN command to merge changes down, then you run the command above.
> 4 - When there's trouble, new deploys can be blocked centrally, and then ops can revert to a previous version with a single button press.
That's exactly what the deployment branch was created for -- ensuring that deployed code was in source control meant that you actually *could* return to a previous state.
-- brion