On Thu, Oct 21, 2010 at 5:18 PM, Neil Kandalgaonkar <neilk(a)wikimedia.org> wrote:
The main insight here is that branching is a bad way for a website to
manage change. We do not have an install base that's out there in the
world, like shrink-wrapped software, where we issue patches on CD. For a
website, we control the entire install base.[1]
Of course MediaWiki is a product for third-party use, too. :)
Doing things the Flickr way entirely would require:
1 - A "feature flag" system, for "branching in code". The point is to
start developing a new feature with it being turned off by default for
most environments and without succumbing to branching and merging
misery. In other words, day one of a new feature looks like this:
    if ( $wgFeature['MyNewThing'] ) {
        /* ... new code ... */
    } else {
        /* ... old code ... */
    }
Many features in MediaWiki have been developed in *exactly* this way, either
hidden behind a configuration switch or encapsulated within an extension
which simply isn't enabled until it's ready.
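
A minimal sketch of the configuration-switch pattern described above, for illustration only. The $wgFeature array is taken from the quoted example and is not an actual MediaWiki global; isFeatureEnabled() and renderThing() are hypothetical names invented here. The flag defaults to off, and an unknown flag is treated as disabled so that unfinished code never runs by accident.

```php
<?php
// Hypothetical flag registry, modelled on the $wgFeature array in the
// quoted example. A site would override entries in its local settings.
$wgFeature = [
    'MyNewThing' => false, // new feature ships dark by default
];

// Hypothetical helper: returns true only if the flag exists and is truthy.
// empty() does not raise a notice for a missing key, so unknown flags
// simply count as "off".
function isFeatureEnabled( array $features, string $name ): bool {
    return !empty( $features[$name] );
}

// Hypothetical caller showing the "branching in code" shape: both code
// paths live in trunk, and the flag decides which one runs.
function renderThing( array $features ): string {
    if ( isFeatureEnabled( $features, 'MyNewThing' ) ) {
        return 'new code path';
    }
    return 'old code path';
}

echo renderThing( $wgFeature ), "\n";
```

Passing the flag array in as a parameter, rather than reading a global inside the function, is only a convenience for this sketch: it lets both paths be exercised without touching configuration.
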
Where this really falls down is where you're refactoring a big subsystem; in
some cases we can keep the entire "new" system separate, and move things
over bit by bit -- and sometimes we've done exactly that -- but it can be
difficult if there are a lot of dependencies that need to be touched because
interfaces are changing. (Think of ResourceLoader and its predecessors as an
example here; lots of little things had to change just to get it in... but
there's still code that uses a lot of old systems just fine and can be
cleaned up bit by bit.)
It falls down more moderately when you're simply "fixing" or "enhancing"
code, and don't realize that you just introduced some breakage.
2 - Every developer with commit access is thinking about deployment onto
a cluster of machines all the time. Committing to the repository means
you are asserting this will work in production. (This is the hard part
for us, I think, but maybe not insurmountable).
That's exactly what people are supposed to think when committing to
MediaWiki trunk ever since we switched to the continuous integration w/
quarterly release cycle a few years ago. Breakage in trunk is certainly not
something you're EVER supposed to do on purpose... but it still happens by
accident.
3 - One can deploy with a single button press (and there is a system
recording what changes were deployed and why, for ops' convenience).
In the olden days we had exactly that:
    svn up && scap
Addition of the deployment branch made it a two-step process -- first you
perform a single SVN command to merge changes down, then you do the above
command.
4 - When there's trouble, new deploys can be blocked centrally, and then
ops can revert to a previous version with a single button press.
That's exactly what the deployment branch was created for -- ensuring that
deployed code was in source control meant that you actually *could* return
to a previous state.
-- brion