Hi everyone,
There have been a number of calls to make the release process more predictable (or maybe just faster). There are plenty of examples of projects that have very predictable release schedules, such as the GNOME project or the Ubuntu Linux distribution. It's not at all unreasonable to expect that we could achieve that same level of predictability if we're prepared to make some tradeoffs, such as:
1. Is the release cadence more important (i.e. reverting features if they pose a schedule risk), or is shipping a set of features more important (i.e. slipping the date if one of the predetermined features isn't ready)? For example, as pointed out in another thread + IRC, there was a suggestion for creating a branch point prior to the introduction of the Resource Loader.[1] Is our priority going to be about ensuring a fixed list of features is ready to go, or should we be ruthless about cutting features to make a date, even if there isn't much left on the feature list for that date?
2. Projects with generally predictable schedules also have a process for deciding early in the cycle what is going to be in the release. For example, in Ubuntu's most recently completed release schedule [2], they allotted a little over 23 weeks for development (a little over 5 months). The release team slated a "Feature Definition Freeze" on June 17 (week 7), with what I understand was a pretty high bar for getting new features listed after that, and a feature freeze on August 12 (week 15). Many features originally slated in the feature definition were cut. Right now, we have nothing approaching that level of formality. Should we?
3. How deep is the belief that Wikimedia production deployment must precede a MediaWiki tarball release? Put another way, how tightly are they coupled?
Thoughts on these? Any other tradeoffs we need to consider? We're going to have a number of conversations over the coming days on this topic, so I wanted to add a little structure and get some (more) initial impressions now.
Rob
[1] MZMcBride's mail: http://lists.wikimedia.org/pipermail/wikitech-l/2010-October/049969.html ...which in turn references IRC from 2010-10-18 @ 14:08 or so: http://toolserver.org/~mwbot/logs/%23mediawiki/20101018.txt
[2] Ubuntu Maverick Meerkat (10.10) release schedule: https://wiki.ubuntu.com/MaverickReleaseSchedule
On Thu, Oct 21, 2010 at 7:56 AM, Rob Lanphier robla@wikimedia.org wrote:
- Is the release cadence more important (i.e. reverting features
if they pose a schedule risk), or is shipping a set of features more important (i.e. slipping the date if one of the predetermined features isn't ready)? For example, as pointed out in another thread + IRC, there was a suggestion for creating a branch point prior to the introduction of the Resource Loader.[1] Is our priority going to be about ensuring a fixed list of features is ready to go, or should we be ruthless about cutting features to make a date, even if there isn't much left on the feature list for that date?
I'm afraid that branching before the RL merge is not going to help in the present state of affairs. We have a zillion unreviewed and untested revisions before that point, so maintaining two branches would require us to virtually double our efforts.
- Projects with generally predictable schedules also have a process
for deciding early in the cycle what is going to be in the release. For example, in Ubuntu's most recently completed release schedule [2], they allotted a little over 23 weeks for development (a little over 5 months). The release team slated a "Feature Definition Freeze" on June 17 (week 7), with what I understand was a pretty high bar for getting new features listed after that, and a feature freeze on August 12 (week 15). Many features originally slated in the feature definition were cut. Right now, we have nothing approaching that level of formality. Should we?
Obviously, we're not ready to determine the exact date of the 1.17 release, because we worked on it (and are continuing to do so) without a set date in mind. The question is what we should do to make things more predictable for 1.18. Once we see how well that goes, we can decide how strict we want our schedule to be - IMHO, Ubuntu's way results in buggy releases, as people reported some blatantly stupid regressions in 10.10.
- How deep is the belief that Wikimedia production deployment must
precede a MediaWiki tarball release? Put another way, how tightly are they coupled?
I believe that every developer believes so.
Thoughts on these? Any other tradeoffs we need to consider? We're
going to have a number of conversations over the coming days on this
topic, so I wanted to add a little structure and get some (more)
initial impressions now.
Can these discussions be made accessible to those of us who will not be present? A skypecast would be ideal, but simpler ways would do, including text transcripts.
On Wed, Oct 20, 2010 at 11:56 PM, Rob Lanphier robla@wikimedia.org wrote:
- Is the release cadence more important (i.e. reverting features
if they pose a schedule risk), or is shipping a set of features more important (i.e. slipping the date if one of the predetermined features isn't ready)? For example, as pointed out in another thread + IRC, there was a suggestion for creating a branch point prior to the introduction of the Resource Loader.[1] Is our priority going to be about ensuring a fixed list of features is ready to go, or should we be ruthless about cutting features to make a date, even if there isn't much left on the feature list for that date?
IMO, the best release approach is to set a timeline for branching and then release the branch when it's done. This is basically how the Linux kernel works, for example, and how MediaWiki historically worked up to about 1.15. We'd branch every three months, then give it a while to stabilize before making an RC, then make however many RCs were necessary to stabilize. This gives pretty predictable release schedules in practice (until releases fell by the wayside for us after 1.15 or so), but not anything that we're forced to commit to.
(Actually, Linux differs a lot, because the official repository has a brief merge window followed by a multi-month code freeze, and actual development occurs in dozens of different trees managed by different people on their own schedules. But as far as the release schedule goes, it's "branch on a consistent timeframe and then release when it's ready", with initial branching time-based but release entirely unconstrained. So in that respect it's similar to how we used to do things.)
I don't think it's a good idea to insist on an exact release date, as Ubuntu does, or even to set an exact release date at all. That could force us to release with significant regressions if they come up at the last minute, and I don't see any real benefits to offset that. Does anyone care exactly when MediaWiki is released? If so, why can't they just use RCs? The RC tarball is just as easy to unpack as the release tarball.
I also don't think it makes any sense for us to do feature-based releases. The way that would work is to decide on what features you want in the release, then allocate resources to get those features done in time. But Wikimedia currently doesn't use the releases, it deploys new features continually. So resources will naturally not be targeted at the release date, they'll be targeted for deployment whenever they're done. Wikimedia has no big reason to pay people to rush to complete something in time for a release that it isn't going to use anyway.
Furthermore, even if Wikimedia did use releases -- IIRC, you thought that was a reasonable plan when this came up before -- I still think feature-based releases are a bad idea. It encourages you to either delay releases excessively or ship half-baked features. If you instead say that you'll ship whatever is mature at the time of release, with no commitment to what makes it in, it encourages more focus on correctness and quality. Feature-based releases really only belong in the proprietary software world, where the vendor needs a feature list to encourage people to pay for the new version.
- Projects with generally predictable schedules also have a process
for deciding early in the cycle what is going to be in the release. For example, in Ubuntu's most recently completed release schedule [2], they allotted a little over 23 weeks for development (a little over 5 months). The release team slated a "Feature Definition Freeze" on June 17 (week 7), with what I understand was a pretty high bar for getting new features listed after that, and a feature freeze on August 12 (week 15). Many features originally slated in the feature definition were cut. Right now, we have nothing approaching that level of formality. Should we?
IMO, no. I think it's best to just ship whatever's done when the release branch is made. Processes like the ones Ubuntu or Mozilla have only make sense when the organization paying for development is primarily interested in the actual release, not when the organization is primarily interested in its own use of the product. In the latter case, it makes much more sense to do incremental development and deployment and do releases mostly as an afterthought.
Wikimedia is in an unusual position here, really. Very few sites that pay for in-house code development for their own use then make real open-source releases of it. Either they keep it closed or just throw source over the wall occasionally, or they're interested mostly in getting third parties to use it. I'm not personally familiar with other open-source projects in a similar position to us, although they exist (like StatusNet?). We have to be careful with analogies to software development that's dissimilar in purpose to ours.
- How deep is the belief that Wikimedia production deployment must
precede a MediaWiki tarball release? Put another way, how tightly are they coupled?
IMO, it's essential that Wikimedia get back to incrementally deploying trunk instead of a separate branch. Wikipedia is a great place to test new features, and we're in a uniquely good position to do so, since we wrote the code and can very quickly fix any reported bugs. Wikipedia users are also much more aware of MediaWiki development and much more likely to know who to report bugs to. I think any site that's in a position to use its own software (even if it's closed-source) should deploy it first internally, and if I'm not mistaken, this is actually a very common practice.
Beyond that, this development model also gives volunteers immediate reward for their efforts, in that they can see their new code live within a few days. When a Wikipedia user reports a bug, it's very satisfying to be able to say "Fixed in rXXXXX, you should see the fix within a week". It's just not the same if the fix won't be deployed for months.
On 10/21/10 2:16 PM, Aryeh Gregor wrote:
- How deep is the belief that Wikimedia production deployment must
precede a MediaWiki tarball release? Put another way, how tightly are they coupled?
IMO, it's essential that Wikimedia get back to incrementally deploying trunk instead of a separate branch.
I agree with this very strongly.
I would like to know what (if any) arguments there are for doing a separate deploy branch. It seems to me that we ought to be deploying constantly to the website, and making occasional MediaWiki branch releases. On the Wikimedia projects we want timeliness, and downstream MediaWiki packagers want stability.
For what it's worth, I'm influenced by my former job at Flickr, where the practice was to deploy several times *per day*, directly from trunk. That may be more extreme than we want but be aware there are people who are doing it successfully -- it just takes a few extra development practices.
BTW, I wanted to say this stuff earlier, but I found that I couldn't respond meaningfully to Rob's questions. They were "big questions" asking for a lot of context, and a relative newcomer like me is a bit intimidated by those.
On Thu, Oct 21, 2010 at 6:31 PM, Neil Kandalgaonkar neilk@wikimedia.org wrote:
For what it's worth, I'm influenced by my former job at Flickr, where the practice was to deploy several times *per day*, directly from trunk. That may be more extreme than we want but be aware there are people who are doing it successfully -- it just takes a few extra development practices.
Personally, I think it would be awesome if we could migrate to this level of deployment frequency eventually. I imagine that comprehensive automated test suites are a major part of making this reliable. To the extent you can share any details about how stuff works at Flickr, what long-term changes are necessary for this to be practical?
On Thu, Oct 21, 2010 at 6:35 PM, Platonides Platonides@gmail.com wrote:
However I completely agree with Aryeh on the importance of wmf running almost trunk. The process itself could be automated, eg. a cron job automatically branching from trunk each Tuesday morning, and having the deploy programmed for Thursday. NB: I'm assuming a model where everyone can commit to the branch in the meantime.
We shouldn't branch at all for routine deployments of trunk. Just make sure everything looks good, maybe revert or temporarily disable anything that hasn't seen enough testing, then deploy current trunk. That way we don't have to worry about backporting.
On 10/21/10 4:04 PM, Aryeh Gregor wrote:
On Thu, Oct 21, 2010 at 6:31 PM, Neil Kandalgaonkarneilk@wikimedia.org wrote:
For what it's worth, I'm influenced by my former job at Flickr, where the practice was to deploy several times *per day*, directly from trunk. That may be more extreme than we want but be aware there are people who are doing it successfully -- it just takes a few extra development practices.
Personally, I think it would be awesome if we could migrate to this level of deployment frequency eventually. I imagine that comprehensive automated test suites are a major part of making this reliable.
Nope. Automated tests help a lot with this approach but Flickr doesn't have much better tests than MediaWiki does.
We *should* have better tests, but I would just say that it is not required for us to have a great test suite before doing this.
To the extent you can share any details about how stuff works at Flickr, what long-term changes are necessary for this to be practical?
Flickr engineers have already talked a lot about this in public. See references below.
The main insight here is that branching is a bad way for a website to manage change. We do not have an install base that's out there in the world, like shrink-wrapped software, where we issue patches on CD. For a website, we control the entire install base.[1]
What we need are ways of managing change across our server clusters, and of managing incremental feature and infrastructure upgrades. This leads to "branching in code".
Doing things the Flickr way entirely would require:
1 - A "feature flag" system, for "branching in code". The point is to start developing a new feature with it being turned off by default for most environments and without succumbing to branching and merging misery. In other words, day one of a new feature looks like this:
if ( $wgFeature['MyNewThing'] ) {
    /* ... new code ... */
} else {
    /* ... old code ... */
}
Of course if you're fixing bugs there's no need to hide that behind a feature flag.
2 - Every developer with commit access is thinking about deployment onto a cluster of machines all the time. Committing to the repository means you are asserting this will work in production. (This is the hard part for us, I think, but maybe not insurmountable).
3 - One can deploy with a single button press (and there is a system recording what changes were deployed and why, for ops' convenience).
4 - When there's trouble, new deploys can be blocked centrally, and then ops can revert to a previous version with a single button press.
5 - Developers are good about "cleaning up" code that was previously protected by feature flags once the behaviour is standard. (HINT: this is the part Flickr doesn't talk about in public... but as an open source project with more visible dirty laundry, perhaps we can do better.)
This system does result in more "oops" moments. But the point is to make those easy to recover from, and to have a culture where people aren't blamed too much for them. The goal is not to build a system that tries to ensure deploy branches are tested to near perfection. The real problems are always things that nobody anticipated anyway.
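To make point 1 above a bit more concrete: the flag system could be as small as a config array plus a one-line helper. This is only a rough sketch; the $wgFeature array and the wfFeatureEnabled() helper are names I'm making up for illustration, not an existing MediaWiki API.

    // Somewhere in configuration: flags default to off for most environments.
    $wgFeature = array(
        'MyNewThing' => false, // flip to true per-wiki or per-cluster when ready
    );

    // A tiny helper so call sites stay readable.
    function wfFeatureEnabled( $name ) {
        global $wgFeature;
        return !empty( $wgFeature[$name] );
    }

    // Day one of a new feature:
    if ( wfFeatureEnabled( 'MyNewThing' ) ) {
        // ... new code path ...
    } else {
        // ... old code path ...
    }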
NOTES
[1] I am for the purposes of the argument ignoring MediaWiki as a deliverable and only thinking about project websites.
REFERENCES
Here's the most concise presentation: "Always Ship Trunk: Managing Change In Complex Websites" by Paul Hammond http://www.paulhammond.org/2010/06/trunk/alwaysshiptrunk.pdf
And a longer talk about all this from Paul Hammond and John Allspaw 10+ Deploys Per Day: Dev/Ops Cooperation at Flickr http://velocityconference.blip.tv/file/2284377/
Blog post about the Feature Flag system by Ross Harmes "Flipping out" http://code.flickr.com/blog/2009/12/02/flipping-out/
On Thu, Oct 21, 2010 at 5:18 PM, Neil Kandalgaonkar neilk@wikimedia.orgwrote:
The main insight here is that branching is a bad way for a website to manage change. We do not have an install base that's out there in the world, like shrink-wrapped software, where we issue patches on CD. For a website, we control the entire install base.[1]
Of course MediaWiki is a product for third-party use, too. :)
Doing things the Flickr way entirely would require:
1 - A "feature flag" system, for "branching in code". The point is to start developing a new feature with it being turned off by default for most environments and without succumbing to branching and merging misery. In other words, day one of a new feature looks like this:
if ( $wgFeature['MyNewThing'] ) { /* ... new code ... */ } else { /* ... old code ... */ }
Many features in MediaWiki have been developed in *exactly* this way, either hidden behind a configuration switch or encapsulated within an extension which simply isn't enabled until it's ready.
Where this really falls down is where you're refactoring a big subsystem; in some cases we can keep the entire "new" system separate, and move things over bit by bit -- and sometimes we've done exactly that -- but it can be difficult if there are a lot of dependencies that need to be touched because interfaces are changing. (Think of ResourceLoader and its predecessors as an example here; lots of little things had to change just to get it in... but there's still code that uses a lot of old systems just fine and can be cleaned up bit by bit.)
It falls down more moderately when you're simply "fixing" or "enhancing" code, and don't realize that you just introduced some breakage.
2 - Every developer with commit access is thinking about deployment onto a cluster of machines all the time. Committing to the repository means you are asserting this will work in production. (This is the hard part for us, I think, but maybe not insurmountable).
That's exactly what people are supposed to think when committing to MediaWiki trunk ever since we switched to the continuous integration w/ quarterly release cycle a few years ago. Breakage in trunk is certainly not something you're EVER supposed to do on purpose... but it still happens by accident.
3 - One can deploy with a single button press (and there is a system
recording what changes were deployed and why, for ops' convenience).
In the olden days we had exactly that:
svn up && scap
The addition of the deployment branch made it a two-step process -- first you perform a single SVN command to merge changes down, then you do the above command.
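Roughly like this (the revision number and branch path are only illustrative, not the exact commands we run):

    # step 1: in a wmf-deployment working copy, merge the wanted change down and commit it
    svn merge -c 12345 ^/trunk/phase3 . && svn commit -m "Merge r12345 from trunk"

    # step 2: on the cluster, same as before
    svn up && scap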
4 - When there's trouble, new deploys can be blocked centrally, and then ops can revert to a previous version with a single button press.
That's exactly what the deployment branch was created for -- ensuring that deployed code was in source control meant that you actually *could* return to a previous state.
-- brion
On Fri, Oct 22, 2010 at 2:18 AM, Neil Kandalgaonkar neilk@wikimedia.org wrote:
if ( $wgFeature['MyNewThing'] ) { /* ... new code ... */ } else { /* ... old code ... */ }
The Aryeh method :) I must say I like it a lot.
On Thu, Oct 21, 2010 at 8:18 PM, Neil Kandalgaonkar neilk@wikimedia.org wrote:
Nope. Automated tests help a lot with this approach but Flickr doesn't have much better tests than MediaWiki does.
We *should* have better tests, but I would just say that it is not required for us to have a great test suite before doing this.
Interesting.
Doing things the Flickr way entirely would require:
1 - A "feature flag" system, for "branching in code". The point is to start developing a new feature with it being turned off by default for most environments and without succumbing to branching and merging misery. In other words, day one of a new feature looks like this:
if ( $wgFeature['MyNewThing'] ) { /* ... new code ... */ } else { /* ... old code ... */ }
Of course if you're fixing bugs there's no need to hide that behind a feature flag.
I always do this anyway, personally (as apparently Bryan noticed). Sometimes it can get cumbersome and hard to maintain, but thankfully it means I've never had to learn how to use SVN branches. :)
2 - Every developer with commit access is thinking about deployment onto a cluster of machines all the time. Committing to the repository means you are asserting this will work in production. (This is the hard part for us, I think, but maybe not insurmountable).
This was true when we had regular deployments too. Anything that wasn't ready for production yet would just get reverted. TranslateWiki still runs on trunk, and its developers vigorously prod anyone who breaks trunk. So I think we're okay on this score.
3 - One can deploy with a single button press (and there is a system recording what changes were deployed and why, for ops' convenience).
We have something like this . . .
4 - When there's trouble, new deploys can be blocked centrally, and then ops can revert to a previous version with a single button press.
. . . not this, I don't think, but doesn't sound too hard.
5 - Developers are good about "cleaning up" code that was previously protected by feature flags once the behaviour is standard. (HINT: this is the part Flickr doesn't talk about in public... but as an open source project with more visible dirty laundry, perhaps we can do better.)
This doesn't seem like it should be hard.
Anyway, it's something to think about once we get review and deployment caught up. Maybe we should do daily deployment instead of weekly, or even multiple-times-daily.
On Thu, Oct 21, 2010 at 3:31 PM, Neil Kandalgaonkar neilk@wikimedia.orgwrote:
On 10/21/10 2:16 PM, Aryeh Gregor wrote:
- How deep is the belief that Wikimedia production deployment must
precede a MediaWiki tarball release? Put another way, how tightly are they coupled?
IMO, it's essential that Wikimedia get back to incrementally deploying trunk instead of a separate branch.
I agree with this very strongly.
I would like to know what (if any) arguments there are for doing a separate deploy branch. It seems to me that we ought to be deploying constantly to the website, and making occasional MediaWiki branch releases. On the projects we want timeliness, and downstream MediaWiki packagers want stability,
Original announcement thread from July 2009: http://www.mail-archive.com/wikitech-l@lists.wikimedia.org/msg03903.html
The original purpose of having a deployment branch was so that we actually knew what we were running! :) Even with fairly regular deployments from trunk, we had two big problems:
1) "live hacks" -- little tweaks, patches, and one-off hacks in the live code to work around temporary problems. These would accumulate over time and eventually we'd end up with surprise merging problems at deployment time, or just forgetting to merge important bits of code back into trunk... sometimes hacks that should have been kept got even accidentally removed when a new version got pulled in!
Knowing that what was in deployment was *exactly* what was in SVN means that we know a) where changes came from and b) when and by whom they were committed, and c) it lets folks easily see the difference between trunk and deployment and make sure that important work is in fact merged back.
In theory, live hacks are punishable by eternal torture in the bowels of SVN branching. In practice, they'll happen as long as it's _possible_ to deploy code that's not in SVN.
2) Temporary breakages on trunk right in the middle of an important quick fix
If we don't do those one-off fixes, workarounds, and debugging hacks as live hacks, the alternative without a deployment branch is to actually do them *on* trunk. That means that when you want to slap in a one-line tweak to fix or debug something, you *also* have to deploy the last few days' worth of trunk changes.
Hopefully there are no regressions or incompatible changes. Right? Right? :)
But ultimately, wmf-deployment was never intended to diverge from trunk by more than a couple weeks in regular usage; I was aiming for a weekly or biweekly deployment schedule.
With the sort of backlog we've developed during the long slog of stabilizing the new JavaScript layers, they've ended up HUGEly divergent, which is very unpleasant -- especially with SVN's primitive merging systems.
For what it's worth, I'm influenced by my former job at Flickr, where
the practice was to deploy several times *per day*, directly from trunk. That may be more extreme than we want but be aware there are people who are doing it successfully -- it just takes a few extra development practices.
If in-progress work were done on branches and merged to trunk when stable, that would be a grand way to go. That's a big pain in the butt with SVN's branching, unfortunately, but much easier with git.
-- brion
On 10/21/10 4:23 PM, Brion Vibber wrote:
The original purpose of having a deployment branch was so that we actually knew what we were running! :) Even with fairly regular deployments from trunk, we had two big problems:
- "live hacks" -- little tweaks, patches, and one-off hacks in the live
code to work around temporary problems.
In theory, live hacks are punishable by eternal torture in the bowels of SVN branching. In practice, they'll happen as long as it's _possible_ to deploy code that's not in SVN.
I feel that this has to be a symptom of some other problem. What sort of things go into "live hacks"?
If they are about rapidly reconfiguring, rolling back, or turning off features, I think that's better answered by having an explicit system to do such a thing (see my other post in this thread about Flickr's system).
- Temporary breakages on trunk right in the middle of an important quick
fix
If we don't do those one-off fixes, workarounds, and debugging hacks as live hacks, the alternative without a deployment branch is to actually do them *on* trunk. That means that when you want to slap in a one-line tweak to fix or debug something, you *also* have to deploy the last few days' worth of trunk changes.
Yes, definitely a problem. In the Flickr world, you're never more than a few hours off of trunk anyway; but we're not in that world, so we start to feel the need for a deploy branch.
On Thu, Oct 21, 2010 at 5:28 PM, Neil Kandalgaonkar neilk@wikimedia.orgwrote:
I feel that this has to be a symptom of some other problem. What sort of things go into "live hacks"?
If they are about rapidly reconfiguring, rolling back, or turning off features, I think that's better answered by having an explicit system to do such a thing (see my other post in this thread about Flickr's system).
Primarily:
1) debug logging statements to provide additional information on problems seen in production that can't yet be reproduced offline
2) temporary performance hacks to disable individual code paths in particular circumstances (say, the caching bug that caused serious cache contention on the 'Michael Jackson' article one day) -- these are usually not "features" but more like "this chunk of processing for this feature when used in a very particular way on this one article"
3) horrible, horrible temporary hacks to block particularly unpleasant actions or make exceptions for something that other code doesn't yet allow.
These are usually done live because whatever you're reacting to is live -- the code is part of a production debugging session.
Debug logging hacks usually are discardable immediately. Performance hacks usually need to be maintained or replaced with better code -- these are the ones we had to worry about not accidentally losing by replacing the live deployment with code from trunk. :) Temporary hacks to disable or enable things or help catch vandalism are sort of an in-between space.
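A typical debug-logging live hack is a throwaway one-liner along these lines (the log group name is made up for illustration, and $title is whatever Title object is in scope at that point in the code):

    // temporary: figure out which pages are hitting this slow path in production
    wfDebugLog( 'slowpath-debug', __METHOD__ . ': slow path hit for ' . $title->getPrefixedDBkey() );

...and it gets deleted again as soon as the problem is understood.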
-- brion
Aryeh Gregor wrote:
On Wed, Oct 20, 2010 at 11:56 PM, Rob Lanphier robla@wikimedia.org wrote:
- Is the release cadence more important (i.e. reverting features
if they pose a schedule risk), or is shipping a set of features more important (i.e. slipping the date if one of the predetermined features isn't ready)? For example, as pointed out in another thread + IRC, there was a suggestion for creating a branch point prior to the introduction of the Resource Loader.[1] Is our priority going to be about ensuring a fixed list of features is ready to go, or should we be ruthless about cutting features to make a date, even if there isn't much left on the feature list for that date?
IMO, the best release approach is to set a timeline for branching and then release the branch when it's done. This is basically how the Linux kernel works, for example, and how MediaWiki historically worked up to about 1.15. We'd branch every three months, then give it a while to stabilize before making an RC, then make however many RCs were necessary to stabilize. This gives pretty predictable release schedules in practice (until releases fell by the wayside for us after 1.15 or so), but not anything that we're forced to commit to.
(Actually, Linux differs a lot, because the official repository has a brief merge window followed by a multi-month code freeze, and actual development occurs in dozens of different trees managed by different people on their own schedules. But as far as the release schedule goes, it's "branch on a consistent timeframe and then release when it's ready", with initial branching time-based but release entirely unconstrained. So in that respect it's similar to how we used to do things.)
I don't think it's a good idea to insist on an exact release date, as Ubuntu does, or even to set an exact release date at all.
+1. Fuzzy dates are good, but setting a fixed date is not. This doesn't mean that WMF should be lazy in allocating resources for the release, though.
Does anyone care exactly when MediaWiki is released? If so, why can't they just use RCs? The RC tarball is just as easy to unpack as the release tarball.
Because RCs have that unstable feeling, so many people end up not testing them, which makes wmf deploys much more important.
I also don't think it makes any sense for us to do feature-based releases. The way that would work is to decide on what features you want in the release, then allocate resources to get those features done in time.
We have had too many chaotic releases. I don't think we should aim to delay releases for features for now. It's fine to plan a set of features, or to tweak the dates a bit to stabilize some feature / release before a branch merge.
Plus, we don't have such big features missing. A normal release will be just a lot of small fixes and tiny new features. We have a number of big features for 1.17, but that's an anomaly (and due to the delay).
- How deep is the belief that Wikimedia production deployment must
precede a MediaWiki tarball release? Put another way, how tightly are they coupled?
IMO, it's essential that Wikimedia get back to incrementally deploying trunk instead of a separate branch. Wikipedia is a great place to test new features, and we're in a uniquely good position to do so, since we wrote the code and can very quickly fix any reported bugs. Wikipedia users are also much more aware of MediaWiki development and much more likely to know who to report bugs to. I think any site that's in a position to use its own software (even if it's closed-source) should deploy it first internally, and if I'm not mistaken, this is actually a very common practice.
I consider that very important, especially for a big release such as the upcoming one. A WMF deployment will get it more testing in a few weeks than many months of use by normal third-party users would (especially in terms of feedback).
I don't oppose having a wmf branch. It comes from the admission that there are live patches, and having a branch actually documents them and allows us to see what is really deployed. However I completely agree with Aryeh on the importance of wmf running almost trunk. The process itself could be automated, eg. a cron job automatically branching from trunk each Tuesday morning, and having the deploy programmed for Thursday. NB: I'm assuming a model where everyone can commit to the branch in the meantime.
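For the branching half, the cron-driven script could be as simple as something like this (the repository URL and branch naming are only an example):

    #!/bin/sh
    # weekly-branch.sh -- run from cron every Tuesday morning
    REPO=https://svn.wikimedia.org/svnroot/mediawiki
    svn copy "$REPO/trunk/phase3" \
        "$REPO/branches/wmf-deploy-$(date +%Y%m%d)" \
        -m "Automatic weekly deployment branch"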
Beyond that, this development model also gives volunteers immediate reward for their efforts, in that they can see their new code live within a few days. When a Wikipedia user reports a bug, it's very satisfying to be able to say "Fixed in rXXXXX, you should see the fix within a week". It's just not the same if the fix won't be deployed for months.
+10.