Hi everyone,
Here's where we are with the 1.18 code review and deployment.

Short version:
* Het Deploy almost done
* 482 revs for code review. September 16 completion?
* Question: do we have to wait until code review is done before deploying to test2?
* Question: could Roan, Tim, Brion and others in Haifa get together and decide a reasonable target date for deployment? (probably late September sometime)
Long version: With code review, many WMF employees have been traveling to conferences (first OSCON, then Wikimania), so that's put a big hole in our capacity, which shows on the graph: http://toolserver.org/~robla/crstats/crstats.118all.html
I've run some very rough numbers over the past month to get a sense of how fast we're reviewing. Here are some different ways of looking at it:
* Worst 7 day performance over the past month: 30 revs/week (4/day)
* Best 7 day performance: 320 revs/week (46/day)
* Median 7 day performance from 6/29 to today: 114 revs/week (15/day)
* Past 30 days: 367 revs (12/day)
As of midnight UTC on August 2, we had 482 revisions to review. Assuming we stay on a slow pace through Wikimania (4/day), then pick up to 12/day starting August 8, that means we'll get through the queue around September 16. If we do a little better (15/day), that's September 8. If we stay at 4/day, it gets grim (December 1). And, just for completeness, if we miraculously pick up to 46/day, that's August 19. Friday, September 16 seems reasonably aggressive while being realistic for a finish date on that activity.
Note that doesn't include revisions marked "fixme". For that list, you can look here: http://www.mediawiki.org/wiki/MediaWiki_roadmap/1.18/Revision_report#Fixmes
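For anyone who wants to check the queue projection above, here is a rough sketch of the arithmetic. It assumes calendar days (weekends included), a constant rate within each phase, and no new revisions entering the queue; the function and variable names are made up for illustration and are not from any existing script.

```python
from datetime import date, timedelta
from math import ceil


def projected_finish(backlog, start, pickup, slow_rate, later_rate):
    """Date the review queue empties under a two-phase rate.

    Assumption: `slow_rate` revs/day from `start` until `pickup`,
    then `later_rate` revs/day, counting every calendar day.
    """
    slow_days = (pickup - start).days
    done_in_slow_phase = slow_rate * slow_days
    if done_in_slow_phase >= backlog:
        # The queue empties before the pace ever picks up.
        return start + timedelta(days=ceil(backlog / slow_rate))
    remaining = backlog - done_in_slow_phase
    return pickup + timedelta(days=ceil(remaining / later_rate))


BACKLOG = 482              # revisions to review as of midnight UTC, August 2
START = date(2011, 8, 2)
PICKUP = date(2011, 8, 8)  # assumed end of the slow Wikimania stretch
SLOW_RATE = 4              # revs/day through Wikimania

# Later-phase rates taken from the figures quoted in the mail.
projections = {
    rate: projected_finish(BACKLOG, START, PICKUP, SLOW_RATE, rate)
    for rate in (4, 12, 15, 46)
}

for rate, finish in projections.items():
    print(f"{rate:>2}/day after Wikimania: {finish:%A, %B %d, %Y}")
```

Under these assumptions the sketch lands on September 16 for 12/day, September 8 for 15/day, and December 1 for 4/day, matching the dates above. For 46/day it gives August 18, a day earlier than the August 19 quoted, so the original estimate presumably rounded slightly differently.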
On the deployment side, Aaron Schulz has taken over work on Heterogeneous Deploy: http://www.mediawiki.org/wiki/Heterogeneous_deployment
The big development tasks (that we know of) are done. The work Aaron has been doing is deploying bits and pieces at a time, then fixing the bits that break, then deploying more. We had hoped we could get a test instance of 1.18 deployed before Wikimania, but we hit a few snags in finishing up. There are a few bits of deployment lore that Aaron still needs to learn about in order to finish things off, but assuming he's able to do that and get knowledgeable review of some of the riskier code, he should be able to have the Het Deploy infrastructure in place by the end of next week.
The cool thing about Het Deploy is that, once completed, we should be able to deploy MediaWiki 1.18 to a much more production-like test wiki than we've been able to use in the past (e.g. prototype.wikimedia.org), and have it in test for much longer than our current "test.wikimedia.org" site. We'll be able to deploy to "test.wikimedia.org", "test2.wikimedia.org", or both prior to releasing to the rest of the cluster. Even better, we'll be able to work with a few pioneering wikis out there to deploy 1.18 to those sites early, then gradually roll it out to a wider audience after we've worked out the kinks with the initial deploys.
There is still some reasonable hesitation in deploying 1.18 even to test or test2 prior to fully reviewing the code. I haven't talked to all of the major production cluster guardians about this, but there is a school of thought that suggests that perhaps we can get away with a lighter skimming prior to deploying everywhere. If it is safe enough to get something out to test2 prior to full review being done, I'd love to do it. Question for those with deployment access: what are your thoughts?
I spoke with Roan before he left San Francisco, and (I think) he agreed to wrangle folks in Haifa to decide on a date, in part based on the data that I just presented. So, I'm hoping he'll get a chance to do that while he's there. We should plan to have 1.18 running reasonably well on test2 for at least 3-4 days before we start deploying to actual production wikis, and we should hold off on deploying to the big wikis (enwiki, dewiki, etc) for another 3-4 days. So, we should probably build at least a week and probably two into the schedule for that. At the risk of biasing everyone, here's my suggestion:
* Monday, September 19: deploy to test2
* Thursday, September 22: deploy to a few lower traffic pilot wikis
* Monday, September 26: deploy everywhere
That's pretty aggressive given that we're almost certainly going to find bugs during the test2 deployment. I'd feel a lot more confident about a full September 26 rollout if we can get something pushed to test2 sooner than September 19.
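As a quick sanity check on the proposed dates, the sketch below measures the gap between each stage and compares it with the lower end of the "3-4 days" soak time mentioned above. The stage labels and the `MIN_SOAK_DAYS` threshold are my own shorthand for this illustration, not part of any deployment tooling.

```python
from datetime import date

MIN_SOAK_DAYS = 3  # assumption: lower end of the "3-4 days" soak period

# Proposed rollout stages, in order.
stages = [
    ("test2", date(2011, 9, 19)),
    ("pilot wikis", date(2011, 9, 22)),
    ("everywhere", date(2011, 9, 26)),
]


def soak_gaps(stages):
    """Return (earlier stage, later stage, days between) for adjacent stages."""
    return [
        (earlier[0], later[0], (later[1] - earlier[1]).days)
        for earlier, later in zip(stages, stages[1:])
    ]


gaps = soak_gaps(stages)
for earlier, later, days in gaps:
    verdict = "ok" if days >= MIN_SOAK_DAYS else "too short"
    print(f"{earlier} -> {later}: {days} days ({verdict})")
```

The gaps come out to 3 and 4 days, so the proposal sits right at the minimum soak time with no slack for bugs found on test2.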
Thoughts?

Rob
On Wed, Aug 3, 2011 at 2:54 AM, Rob Lanphier robla@wikimedia.org wrote:
- 482 revs for code review. September 16 completion?
Given your data-based approach, that sounds good to me. Also consider that het deploy will be 'done' at some point and Aaron will have more time to spend on review. Besides that, I will have fewer meetings once I start working from home again, and Tim will get back in mid-August (he's been out for most of the period you're analyzing).
- Question: do we have to wait until code review is done before deploying to test2?
YES. It runs on the production servers. We may relax our standards a little bit (e.g. we don't really need fixme resolution for not-very-critical issues that aren't related to security or performance), but everything still has to at least have been reviewed for security and other do-we-dare-run-this-on-the-cluster issues.
- Question: could Roan, Tim, Brion and others in Haifa get together and decide a reasonable target date for deployment? (probably late September sometime)
Sure.
Roan
On Wed, Aug 3, 2011 at 3:54 AM, Rob Lanphier robla@wikimedia.org wrote:
So, we should probably build at least a week and probably two into the schedule for that. At the risk of biasing everyone, here's my suggestion:
- Monday, September 19: deploy to test2
- Thursday, September 22: deploy to a few lower traffic pilot wikis
- Monday, September 26: deploy everywhere
Sounds good -- can we get a firm commitment that this is our schedule?
-- brion
On Wed, Aug 3, 2011 at 8:47 AM, Brion Vibber brion@wikimedia.org wrote:
Sounds good -- can we get a firm commitment that this is our schedule?
I think RobLa wants the reverse: a firm commitment from *us* that this is our schedule. It looks good to me too.
Roan
On Tue, Aug 2, 2011 at 11:55 PM, Roan Kattouw roan.kattouw@gmail.com wrote:
On Wed, Aug 3, 2011 at 8:47 AM, Brion Vibber brion@wikimedia.org wrote:
Sounds good -- can we get a firm commitment that this is our schedule?
I think RobLa wants the reverse: a firm commitment from *us* that this is our schedule. It looks good to me too.
Yup, that's basically the idea. What I outlined is conservative in some respects (I suspect many people probably still think we could buckle down and get something out by end of August), but I haven't exactly exhaustively planned for all contingencies. We've got a lot of unaccounted-for fixmes, and there are probably some blocking bugs lurking in the database. Also, there's not really a plan for formal testing, and not a lot of room in that timeline to make it happen.
In short, there is much that could go wrong with this release, but I'm guessing (rather blindly) there's just enough time to make it work if we're committed to making this happen. I'm willing to do my part. Everyone else willing to do theirs?
Rob
wikitech-l@lists.wikimedia.org