Hi everyone,
Here's where we are with the 1.18 code review and deployment.

Short version:
* Het Deploy almost done
* 482 revs for code review. September 16 completion?
* Question: do we have to wait until code review is done before deploying to test2?
* Question: could Roan, Tim, Brion and others in Haifa get together and decide a reasonable target date for deployment? (probably late September sometime)
Long version:

On the code review side, many WMF employees have been traveling to conferences (first OSCON, then Wikimania), which has put a big hole in our capacity. It shows on the graph: http://toolserver.org/~robla/crstats/crstats.118all.html
I've run some very rough numbers over the past month to get a sense of how fast we're reviewing. Here are some different ways of looking at it:
* Worst 7 day performance over the past month: 30 revs/week (4/day)
* Best 7 day performance: 320 revs/week (46/day)
* Median 7 day performance from 6/29 to today: 114 revs/week (15/day)
* Past 30 days: 367 revs (12/day)
As of midnight UTC on August 2, we had 482 revisions to review. Assuming we stay on a slow pace through Wikimania (4/day), then pick up to 12/day starting August 8, that means we'll get through the queue around September 16. If we do a little better (15/day), that's September 8. If we stay at 4/day, it gets grim (December 1). And, just for completeness, if we miraculously pick up to 46/day, that's August 19. Friday, September 16 seems reasonably aggressive while being realistic for a finish date on that activity.
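The projection above can be reproduced with a few lines of date arithmetic. This is only a rough sketch under stated assumptions: the year is taken to be 2011, the slow pace is assumed to cover August 2 up to (not including) August 8, partial days are rounded up, and `projected_finish` is a hypothetical helper name, not anything from the crstats tooling.

```python
from datetime import date, timedelta
from math import ceil

def projected_finish(backlog, start, pickup, slow_rate, fast_rate):
    """Estimate the date the review queue empties.

    Assumes `slow_rate` revs/day from `start` until `pickup`,
    then `fast_rate` revs/day afterwards, rounding partial days up.
    """
    slow_days = (pickup - start).days
    reviewed_slow = slow_rate * slow_days
    if reviewed_slow >= backlog:
        # Queue empties before the pace ever picks up.
        return start + timedelta(days=ceil(backlog / slow_rate))
    remaining = backlog - reviewed_slow
    return pickup + timedelta(days=ceil(remaining / fast_rate))

START = date(2011, 8, 2)   # assumed year; backlog snapshot at midnight UTC
PICKUP = date(2011, 8, 8)  # pace assumed to pick up after Wikimania
BACKLOG = 482

for rate in (4, 12, 15, 46):
    finish = projected_finish(BACKLOG, START, PICKUP, slow_rate=4, fast_rate=rate)
    print(f"{rate:>2}/day after August 8 -> {finish:%B %d}")
```

Under these assumptions the 12/day, 15/day and 4/day cases come out on September 16, September 8 and December 1. The 46/day case lands one day earlier than the August 19 figure quoted above, which suggests the original numbers used slightly different rounding or boundary days.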
Note that this doesn't include revisions marked "fixme". For that list, you can look here: http://www.mediawiki.org/wiki/MediaWiki_roadmap/1.18/Revision_report#Fixmes
On the deployment side, Aaron Schulz has taken over work on Heterogeneous Deploy: http://www.mediawiki.org/wiki/Heterogeneous_deployment
The big development tasks (that we know of) are done. The work Aaron has been doing is deploying bits and pieces at a time, then fixing the bits that break, then deploying more. We had hoped we could get a test instance of 1.18 deployed before Wikimania, but we hit a few snags in finishing up. There are a few bits of deployment lore that Aaron still needs to learn about in order to finish things off, but assuming he's able to do that and get knowledgeable review of some of the riskier code, he should be able to have the Het Deploy infrastructure in place by the end of next week.
The cool thing about Het Deploy is that, once completed, we should be able to deploy MediaWiki 1.18 to a much more production-like test wiki than we've been able to use in the past (e.g. prototype.wikimedia.org), and have it in test for much longer than our current "test.wikimedia.org" site. We'll be able to deploy to "test.wikimedia.org", "test2.wikimedia.org", or both prior to releasing to the rest of the cluster. Even better, we'll be able to work with a few pioneering wikis out there to deploy 1.18 to those sites early, then gradually roll it out to a wider audience after we've worked out the kinks with the initial deploys.
There is still some reasonable hesitation in deploying 1.18 even to test or test2 prior to fully reviewing the code. I haven't talked to all of the major production cluster guardians about this, but there is a school of thought that suggests that perhaps we can get away with a lighter skimming prior to deploying everywhere. If it is safe enough to get something out to test2 prior to full review being done, I'd love to do it. Question for those with deployment access: what are your thoughts?
I spoke with Roan before he left San Francisco, and (I think) he agreed to wrangle folks in Haifa to decide on a date, in part based on the data that I just presented. So, I'm hoping he'll get a chance to do that while he's there.

We should plan to have 1.18 running reasonably well on test2 for at least 3-4 days before we start deploying to actual production wikis, and we should hold off on deploying to the big wikis (enwiki, dewiki, etc) for another 3-4 days. So, we should probably build at least a week and probably two into the schedule for that. At the risk of biasing everyone, here's my suggestion:
* Monday, September 19: deploy to test2
* Thursday, September 22: deploy to a few lower traffic pilot wikis
* Monday, September 26: deploy everywhere
That's pretty aggressive given that we're almost certainly going to find bugs during the test2 deployment. I'd feel a lot more confident about a full September 26 rollout if we can get something pushed to test2 sooner than September 19.
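The suggested dates can be sanity-checked against the "3-4 days per stage" rule of thumb with a short sketch. The year 2011 is an assumption, and `soak_gaps` is a hypothetical helper name used only for illustration.

```python
from datetime import date

# Proposed rollout steps; the year is assumed to be 2011.
schedule = [
    ("deploy to test2", date(2011, 9, 19)),
    ("deploy to a few lower traffic pilot wikis", date(2011, 9, 22)),
    ("deploy everywhere", date(2011, 9, 26)),
]

def soak_gaps(steps):
    """Return the number of days between each consecutive pair of steps."""
    return [(later[1] - earlier[1]).days for earlier, later in zip(steps, steps[1:])]

gaps = soak_gaps(schedule)
for (label, day), gap in zip(schedule[1:], gaps):
    print(f"{day:%B %d}: {label} ({gap} days after the previous step)")

# Each stage should get at least 3 days of soak time.
print("meets the 3-day minimum:", all(gap >= 3 for gap in gaps))
```

The gaps come out to 3 and 4 days, so the proposal sits at the low end of the soak window with no slack for bugs found on test2. Moving the test2 date earlier widens the first gap without touching the September 26 target.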
Thoughts?

Rob