Hi everyone,
In case you missed it on the techblog, here's an update on the revised deployment plan for 1.17, part 1 of which starts in 7 hours: http://techblog.wikimedia.org/2011/02/1-17deployment-attempt2/
Also copied below.
Rob
------
As covered on this blog this week, we had a few problems with our initial deployment of 1.17 to the Wikimedia cluster of servers. We’ve investigated the problems, and believe we have fixed many of the issues. Some of the unsolved issues are complicated enough that the only timely and reasonable way to investigate them is to deploy and react, so we’ve come up with a plan that lets us do it in a safe way by deploying on just a few wikis at a time (as opposed to all at once, as we tried earlier).
We’re scheduling two deployment windows:
First window – This wave will be deployed between Friday, February 11, 6:00 UTC – 12:00 UTC (10pm PST Thursday, February 10 in San Francisco). This first wave will be to a limited set of wikis (see below).
Second window – Wednesday, February 16 (between 6:00 UTC – 12:00 UTC) – full deployment (tentative)
Repeating what is new about 1.17: There are many, many little fixes and improvements (see the draft release notes for an exhaustive list), as well as one larger improvement: Resource Loader. Read more in the previous 1.17 deployment announcement.
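For readers who want to check the two windows against their own clocks, here is a minimal sketch of the conversion. The fixed UTC-8 offset stands in for San Francisco time in February (an assumption that holds only while daylight saving time is not in effect), and the helper name `window` is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PST as a fixed UTC-8 offset, valid in February (no DST).
PST = timezone(timedelta(hours=-8), "PST")


def window(year, month, day):
    """Return the (start, end) of a 6:00-12:00 UTC deployment window."""
    start = datetime(year, month, day, 6, 0, tzinfo=timezone.utc)
    end = datetime(year, month, day, 12, 0, tzinfo=timezone.utc)
    return start, end


first_start, first_end = window(2011, 2, 11)
second_start, second_end = window(2011, 2, 16)

# Convert each window boundary to San Francisco time and print it.
for label, moment in [
    ("first window start", first_start),
    ("first window end", first_end),
    ("second window start", second_start),
    ("second window end", second_end),
]:
    local = moment.astimezone(PST)
    print(label, local.strftime("%Y-%m-%d %H:%M %Z"))
```

The first line printed is the 10pm Thursday start quoted in the announcement; the other three follow from the same offset.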
First window
This first deployment window will be to a limited set of wikis:
http://simple.wikipedia.org/ (simplewiki)
http://simple.wiktionary.org/ (simplewiktionary)
http://usability.wikimedia.org/ (usabilitywiki)
http://strategy.wikimedia.org/ (strategywiki)
http://meta.wikimedia.org/ (metawiki)
http://eo.wikipedia.org/ (eowiki)
http://en.wikiquote.org/ (enwikiquote)
http://en.wikinews.org/ (enwikinews)
http://en.wikibooks.org/ (enwikibooks)
http://beta.wikiversity.org (betawikiversity)
http://nl.wikipedia.org (nlwiki)
Note that the point of this first round of wikis being switched over is to be able to observe the problem or problems without overloading the site and bringing it down. This deployment should be small enough in scope that even if there are moderate performance problems, no one should notice without watching our monitoring tools. We may not roll out to every wiki listed above during the first wave, but we plan to roll out to enough of them that we can gather enough debugging information to make the second wave (full deployment) go smoothly.
Second window
We will continue to roll this out to the rest of the wikis during this window. Depending on our confidence level, we may deploy to the remaining wikis, or we may decide to deploy to a portion of the remaining wikis. If necessary, we will schedule another window to finish the deployment.
Technical details
Here’s some more technical detail: one problem with the original Tuesday deploy was that the cache miss rate went up quite substantially. We believe the cause was the configuration of the $wgCacheEpoch variable, which led to more aggressive culling of our cache than the servers could handle. We have made adjustments, so this shouldn’t be a problem during our next deployment attempt.
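To illustrate why a cache epoch setting can drive up the miss rate, here is a simplified model in Python. It is not MediaWiki's code: the `CachedPage` type, the `is_fresh` and `miss_rate` helpers, and the timestamps are all invented for the example. It assumes only the general idea that entries stored before the epoch are treated as stale.

```python
from dataclasses import dataclass


@dataclass
class CachedPage:
    title: str
    cached_at: int  # Unix-style timestamp of when the entry was stored


def is_fresh(entry, cache_epoch):
    """Assumed rule: an entry is usable only if stored at or after the epoch."""
    return entry.cached_at >= cache_epoch


def miss_rate(entries, cache_epoch):
    """Fraction of entries that the given epoch would invalidate."""
    misses = sum(1 for e in entries if not is_fresh(e, cache_epoch))
    return misses / len(entries)


# Ten hypothetical entries stored at timestamps 1000, 1100, ..., 1900.
entries = [CachedPage(f"Page_{i}", 1000 + i * 100) for i in range(10)]

print(miss_rate(entries, cache_epoch=1000))  # no entry predates the epoch
print(miss_rate(entries, cache_epoch=1800))  # most entries predate the epoch
```

In this toy model, moving the epoch forward invalidates every older entry at once, so the backend has to regenerate those pages all at the same time.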
The $wgCacheEpoch misconfiguration explains some of the problems we had, but not all of them. Since we don’t have a clear explanation for all of the problems, we plan to modify the way we deploy this software so that we aren’t rolling this out to every wiki simultaneously. As our software is currently built, this isn’t easy to do in a general way, but it turns out this release is suited to an incremental deployment. (Note: we also plan to develop a more general capacity to roll out incrementally for future releases.)
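One way to picture an incremental rollout is a per-wiki lookup that decides which code version serves each wiki. The sketch below is purely illustrative and is not how the Wikimedia cluster is configured: the `version_for` and `roll_out` helpers and the "previous" label are hypothetical, and only the wiki database names come from the first-window list.

```python
# Wiki database names from the first-window list in the announcement.
FIRST_WAVE = {
    "simplewiki", "simplewiktionary", "usabilitywiki", "strategywiki",
    "metawiki", "eowiki", "enwikiquote", "enwikinews", "enwikibooks",
    "betawikiversity", "nlwiki",
}

NEW_VERSION = "1.17"
OLD_VERSION = "previous"  # placeholder label; the source does not name it


def version_for(wiki, upgraded):
    """Pick the code version for a wiki based on the upgraded set."""
    return NEW_VERSION if wiki in upgraded else OLD_VERSION


def roll_out(upgraded, batch):
    """Return a new upgraded set that also includes the given batch."""
    return set(upgraded) | set(batch)


# Start with a few wikis, then widen the set once they look healthy.
upgraded = roll_out(set(), ["simplewiki", "metawiki"])
print(version_for("simplewiki", upgraded))
print(version_for("nlwiki", upgraded))

upgraded = roll_out(upgraded, FIRST_WAVE)
print(version_for("nlwiki", upgraded))
```

Keeping the upgraded set as data means a wave can be widened, or a single wiki pulled back, without touching the other wikis.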
Thank you for your patience! We hope that this time around we can deploy this in a way that you won’t notice anything other than the improvements.
Thank you for this.
Will ops staff be monitoring wikitech-l for email reports of observed problems from those of us who are IRC-impaired? Is there another preferred non-IRC channel for reports?
Thanks.
On Thu, Feb 10, 2011 at 2:59 PM, Rob Lanphier robla@wikimedia.org wrote:
<snip>
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
George Herbert writes:
Is there another preferred non-IRC channel for reports?
I've been monitoring Bugzilla and some IRC reports (which I've then been asking people to put into Bugzilla). Apart from a few problems, such as layout issues and intermittent reports of problems with CentralAuth, I think this rollout has been successful.
We plan on rapidly addressing these problems and finishing up the rollout.
For those who are curious, we have the following tracking bugs:
Bugs that should be fixed for 1.17 WMF deployment: https://bugzilla.wikimedia.org/show_bug.cgi?id=26611
Bugs that should be fixed post 1.17 WMF deployment: https://bugzilla.wikimedia.org/show_bug.cgi?id=27339
Bugs that should be fixed for 1.17 release tarball: https://bugzilla.wikimedia.org/show_bug.cgi?id=26676
2011/2/11 Rob Lanphier robla@wikimedia.org:
First window
This first deployment window will be to a limited set of wikis:
http://simple.wikipedia.org/ (simplewiki)
http://simple.wiktionary.org/ (simplewiktionary)
http://usability.wikimedia.org/ (usabilitywiki)
<snip>
How were these selected?
None of them is RTL (right-to-left). I would very much like to test it in an actual RTL wiki, not just a prototype. During the first 1.17 upgrade attempt there were some minor RTL issues in he.wikipedia. Not terrible, but it would be nice to prevent them.
Can one of the lower traffic RTL wikis be included in the first window, for example he.wikinews or he.wikisource?
[ /me Running to ask the he.wikisource community whether they will actually agree to this. :) ]
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com "We're living in pieces, I want to live in peace." - T. Moore
On Thu, Feb 10, 2011 at 3:29 PM, Amir E. Aharoni amir.aharoni@mail.huji.ac.il wrote:
<snip>
Hi Amir,
Sorry about not responding quickly to this. I think this is something we can consider if the deployment is going well, and the community is ready for a change. Also, we'd need you (or someone) to be an ambassador for this and willing to be on IRC with us (#wikimedia-tech). Is he.wikisource.org ready?
Our choice of wikis was based on our confidence in the ability to communicate with the various communities if we were having problems, hence our heavy bias toward English-language sites (plus Dutch since we've got at least a couple people on the team that speak Dutch, and that gives us a moderately large wiki to deploy to).
Rob
2011/2/11 Rob Lanphier robla@wikimedia.org:
Sorry about not responding quickly to this. I think this is something we can consider if the deployment is going well, and the community is ready for a change. Also, we'd need you (or someone) to be an ambassador for this and willing to be on IRC with us (#wikimedia-tech). Is he.wikisource.org ready?
We can always just change the language on test2.wikipedia.org to Hebrew or something.
Roan Kattouw (Catrope)
wikitech-l@lists.wikimedia.org