2011/2/1 Rob Lanphier <robla@robla.net>:
Can you explain why you're rolling out when it's the middle of the night where Wikimedia is headquartered? I have a few different theories (site traffic, time zones of the operations team, etc.), but a clarification here would be good.
We lost the game of rock/paper/scissors. :) We decided to do this very late U.S. west coast time so that our European and Australian contingents would be well rested in case there are problems. Given that we have key personnel pretty much all over the globe, there wasn't going to be a great time for this, and this has the added advantage of being a relatively low-traffic time for us.
Look at http://torrus.wikimedia.org/torrus/CDN?path=%2FTotals%2F and you'll see that, for the past two days, the time of lowest traffic was between 06:00 and 07:00 UTC. This has been a reliable pattern for quite some time now (except that it shifts by an hour in Northern Hemisphere summer, due to DST), and we also used this time for the first few Vector deployments.
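For anyone who wants to check this kind of pattern against their own numbers, here is a minimal sketch of picking the lowest-traffic UTC hour from hourly samples. The function name `quietest_hour` and the sample values are made up for illustration; this is not how Torrus computes or exposes its graphs.

```python
from datetime import datetime, timezone


def quietest_hour(samples):
    """Return the UTC hour (0-23) with the lowest average traffic.

    `samples` is an iterable of (timezone-aware datetime, numeric value)
    pairs. Raises ValueError if `samples` is empty.
    """
    totals = {}
    counts = {}
    for timestamp, value in samples:
        hour = timestamp.astimezone(timezone.utc).hour
        totals[hour] = totals.get(hour, 0.0) + value
        counts[hour] = counts.get(hour, 0) + 1
    # Compare hours by their average value across all days seen.
    return min(totals, key=lambda h: totals[h] / counts[h])


# Synthetic data: two days of hourly samples with a dip at 06:00 UTC.
samples = [
    (datetime(2011, 1, 30 + day, hour, tzinfo=timezone.utc),
     100 + abs(hour - 6) * 10)
    for day in (0, 1)
    for hour in range(24)
]

print(quietest_hour(samples))  # 6
```

Averaging per hour across several days, rather than taking a single day's minimum, keeps one unusual night from skewing the result.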
It'll be an annoying time for all of us. Europe-based people will have to get up relatively early ("engineer early", in RobLa's words). For our US-based operations people it'll be 1am and 2am respectively; despite WMF being headquartered in SF, we currently have no ops engineers there, although other SF people will of course be involved, and they'll also be working in the middle of the night. Tim will most likely be eating dinner at his desk while possibly keeping the site up.
Why is prototype.wikimedia.org being used instead of test.wikipedia.org? I was under the impression that the purpose of test.wikipedia.org was a pre-deployment launch pad while prototype.wikimedia.org is used for testing new extensions/features. Has this changed?
Yeah, it has. I don't recall the exact history of how we got to this point. I imagine that the two should become one in the future.
The fundamental difference between test and prototype is that test is part of the cluster, and prototype is separated from it. It's undesirable and impractical to run experimental code on test for this reason, so it's only used as a quick last check for code that's going to be deployed soon (say, within the next hour). Due to the way our deployment infrastructure works right now, it's impractical to keep undeployed code around on test, because it can be hard or even impossible for the next person who needs to deploy a small change to the rest of the cluster to avoid deploying the test code cluster-wide.
In the future, we'll have a virtualization cluster for testing experimental code, which will be integrated with the cluster more closely than prototype is (mostly in terms of configuration synchronization, but also because it'll run on our machines rather than on a Linode VM) while maintaining a safe level of separation. Ryan Lane can probably talk about this in more detail. Also, there are plans to adjust our infrastructure to support heterogeneous deployments (different versions of the code on different wikis).
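As a rough illustration of what "different versions of the code on different wikis" could look like, the sketch below maps each wiki to a code version and falls back to a default. Everything in it (the wiki names, the version labels, and `version_for`) is hypothetical and does not reflect the actual planned design.

```python
# Hypothetical version labels and wiki names, for illustration only.
DEFAULT_VERSION = "current-branch"

# Wikis listed here run something other than the default.
WIKI_VERSIONS = {
    "wiki-a": "new-branch",
}


def version_for(wiki):
    """Return the code version a given wiki should run."""
    return WIKI_VERSIONS.get(wiki, DEFAULT_VERSION)


for wiki in ("wiki-a", "wiki-b"):
    print(wiki, "->", version_for(wiki))
```

With a lookup like this, new code could be tried on one wiki without being pushed to every wiki at once, which is the problem described above for test.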
Roan Kattouw (Catrope)