2011/2/1 Rob Lanphier <robla(a)robla.net>:
Can you explain why you're rolling out when it's the middle of the night
where Wikimedia is headquartered? I have a few different theories (site
traffic, time zones of the operations team, etc.), but a clarification here
would be good.
We lost the game of rock/paper/scissors. :) We decided to do this very
late U.S. west coast time so that our European and Australian contingents
would be well rested in case there are problems. Given that we have key
personnel pretty much all over the globe, there wasn't going to be a great
time for this, and this has the added advantage of being a relatively low
traffic time for us.
Look at http://torrus.wikimedia.org/torrus/CDN?path=%2FTotals%2F and
you'll see that, for the past two days, the time of lowest traffic was
between 06:00 and 07:00 UTC. This has been a quite reliable pattern
for quite some time now (except that it shifts by an hour in Northern
Hemisphere summer, due to DST), and we've also used this time for the
first few Vector deployments.
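
For anyone who wants to check this kind of pattern themselves, here is a minimal sketch of picking the quietest hour from hourly traffic samples. The function name and the sample numbers are hypothetical and invented purely for illustration; they are not taken from Torrus or from our actual traffic data.

```python
from collections import defaultdict


def quietest_hour(samples):
    """Return the UTC hour (0-23) with the lowest mean traffic.

    samples: iterable of (utc_hour, requests_per_second) pairs,
    possibly covering several days.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hour, rate in samples:
        totals[hour] += rate
        counts[hour] += 1
    # Compare hours by their average rate across all days seen.
    return min(totals, key=lambda h: totals[h] / counts[h])


# Hypothetical readings for two days (made-up values, not real data).
samples = [
    (0, 52000), (6, 31000), (12, 60000), (18, 75000),
    (0, 50000), (6, 30000), (12, 62000), (18, 77000),
]

hour = quietest_hour(samples)
print(f"Quietest hour starts at {hour:02d}:00 UTC")
```

Averaging per hour across several days, rather than taking a single minimum reading, keeps one odd sample from deciding the deployment window.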
It'll be an annoying time for all of us. Europe-based people will have
to get up relatively early ("engineer early", in RobLa's words), it'll
be 1am and 2am respectively for our US-based operations people
(despite WMF being headquartered in SF, we currently have no ops
engineers there, although of course other SF people will be involved
and they'll also be working in the middle of the night), and Tim will
most likely be eating dinner at his desk while possibly keeping the
site up.
Why is prototype.wikimedia.org being used instead of test.wikipedia.org? I
was under the impression that the purpose of test.wikipedia.org was a
pre-deployment launch pad while prototype.wikimedia.org is used for testing
new extensions/features. Has this changed?
Yeah it has. I don't recall the exact history of how we got to this point.
I imagine that the two should become one in the future.
The fundamental difference between test and prototype is that test is
part of the cluster, and prototype is separated from it. It's
undesirable and impractical to run experimental code on test for this
reason, so it's only used as a quick last check for code that's going
to be deployed soon (say, within the next hour). Due to the way our
deployment infrastructure works right now, it's impractical to keep
undeployed code around on test, because it can be hard or even
impossible for the next person needing to deploy a small change to the
rest of the cluster to avoid deploying the test code cluster-wide.
In the future, we'll have a virtualization cluster for testing
experimental code, which will be integrated with the cluster more
closely than prototype is (mostly in terms of configuration
synchronization, but also because it'll run on our machines rather
than on a Linode VM) while maintaining a safe level of separation.
Ryan Lane can probably talk about this in more detail. Also, there are
plans to adjust our infrastructure to support heterogeneous
deployments (different versions of the code on different wikis).
Roan Kattouw (Catrope)