FYI, the test has started. We are in the process of switching the traffic
to the Dallas DC.
Cheers,
Marko
On 14 March 2016 at 22:54, Marko Obrovac <mobrovac(a)wikimedia.org> wrote:
Hello,
This quarter, the WMF’s technology department has the goal of testing
and temporarily switching the main operational data centre from Eqiad
(located in Ashburn) to Codfw (located in Dallas) [1,2]. This includes both
back-end processing and serving live traffic from it.
As a part of this effort, we are scheduling a switch-over for RESTBase and
its back-end services, including: Parsoid, the Mobile Content Service,
CXServer, Mathoid, Citoid, Apertium and Zotero [3]. Technically, it will not
be a real switch-over per se, because we will keep all of those services
active in both DCs. However, external traffic will be directed to the Dallas
DC only.
=== When is it and what does it mean for me? ===
The switch-over test is planned for this Thursday, 2016-03-17. We have
allotted a three-hour window for this [4]. There is nothing users should do
before or after the switch; it will be transparent for them. There are two
things users should note, though:
1) At the time of the switch-over, users might receive error responses for
a while (both 4xx and 5xx status codes). While we will test most of the
things ahead of time, we cannot test the actual traffic shifting, so small
bumps might be noticed.
2) After the switch to the Dallas DC, users will likely see slightly elevated
response latencies. This will occur because all of the services that
will be responding to live requests still need to contact the main MediaWiki
cluster, which will remain in Eqiad (the other DC) until a complete
switch-over of the infrastructure is performed. However, given the multiple
levels of caching, the 40 ms penalty for going cross-DC on an uncached API
request does not seem too taxing.
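To put the caching argument in numbers, here is a minimal sketch of the average extra latency per request. It assumes that only cache misses pay the 40 ms cross-DC penalty and that each miss makes exactly one cross-DC round trip. The function name and the hit ratios below are hypothetical, chosen for illustration only; they are not measured values.

```python
def expected_penalty_ms(cache_hit_ratio: float,
                        cross_dc_penalty_ms: float = 40.0) -> float:
    """Average extra latency per request, in milliseconds.

    Assumption: cached responses pay no penalty, and every uncached
    request pays the full cross-DC penalty exactly once.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return (1.0 - cache_hit_ratio) * cross_dc_penalty_ms


if __name__ == "__main__":
    # Hypothetical hit ratios, for illustration only.
    for ratio in (0.0, 0.5, 0.9):
        extra = expected_penalty_ms(ratio)
        print(f"hit ratio {ratio:.0%}: +{extra:.1f} ms on average")
```

Under these assumptions, the higher the cache hit ratio, the smaller the share of requests that notice the cross-DC hop at all.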
=== Wait, what about my service X running in WMF production? ===
If you are a service owner of one of the aforementioned services, there are
no explicit actions you should take prior to, during or after the
switch-over test. This test could, however, affect your service depending on
whether it usually serves live traffic or is mostly operational during
various internal updates. MediaWiki and JobQueue processing will still be
performed in Eqiad, so in the latter case your service should not see a
change in the usage pattern. If, however, your service is mostly in charge
of responding to live requests coming through RESTBase, those will be
handled by instances in Codfw. As these services are full replicas
of their Eqiad counterparts and are stateless, we do not expect any major
breakage.
Should you have any questions or concerns, don’t hesitate to contact us
here or on IRC (#wikimedia-services @ freenode).
Best,
Marko Obrovac, PhD
Senior Services Engineer
Wikimedia Foundation
[1] https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q3_Goals#Techn…
[2] https://phabricator.wikimedia.org/project/profile/1723/
[3] https://phabricator.wikimedia.org/T127974
[4] https://wikitech.wikimedia.org/wiki/Deployments#Thursday.2C.C2.A0March.C2.A…
_______________________________________________
Services mailing list
Services(a)lists.wikimedia.org