Hello,
We will perform maintenance on Gerrit on *Monday, October 6*, from *12:00 to 13:00 UTC*. During this window, we expect a *~20-minute* write outage while we switch Gerrit's service over from the current primary (gerrit1003) to the spare host (gerrit2003).
Since last time, we've added a few guardrails:
- Local emergency backup on each instance.
- All instances will be *read-only during the switchover* to protect data integrity.
- Pre- and post-switch checks to verify everything is in the expected state.
As before, we’ll perform the switchover *live with Release Engineering*.
This operation will allow us to unblock a few things:
- System daemon user: *gerrit2 → gerrit* (T338470 https://phabricator.wikimedia.org/T338470)
- *OS upgrade* (T392464 https://phabricator.wikimedia.org/T392464, T384595 https://phabricator.wikimedia.org/T384595), *Java* (T392465 https://phabricator.wikimedia.org/T392465) version update, and *Gerrit* (T379714 https://phabricator.wikimedia.org/T379714, T392448 https://phabricator.wikimedia.org/T392448) version update
*Expected impact*
- A ~20-minute *read-only* window: reads (browsing, cloning, reviewing) should remain fine. Pushes, votes, comments, etc. will be blocked during that period.
- We’ll post start/end updates during the window.
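For anyone who wants to check the window in their own time zone, here is a minimal sketch. The window and the ~20-minute estimate come from this announcement; the function names and the fixed UTC+2 offset used in the example are illustrative assumptions, not part of any Wikimedia tooling.

```python
from datetime import datetime, timedelta, timezone

# Maintenance window as announced, in UTC.
WINDOW_START = datetime(2025, 10, 6, 12, 0, tzinfo=timezone.utc)
WINDOW_END = datetime(2025, 10, 6, 13, 0, tzinfo=timezone.utc)

# Expected read-only duration inside the window (an estimate, not a guarantee).
EXPECTED_READ_ONLY = timedelta(minutes=20)


def window_in(tz):
    """Return the (start, end) of the window converted to the given tzinfo."""
    return WINDOW_START.astimezone(tz), WINDOW_END.astimezone(tz)


def in_window(moment):
    """True if a timezone-aware datetime falls inside the announced window."""
    return WINDOW_START <= moment < WINDOW_END


# Hypothetical example: a reader whose local time is UTC+2.
local = timezone(timedelta(hours=2))
start, end = window_in(local)
print(f"Window: {start:%Y-%m-%d %H:%M} to {end:%H:%M} (UTC+2)")
print(f"Expected read-only time: about {EXPECTED_READ_ONLY.seconds // 60} minutes")
```

To use a named zone instead of a fixed offset, `zoneinfo.ZoneInfo("Europe/Paris")` from the standard library can be passed to `window_in` in the same way, provided time zone data is available on the system.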
*Links*
- Documentation: https://wikitech.wikimedia.org/wiki/Gerrit/Operations#Switch_over
- Phabricator: https://phabricator.wikimedia.org/T387833
- Deployment Calendar: https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20251006T1200
Thank you for your understanding.
--
*Arnaud Bran* (he/him)
Senior Site Reliability Engineer
Wikimedia Foundation https://wikimediafoundation.org/
Hello,
A quick reminder: we'll start in a few minutes.
Thank you for your understanding.
On Mon, Sep 29, 2025 at 3:48 PM Arnaud Bran abran@wikimedia.org wrote:
[...]
We ran into issues with the cookbook / cumin before Gerrit instances were switched over. All the changes have now been reverted and we'll try again after investigating.
On Mon, Oct 6, 2025 at 1:53 PM Arnaud Bran abran@wikimedia.org wrote:
Hello,
A quick reminder: we'll start in a few minutes.
Thank you for your understanding.
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org To unsubscribe send an email to wikitech-l-leave@lists.wikimedia.org https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
Hello,
As mentioned, we ran into issues with the cookbook due to a typo, which has now been fixed (https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1193860). We plan to try again on Thursday, October 9, at 12:00 UTC.
The item has been added to the deployment calendar: https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20251009T1200
Thank you for your patience.
On Mon, Oct 6, 2025 at 2:47 PM Lukasz Sobanski lsobanski@wikimedia.org wrote:
We ran into issues with the cookbook / cumin before Gerrit instances were switched over. All the changes have now been reverted and we'll try again after investigating.
-- Lukasz
We ran into another issue with the process, which has been identified (https://wikimedia.slack.com/archives/C01R06P8D1B/p1760015457631469?thread_ts=1759734023.979729&cid=C01R06P8D1B) and will be remediated soon. All the changes have now been reverted, and we'll try again next week.
On Tue, Oct 7, 2025 at 11:19 AM Arnaud Bran abran@wikimedia.org wrote:
Hello,
As mentioned, we ran into issues with the cookbook due to a typo, which has now been fixed (https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1193860). We plan to try again on Thursday, October 9, at 12:00 UTC.
The item has been added to the deployment calendar: https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20251009T1200
Thank you for your patience.
Hello,
The issue we faced has been resolved (https://phabricator.wikimedia.org/T387833#11267438): we'll be syncing a manageable number of files, and the local backup process has been simplified (https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1194949) to reduce downtime. We plan to run the next switchover tomorrow, Oct. 13th, at 9:00 UTC (https://wikitech.wikimedia.org/w/index.php?title=Deployments&wvprov=sticky-header#deploycal-item-20251014T0900).
Thank you all for your patience.
On Thu, Oct 9, 2025 at 4:20 PM Arnaud Bran abran@wikimedia.org wrote:
We ran into another issue with the process, which has been identified (https://wikimedia.slack.com/archives/C01R06P8D1B/p1760015457631469?thread_ts=1759734023.979729&cid=C01R06P8D1B) and will be remediated soon. All the changes have now been reverted, and we'll try again next week.