Dear users of stat100{4,6,7},
we are planning on upgrading stat1004 to Debian Buster this Thursday (2020-09-17) after 12:00 CEST (10:00 UTC). We will reinstall the machine, preserving user data (home directories, /srv), but to be on the safe side, we will back up that data first. After the reinstall and a few tests, we will send an all-clear to this list.
A few things of note:
- It would be greatly appreciated if you cleaned out unneeded data before the backup time mentioned above, thus speeding up backup (and restore if we need it).
- Any changes made to the file system contents after the time mentioned above may be lost.
- Around the time of the backup, both cron and systemd timers will be disabled, and still-running processes may be ungracefully terminated.
If this process works well, the remaining stat100x machines in need of update (6, 7) will be processed in a similar manner.
As always, if there are questions, do not hesitate to contact us.
Best, Tobias
Thanks Tobias for this work!
> It would be greatly appreciated if you cleaned out unneeded data before the backup time mentioned above, thus speeding up backup (and restore if we need it).
FYI to anyone doing this, Luca put together a nice reference for cleaning up large files: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Clients#Checking_the_s...
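For anyone who prefers a script to the linked reference, a minimal Python sketch for spotting the largest files under a directory might look like the following. It is not taken from the wiki page; the function name largest_files and the top_n parameter are made up for illustration, and the script only reports sizes, it deletes nothing.

```python
import os
from pathlib import Path
from typing import List, Tuple


def largest_files(root: Path, top_n: int = 10) -> List[Tuple[int, Path]]:
    """Return the top_n largest files below root as (size_in_bytes, path)."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # lstat does not follow symlinks, so linked data is not counted twice.
                sizes.append((path.lstat().st_size, path))
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return sorted(sizes, key=lambda item: item[0], reverse=True)[:top_n]
```

Calling largest_files(Path.home()), or pointing it at your own directory under /srv, would give a short list of cleanup candidates to review by hand.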
On Tue, Sep 15, 2020 at 8:20 AM Tobias Klausmann tklausmann@wikimedia.org wrote:
-- Tobias Klausmann, SRE, Wikimedia Foundation
Research-Internal mailing list Research-Internal@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/research-internal
Dear users of stat1004,
as it turns out, the backup of /srv will take about eight hours in total. Thus, we will only do the backup step today (UTC/CEST), and the reimaging first thing tomorrow.
Apologies for this taking longer than expected. I'll send an update here as things progress.
Best, Tobias
Hi everyone,
stat1004 has been reimaged (with /srv preserved), and is back in operation. It should work just as before, but with newer versions of kernel and system software.
Luca has prepared a document that explains how to reset virtual envs if needed:
https://wikitech.wikimedia.org/wiki/Analytics/Systems/Jupyter#Resetting_user...
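The linked page is the procedure to follow. Purely as a rough illustration, and not taken from that page, a virtual environment can be rebuilt against the new system Python with the standard-library venv module. The helper name recreate_venv is hypothetical, and the sketch assumes the environment lives in a directory you are free to wipe.

```python
import venv
from pathlib import Path


def recreate_venv(env_dir: Path, with_pip: bool = True) -> Path:
    """Empty env_dir if it exists and build a fresh venv with the current interpreter.

    Returns the path of the new environment's pyvenv.cfg file.
    """
    # clear=True deletes the existing contents of env_dir before creation,
    # so everything previously installed there is lost.
    venv.create(env_dir, clear=True, with_pip=with_pip)
    return env_dir / "pyvenv.cfg"
```

Packages have to be reinstalled into the fresh environment afterwards, for example from a requirements file saved beforehand.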
If there is anything out of place, just let us know.
The updates to stat1006 and stat1007 will happen next week at the earliest, preceded by separate announcements at least 24h before the individual reinstalls.
Best, Tobias
Hello everyone,
this is a 48h+ warning that stat1006 and stat1007 will be reinstalled this week (preserving /srv and user data).
The backup of stat1006 will happen on Wednesday around 0900 UTC. Backup will likely take in excess of eight hours, so the reinstall of the machine will probably only happen on Thursday morning (this is much quicker, less than an hour probably).
On Friday, again at around 0900 UTC, we will back up stat1007, and proceed in the same way. I will send another reminder 24h before this happens.
Both machines should be 100% back in service by Monday 1200 UTC.
If you want to help with making the backups (and potential restores) quicker, Luca and I have provided some tips here:
https://wikitech.wikimedia.org/wiki/Analytics/Systems/Clients#Checking_the_s...
As always, don't hesitate to contact us with questions.
Best, Tobias
Hi!
I have just reimaged stat1006. The machine is back in a productive state, and we didn't need to do a restore of /srv.
The SSH host keys have changed. I have updated the corresponding wiki page[0] with the new fingerprints. You can also use the utility script[1] to update the keys automagically.
As always, if there is anything out of order, don't hesitate to contact us.
Note that tomorrow morning EU time, I will start the backup of stat1007, and will reimage it on Monday morning.
Best, Tobias
[0] https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/stat1006.eqiad.wmn...
[1] https://wikitech.wikimedia.org/wiki/Production_access#Known_host_files
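The utility script[1] mentioned above is the simpler route. Purely to illustrate what updating a known_hosts file involves, the Python sketch below drops the plain-text entries for one host, so that the new key can be accepted on the next connection once it has been checked against the published fingerprints. The helper drop_host_entries is hypothetical, the host name stat1006.example.org is a placeholder rather than the real one, and hashed entries (lines starting with |1|) are not handled.

```python
def drop_host_entries(known_hosts_text: str, host: str) -> str:
    """Return known_hosts content without plain-text entries that list `host`."""
    kept = []
    for line in known_hosts_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            kept.append(line)  # keep blank lines and comments untouched
            continue
        fields = stripped.split()
        # An optional marker (@cert-authority or @revoked) precedes the host field.
        if fields[0].startswith("@") and len(fields) > 1:
            host_field = fields[1]
        else:
            host_field = fields[0]
        if host in host_field.split(","):
            continue  # stale entry for the reimaged host: drop it
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


# Hypothetical example data; real entries carry full base64 keys.
sample = (
    "stat1006.example.org ssh-ed25519 AAAAold\n"
    "other.example.org,10.0.0.1 ssh-rsa BBBBkeep\n"
)
cleaned = drop_host_entries(sample, "stat1006.example.org")
```

Here cleaned keeps only the line for other.example.org. Writing the result back to ~/.ssh/known_hosts is left out on purpose, so the sketch cannot damage a real file.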
Hi!
I have just reimaged stat1007. The machine is back in a productive state, and we didn't need to do a restore of /srv.
The SSH host keys have changed. I have updated the corresponding wiki page[0] with the new fingerprints. You can also use the utility script[1] to update the keys automagically.
As always, if there is anything out of order, don't hesitate to contact us.
This concludes the updates of stat100{4,6,7} to Debian Buster (T255028).
Best, Tobias
[0] https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/stat1007.eqiad.wmn...
[1] https://wikitech.wikimedia.org/wiki/Production_access#Known_host_files
Hi!
As it turns out, there was an unfortunate mistake when reinstalling the three machines mentioned in the subject. As a result, we have to do the imaging again. Fortunately, we can skip the backup steps this time around, so the disruption will be much smaller.
This time, we will start with stat1007, next week Tuesday (2020-10-06), in the UTC/EU morning. Downtime should be less than two hours.
If you have any questions, let us know.
Best, Tobias
Hi!
I have just reimaged stat1007 (actually to Debian Buster this time). The machine is back in a productive state, and we didn't need to do a restore of /srv.
The SSH host keys have changed. I have updated the corresponding wiki page[0] with the new fingerprints. You can also use the utility script[1] to update the keys automagically.
As always, if there is anything out of order, don't hesitate to contact us.
Best, Tobias
[0] https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/stat1007.eqiad.wmn...
[1] https://wikitech.wikimedia.org/wiki/Production_access#Known_host_files
Hi!
Apologies for the announcement spam!
With stat1007 now imaged to Buster, we will proceed with stat1006 this Thursday (2020-10-08), and stat1004 next week (Tuesday, 2020-10-13) in the same manner.
Best, Tobias
Hi!
As of a few moments ago, stat1006 is now on Debian Buster.
The SSH host keys have changed. I have updated the corresponding wiki page[0] with the new fingerprints. You can also use the utility script[1] to update the keys automagically.
The last reimage, of stat1004, will happen Tuesday next week (2020-10-13), during the EU/CEST morning.
As always, if there is anything out of order, don't hesitate to contact us.
Best, Tobias
[0] https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/stat1006.eqiad.wmn...
[1] https://wikitech.wikimedia.org/wiki/Production_access#Known_host_files
Hi!
As of a few moments ago, stat1004 is now on Debian Buster.
The SSH host keys have changed. I have updated the corresponding wiki page[0] with the new fingerprints. You can also use the utility script[1] to update the keys automagically.
This concludes the move of these three hosts to Debian Buster.
As always, if there is anything out of order, don't hesitate to contact us.
Best, Tobias
[0] https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/stat1004.eqiad.wmn...
[1] https://wikitech.wikimedia.org/wiki/Production_access#Known_host_files
analytics-announce@lists.wikimedia.org