The NFS servers used for the scratch and maps mounts (/data/project and /home in the maps project, and /data/scratch in other projects) will go offline for a short time tomorrow, 2021-07-01, at around 1600 UTC so that the mounts can be moved to DRBD-synced volumes. The current setup causes odd issues during failover, including data loss and stale files left behind. This migration is itself one of those failovers, so you may see similar anomalies afterwards, such as previously deleted files reappearing and needing to be deleted again.
I plan to reboot the maps project servers to make sure their mounts and processes are restored as well as possible. The scratch mounts should be less affected. If you use scratch, just be aware that it will go offline for a bit and may come back with some quirks. After that, the data should be far more stable and properly synced between the two systems. The process could start later than 1600 UTC if there are initial sync issues, as I try to get as much of the data as possible transferred beforehand.
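For anyone who wants to check their mounts once the maintenance is over, the following is a minimal sketch and not part of the migration procedure itself. It assumes Python 3 is available on the instance; the helper name `mount_status` is hypothetical. It only classifies a path by the error the kernel returns when the directory is read. Note that on a hard-mounted NFS share whose server is still offline, the calls may simply block rather than return an error.

```python
import errno
import os


def mount_status(path):
    """Classify a directory as 'ok', 'stale', 'missing', or 'error'.

    'stale' means the kernel reported ESTALE (stale file handle),
    which can happen on NFS clients after a server-side failover.
    """
    try:
        os.stat(path)      # fails with ESTALE on a stale handle
        os.listdir(path)   # forces an actual directory read
    except OSError as exc:
        if exc.errno == errno.ESTALE:
            return "stale"
        if exc.errno == errno.ENOENT:
            return "missing"
        return "error"
    return "ok"


if __name__ == "__main__":
    # Paths taken from the announcement; adjust for your project.
    for path in ("/data/scratch", "/data/project", "/home"):
        print(path, mount_status(path))
```

A "stale" result usually means the mount needs to be remounted or the instance rebooted; "missing" just means the path is not present on that instance.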
More details here: https://phabricator.wikimedia.org/T224747
Brooke Storm Staff SRE Wikimedia Cloud Services bstorm@wikimedia.org
This is starting.
Brooke Storm Staff SRE Wikimedia Cloud Services bstorm@wikimedia.org
On Jun 30, 2021, at 9:34 AM, Brooke Storm bstorm@wikimedia.org wrote: [original announcement trimmed]
The user-facing portion of this should be complete now.
On Jul 1, 2021, at 9:07 AM, Brooke Storm bstorm@wikimedia.org wrote:
This is starting.
cloud-announce@lists.wikimedia.org