[Labs-announce] [Labs-l] Follow up + Incident report - Tools NFS space conservation (round 4)

Madhumitha Viswanathan mviswanathan at wikimedia.org
Fri Mar 10 23:34:06 UTC 2017

Hi all,

Some unexpected data deletion happened during this space conservation:
files over 100M were emptied even though they weren't log/err/out files.
This was caused by me mistakenly truncating files from a list that was
broader than intended.

Almost all of this Tools data (except for the log/err/out files that were
originally intended to be truncated) has been restored in place at this
point from our backups in the secondary datacenter. The backup was about
1 day (~23 hours) old, so new data written in that window is lost.

There is an incident report here that has all details -
The few files that were generated in the 23-hour period between the last
backup and the deletion, and so couldn't be restored, are listed here -

Apologies for the inconvenience. We are working on making this process less
error-prone so we can avoid incidents such as these due to manual error.
Feel free to reach out to us at #wikimedia-labs with questions or concerns.

---------- Forwarded message ----------
From: Madhumitha Viswanathan <mviswanathan at wikimedia.org>
Date: Tue, Mar 7, 2017 at 1:11 PM
Subject: [Labs-l] [Labs-announce] Tools NFS space conservation (round 4)
To: labs-announce at lists.wikimedia.org

The last Tools NFS storage cleanup was in October 2016 and we are there
again. I am going to empty log, err, and out files greater than 100M, and
will start the cleanup tomorrow - Wednesday, March 8, 2017.
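
For anyone who wants to check ahead of time which of their files would
qualify, here is a minimal sketch of the selection rule described above. It
is not the script used for the cleanup. It assumes that "log, err, and out
files" means file names ending in .log, .err, or .out, and that 100M means
100 MiB; the function names find_candidates and truncate_files are
hypothetical. Filtering on both suffix and size before touching anything, and
defaulting to a dry run, keeps the list from growing broader than intended.

```python
import os

# Assumption: the cleanup targets names ending in these suffixes.
SUFFIXES = (".log", ".err", ".out")
# Assumption: "100M" is interpreted as 100 MiB.
SIZE_LIMIT = 100 * 1024 * 1024


def find_candidates(root, size_limit=SIZE_LIMIT, suffixes=SUFFIXES):
    """Yield regular files under root that match a suffix and exceed size_limit."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Skip anything that is not a log/err/out file, whatever its size.
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getsize(path) > size_limit:
                yield path


def truncate_files(paths, dry_run=True):
    """Empty each file in place; with dry_run=True nothing is modified."""
    selected = []
    for path in paths:
        if not dry_run:
            # Opening in "w" mode truncates the file to zero bytes.
            with open(path, "w"):
                pass
        selected.append(path)
    return selected
```

Calling truncate_files(find_candidates("/path/to/tool")) with the default
dry_run=True only returns the list of matching paths, so it can be reviewed
before anything is emptied. The path here is a placeholder.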

Feel free to reach out to us on #wikimedia-labs for questions or concerns.


Madhu Viswanathan,
Operations Engineer, Wikimedia Labs