The consumer that writes Event Logging events into the SQL database was down yesterday for about 9 hours. We restarted it and it consumed the data it missed, and inserted it into the database. The incident report is here:
https://wikitech.wikimedia.org/wiki/Incident_documentation/20151015-EventLog...
We don't yet know whether any data was lost. I'm going to run some queries on a few schemas now, and I'll update the incident report.
If you had reports run on October 14th, between 06:00 UTC and 21:00 UTC, you should re-run them.
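For anyone who wants to check their own schema for a gap like the one mentioned above, here is a minimal sketch. It assumes you have already pulled the event timestamps out of the database as timezone-aware UTC datetimes. The function names `hourly_counts` and `suspicious_hours` and the 50% threshold are illustrative assumptions, not the queries used for the incident report, and the demo data is synthetic.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from statistics import median


def hourly_counts(timestamps, start, end):
    """Count events per UTC hour in [start, end), keeping hours with zero events.

    Assumes `start` is aligned to the top of an hour.
    """
    counts = Counter(
        ts.replace(minute=0, second=0, microsecond=0)
        for ts in timestamps
        if start <= ts < end
    )
    result = []
    hour = start
    while hour < end:
        result.append((hour, counts.get(hour, 0)))
        hour += timedelta(hours=1)
    return result


def suspicious_hours(hourly, threshold=0.5):
    """Return the hours whose count is below `threshold` times the median hourly count."""
    typical = median(count for _, count in hourly)
    return [hour for hour, count in hourly if count < threshold * typical]


if __name__ == "__main__":
    utc = timezone.utc
    start = datetime(2015, 10, 13, tzinfo=utc)
    end = datetime(2015, 10, 16, tzinfo=utc)
    gap_start = datetime(2015, 10, 14, 6, tzinfo=utc)
    gap_end = datetime(2015, 10, 14, 21, tzinfo=utc)

    # Synthetic data: one event every 10 minutes, with nothing inside the gap.
    events = []
    ts = start
    while ts < end:
        if not (gap_start <= ts < gap_end):
            events.append(ts)
        ts += timedelta(minutes=10)

    for hour in suspicious_hours(hourly_counts(events, start, end)):
        print(hour.isoformat())
```

The window passed in should be noticeably longer than the suspected outage. If more than half of the hours are empty, the median itself drops to zero and nothing gets flagged.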
Quick follow-up: it looks like the majority of the data was not recovered, so re-running your reports won't help. We'll try to recover what's missing if we still have it in Kafka. Let us know if you'd like to be kept up to date about this.
On Thu, Oct 15, 2015 at 4:31 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
The consumer that writes Event Logging events into the SQL database was down yesterday for about 9 hours. We restarted it and it consumed the data it missed, and inserted it into the database. The incident report is here:
https://wikitech.wikimedia.org/wiki/Incident_documentation/20151015-EventLog...
We don't yet know whether any data was lost. I'm going to run some queries on a few schemas now, and I'll update the incident report.
If you had reports run on October 14th, between 06:00 UTC and 21:00 UTC, you should re-run them.
Yesterday we finally completed the backfill of the affected time range. In the end, we managed to recover the missing data from the archived logs, so please re-run any reports for October 14th, between 06:00 UTC and 21:00 UTC. Thank you, and apologies for any inconvenience.
On Thu, Oct 15, 2015 at 11:07 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Quick follow-up: it looks like the majority of the data was not recovered, so re-running your reports won't help. We'll try to recover what's missing if we still have it in Kafka. Let us know if you'd like to be kept up to date about this.
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics