Hi,
I am sorry to announce yet another EventLogging outage :-(
EventLogging's database writer service failed to write events to the database between ~2014-11-25T03:09 and 2014-11-26T00:03.
The recent ad hoc increase of EventLogging's database throughput capacity, made to address the EventLogging database writing bottleneck, came with a known issue: the exit synchronization of threads within the EventLogging database writer is not very robust. This exit synchronization issue got the database writer stuck and caused the outage.
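For readers unfamiliar with this class of problem, here is a minimal Python sketch of more defensive exit synchronization between a producer and a single writer thread. It is not the EventLogging code; `BufferedWriter`, `write_batch`, `submit`, and `close` are hypothetical names chosen for illustration. The idea is that the producer never blocks indefinitely on a queue whose consumer has died, and that shutdown waits for the worker only for a bounded time.

```python
import queue
import threading

_STOP = object()  # sentinel telling the worker thread to exit


class BufferedWriter:
    """Hypothetical writer: a producer hands batches to one worker thread."""

    def __init__(self, write_batch, maxsize=100):
        self._write_batch = write_batch  # stand-in for the database insert
        self._queue = queue.Queue(maxsize=maxsize)
        self._failed = threading.Event()
        # daemon=True so a stuck worker cannot keep the process alive
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        try:
            while True:
                item = self._queue.get()
                if item is _STOP:
                    return
                self._write_batch(item)
        except Exception:
            # Record the failure so the producer can notice it instead of
            # blocking forever on a queue that nobody drains any more.
            self._failed.set()

    def submit(self, item, poll=0.1):
        """Queue an item, re-checking worker health every `poll` seconds."""
        while True:
            if self._failed.is_set():
                raise RuntimeError("worker thread died; refusing to block")
            try:
                self._queue.put(item, timeout=poll)
                return
            except queue.Full:
                continue

    def close(self, timeout=5.0):
        """Ask the worker to stop; return True if it exited within `timeout`."""
        try:
            self.submit(_STOP)
        except RuntimeError:
            pass  # worker already gone; nothing left to signal
        self._worker.join(timeout)
        return not self._worker.is_alive()
```

A caller would create `BufferedWriter(some_insert_function)`, call `submit(batch)` per batch, and check the return value of `close()` to detect a worker that did not exit in time.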
A fix for that known issue has been sitting in Gerrit since Sunday, but due to the many meetings and discussions around recent EventLogging issues, it has not yet been reviewed and deployed.
The still rather empty Incident Report is at
https://wikitech.wikimedia.org/wiki/Incident_documentation/20141125-EventLog...
I'll fill it with more information tomorrow.
Backfilling from the logs is already running, and should also finish tomorrow.
Best regards, Christian
Hi,
On Wed, Nov 26, 2014 at 02:35:47AM +0100, Christian Aistleitner wrote:
> The still rather empty Incident Report is at
> https://wikitech.wikimedia.org/wiki/Incident_documentation/20141125-EventLog...
> I'll fill it with more information tomorrow.
Done.
> Backfilling from the logs is already running, and should also finish tomorrow.
Done.
The EventLogging-powered dashboards should show good numbers for 2014-11-25 after the next regeneration run (i.e. within 24 hours for most dashboards).
Have fun, Christian
Thanks Christian,
I get a lot of value from these updates and I appreciate the work you put into making sure we know what's going on with EL data.
> A fix for that known issue has been sitting in Gerrit since Sunday, but due to the many meetings and discussions around recent EventLogging issues, it has not yet been reviewed and deployed.
Scott Adams (https://en.wikipedia.org/wiki/Scott_Adams) would be proud (https://en.wikipedia.org/wiki/Dilbert).
-Aaron
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Thanks for the update Christian.
Dan