Hi all,

Christian -- thanks for following up on this. 

I've created a ticket[1] to track this as a production issue. Kevin -- please triage it in tomorrow's standup. We can own the actual incident report, but we'll need some help from Ori in understanding how to perform the post-mortem.

The current status of EventLogging support is that Ori, the Analytics team, the Operations team and the Platform teams are discussing the handover of EventLogging. The Analytics team will take ownership of EventLogging as soon as we can, but we first need to reach consensus on the details.

I've written up our discussions on this wiki page[2]. Please feel free to add/discuss. We've had some preliminary discussions with Andrew Otto but need to follow up with Rob and Ori.

-Toby

[1] https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/1526
[2] https://www.mediawiki.org/wiki/Analytics/EventLogging


On Thu, Apr 3, 2014 at 6:27 AM, Christian Aistleitner <christian@quelltextlich.at> wrote:
Hi Toby,

and zooooooooom ... there goes another week without us even deciding
whether or not we feel responsible doing the incident documentation
and follow-up work. :-D

I feel somewhat embarrassed that after two weeks, and after the ping
on mailing lists, we still have not managed to tell Greg at least
whether or not we'll work on it.

So, unless you chime in or push back before Monday (2014-04-07), I'll
be bold, treat the lip service we have given around EventLogging as a
commitment, and start working on it that day.

Best regards,
Christian



On Thu, Mar 27, 2014 at 06:58:27PM +0100, Christian Aistleitner wrote:
> Hi Analytics Dev team,
>
> On Thu, Mar 20, 2014 at 01:20:54PM -0700, Greg Grossmeier wrote:
> > <quote name="Ori Livneh" date="2014-03-20" time="03:52:01 -0700">
> > > [ At about 2014-03-18 00:04 UTC, db1047 stopped accepting incoming
> > > connections. At some point during the subsequent hour, MariaDB had either
> > > crashed or been manually restarted. Sean noticed that the database was
> > > choking on some queries from the researchers and notified the wmfresearch
> > > list.
> >
> > Can someone from Analytics own this post-mortem and put it on the wiki:
> > https://wikitech.wikimedia.org/wiki/Incident_documentation
> >
> > Please add specific next steps (with bug#, RT#s, or gerrit urls), even
> > (especially) things you haven't done yet and are just "nice to have".
>
> it's been a week, and I cannot find the post-mortem Greg requested at
> the above URL :-/
>
> Nor did I see a response from our team to Greg's email.
>
> I lost track of our EventLogging responsibilities during the recent
> back and forth. So:
>
> Toby, are we actually grabbing Greg's item or are we pushing back on
> it?
>
> Best regards,
> Christian
>
> P.S.: Toby, if we're grabbing it: I totally lack knowledge about both
> EventLogging and the incident itself. So, be prepared for a doubly
> slow start if I get to work on it.
>
>
>
> --
> ---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
>                            Companies' registry: 360296y in Linz
> Christian Aistleitner
> Gruendbergstrasze 65a        Email:  christian@quelltextlich.at
> 4040 Linz, Austria           Phone:          +43 732 / 26 95 63
>                              Fax:            +43 732 / 26 95 63
>                              Homepage: http://quelltextlich.at/
> ---------------------------------------------------------------
