Hi Analytics,

We have found a problem that has been affecting the EventLogging data for the past month. Since March 22, 2015, there have been several gaps, each roughly 1-2 hours long, with no data in any of the schema tables.

You can see the details in the following links:

Phabricator task:
https://phabricator.wikimedia.org/T96082
Incident documentation:
https://wikitech.wikimedia.org/wiki/Incident_documentation/20150409-EventLogging
Technical discussion:
https://lists.wikimedia.org/pipermail/analytics/2015-April/003775.html

The problem persists, although it now occurs less frequently because the sampling added to the Edit schema events has reduced EL throughput.

Next steps:
* Backfill the existing data gaps this week.
* Implement a patch to avoid the problem ASAP.
* Implement a consistent solution to EventLogging scaling problems.
* Backfill any gaps that occur during implementation of the solutions.

Marcel