Hi all,

We attempted a Kafka upgrade last week[1] and again this week[2], and on both occasions we had incidents of webrequest data loss.  We are still resolving these, and still nailing down an estimate of how much data was lost and when.

One thing we do know: any webrequest_text related data since about 2016-08-11T16:00 is missing at least about 8% of its data.  Camus is busy reimporting the missing data from Kafka, starting from that time, and any jobs that have run since then will be rerun.  This includes pageview_hourly and any other webrequest related jobs.

Once we know more, we will document which data is actually gone.  We will also let you know when the refined webrequest data after 2016-08-11T16:00 is ready for use.

Really sorry for this inconvenience.  We are scrambling to get everything back in order.

-Andrew + Analytics Engineering Team

[1] https://wikitech.wikimedia.org/wiki/Incident_documentation/20150803-Kafka
[2] https://wikitech.wikimedia.org/wiki/Incident_documentation/20150810-Kafka