Hey all,
So, the data for the Search dashboards (http://searchdata.wmflabs.org/metrics/) comes from a variety of sources, one of which is the daily logs of all Cirrus search requests - about 46GB of data a day. We set up a pipeline on top of these logs to report the "zero" rate - the share of queries that come back with zero results. This was a pretty shaky pipeline because it was an ultra-urgent, need-it-for-a-presentation thing.
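For anyone wondering what the "zero" rate boils down to, here's a rough sketch. It is not the actual pipeline code: the function name and the sample numbers are made up for illustration, and it assumes each logged request can be reduced to a count of results returned.

```python
def zero_rate(result_counts):
    """Return the fraction of queries that came back with zero results."""
    counts = list(result_counts)
    if not counts:
        # No queries at all: report 0.0 rather than dividing by zero.
        return 0.0
    zeros = sum(1 for n in counts if n == 0)
    return zeros / len(counts)


# Hypothetical sample: five queries, two of which returned nothing.
sample = [0, 12, 3, 0, 7]
print(zero_rate(sample))
```

The point is just that anything wrongly counted as a query changes the denominator (and possibly the numerator), so the rate shifts.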
Good news: my prediction that it needed work was accurate. Bad news: my prediction that it needed work was accurate ;).
When Erik and I went through all of the scripts and rewrote them on 15 July, we discovered a lot of maintenance tasks that were being counted as searches. These are now being excluded, but we have to backfill 1.5 months of data. I've chosen to eliminate the old data and then backfill, because it means we avoid having data from multiple, inconsistent software versions, and because it just makes the backfilling task a bit easier.
As a result, the dashboards may look a bit odd over the next couple of days. They have data from 15 July onwards that we're comfortable with, and we're gradually backfilling 1 June to 14 July, starting from 1 June. So at the moment we have 1 June and 15-21 July. Weird. Next it'll be 1-2 June plus 15 July onwards, and so on.
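If it helps to picture the backfill order, here's a minimal sketch of walking the gap oldest-first. Again, this is illustrative rather than the real job: the function name is hypothetical, and the year is an assumption (it isn't stated above and doesn't change the day count).

```python
from datetime import date, timedelta


def backfill_dates(start, end):
    """Yield each day from start to end inclusive, oldest first."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


# Assumed year; the range itself (1 June to 14 July) is the gap being filled.
days = list(backfill_dates(date(2015, 6, 1), date(2015, 7, 14)))
print(days[0], days[-1], len(days))
```

Filling in that order is why the graphs show a growing block of June data on the left and the post-rewrite July data on the right, with a shrinking hole in between.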
So expect the graphs to get steadily less weird until they're back to normal (but more consistent and sane-looking). Until then: yeah, they're gonna look a bit weird.
Thanks,