Hi Analytics dev team,
just a heads-up that it has now been a week since pagecounts-all-sites (and pagecounts-raw) failed to generate the 20150409-160000 file [1].
To ease data quality assurance and avoid faulty aggregates, the pageview aggregator scripts that produce the data for dashiki's “Reader / Daily Pageviews” block for a week on missing data (unless they are told that missing data is OK for a given day).
For the above hourly pagecounts-all-sites file, this week of blocking has now passed without action.
Hence, the aggregator scripts will start aggregating again (to some degree), but the undeclared hole for 2015-04-09 in the data will naturally start to bubble up.
If that hour's file cannot be generated, adding this date to the BAD_DATES.csv of the aggregator data repository will unblock the aggregator cron job and make the weekly and monthly aggregates treat 2015-04-09 as a day without data.
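(I'm assuming here that BAD_DATES.csv takes one ISO date per line; please check the file's existing entries for the exact format before committing. The entry would then look like:

```
2015-04-09
```

)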
If that hour's file gets generated, be aware that the aggregator by default only backfills automatically for a week. So from today on, you would need to run the script explicitly to backfill 2015-04-09.
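To illustrate why explicit backfilling becomes necessary, here is a small sketch of the one-week window logic (the function name and the exact window semantics are my assumptions for illustration, not the aggregator's actual code):

```python
from datetime import date, timedelta

def within_auto_backfill_window(day, today, window_days=7):
    """Return True if `day` is recent enough to be covered by the
    aggregator's automatic backfill, assumed here to reach back
    `window_days` days from `today`."""
    return today - timedelta(days=window_days) <= day <= today

# A few days after the incident, 2015-04-09 was still in the window:
print(within_auto_backfill_window(date(2015, 4, 9), date(2015, 4, 15)))  # True

# Once the week of blocking has passed, it no longer is:
print(within_auto_backfill_window(date(2015, 4, 9), date(2015, 4, 17)))  # False
```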
Have fun, Christian
P.S.: Since I guess the question of monitoring will arise ... the missing pagecounts file has alerted people at least twice by email, and the subsequent aggregator blocking has been logged. But you can add yourself to the MAILTO of the aggregator cron at modules/statistics/manifests/aggregator.pp in puppet, if you want an additional notification for that.
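(For those unfamiliar with how MAILTO is set there: a Puppet cron resource typically carries it via the environment attribute. The snippet below is a hypothetical sketch only; the command, user, and schedule are placeholders, and the real manifest in modules/statistics/manifests/aggregator.pp will differ:

```puppet
# Hypothetical sketch, not the actual manifest.
cron { 'aggregator':
    command     => '/usr/local/bin/aggregate-pagecounts',   # placeholder
    user        => 'stats',                                 # placeholder
    hour        => '*',
    minute      => 0,
    environment => 'MAILTO=you@example.org',                # add yourself here
}
```

)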
[1] http://dumps.wikimedia.org/other/pagecounts-all-sites/2015/2015-04/
    http://dumps.wikimedia.org/other/pagecounts-raw/2015/2015-04/