Michael, a quick heads-up:
So I finally found the time to look into this.
Sorry that it took so long.
The bug has been analyzed and fixed.
The underlying problem is a record in an hourly pageview dump with an empty
title. My script now patches such records with the title '-no-title-'.
I filed a separate bug for that: https://phabricator.wikimedia.org/T90629
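For illustration only, here is a minimal sketch of the kind of patch described
above. It is not the actual script. It assumes each record is a space-separated
line of the form 'project title count bytes', so that an empty title appears
either as an empty second field or as a missing field. The function name
patch_record and the field layout are assumptions; only the placeholder
'-no-title-' comes from the message.

```python
PLACEHOLDER = "-no-title-"  # placeholder title named in the message


def patch_record(line, placeholder=PLACEHOLDER):
    """Return the record with a placeholder title if the title is empty.

    Assumed layout (hypothetical): 'project title count bytes',
    separated by single spaces.
    """
    fields = line.rstrip("\n").split(" ")
    if len(fields) == 4 and fields[1] == "":
        # Two consecutive spaces: the title field is present but empty.
        fields[1] = placeholder
    elif len(fields) == 3:
        # The title field is missing altogether.
        fields.insert(1, placeholder)
    # Well-formed records fall through unchanged.
    return " ".join(fields)


if __name__ == "__main__":
    for raw in ["en Main_Page 5 1234", "en  5 1234", "en 5 1234"]:
        print(patch_record(raw))
```

Records that already have a title pass through untouched, so a filter like
this could be applied to every line of a dump before aggregation.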
Daily aggregation has been restarted and successfully processed data for Jan
27. Now it will take a day or two to catch up.
From: Erik Zachte [mailto:firstname.lastname@example.org]
Sent: Thursday, February 19, 2015 4:13
To: 'A mailing list for the Analytics Team at WMF and everybody who has an
interest in Wikipedia and analytics.'
Subject: RE: [Analytics] Monthly compressed traffic delay
Thanks for your offer, I appreciate it.
I've been quite busy in recent weeks, but haven't forgotten about these
compressed dumps, and will look into them soon (less than a week).
[mailto:email@example.com] On Behalf Of Michael Hale
Sent: Wednesday, February 18, 2015 15:24
Subject: [Analytics] Monthly compressed traffic delay
I'm inquiring about the delay for publishing the January compressed
Wikistats files that are maintained by Erik Zachte. I'm guessing those
processes are given a low priority compared to the content backups that need
to run. More generally, I'm interested in finding new ways that I can help
out. I'm an ex-Microsoftie who is now on the fraud analytics team at TD
Bank. I've been involved with the Wikimedia group in Atlanta. I organize the
picnic each summer, and helped get the rest of the historic buildings
photographed. I've dabbled in reverting vandalism, and I contribute to
articles when I actually have something to contribute. I don't feel like
I've settled into a contributor role that really fits me yet though.
I enjoy using a variety of the traffic data sets that Wikimedia publishes.
It seems the traffic servers get bogged down sometimes though. Can I help?
Should I try to get the Atlanta group to pool our donations this year for an