It appears the monthly compressed traffic file generation has stopped again. It looks like
the daily compressed files stopped generating on May 9th. Can we restart this process? I
haven't looked at the hourly files recently. Perhaps there was another format change
that caused it to crash.
From: ezachte(a)wikimedia.org
To: analytics(a)lists.wikimedia.org
Date: Tue, 24 Feb 2015 23:09:53 +0100
Subject: Re: [Analytics] Monthly compressed traffic delay
Michael, a quick heads-up: So I finally found the time to look into this. Sorry that it
took so long.
https://phabricator.wikimedia.org/T90230
Bug has been analyzed and fixed. The underlying problem is a record in an hourly pageview
dump with empty title. My script now patches such records with title '-no-title-'.
I filed a separate bug for that:
https://phabricator.wikimedia.org/T90629
Daily aggregation has been restarted and successfully processed data for Jan 27. Now it
will take a day or two to catch up.
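[Editorial illustration, not Erik's actual script.] The kind of patch described above
could look roughly like the sketch below. It assumes each hourly record is a
space-separated line of four fields (project, title, view count, bytes), so that an empty
title shows up as two consecutive spaces. The function name and constant are hypothetical.

```python
# Hypothetical sketch: replace an empty title in an hourly pageview record.
# Assumed record layout (space-separated): project title count bytes
PLACEHOLDER = "-no-title-"


def patch_empty_title(line: str) -> str:
    """Return the record with an empty title field replaced by PLACEHOLDER."""
    fields = line.rstrip("\n").split(" ")
    # An empty title leaves an empty string in the second field.
    if len(fields) == 4 and fields[1] == "":
        fields[1] = PLACEHOLDER
    return " ".join(fields)
```

Records whose title is already present pass through unchanged, so the function could be
applied to every line before aggregation.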
Cheers,
Erik
From: Erik Zachte [mailto:ezachte@wikimedia.org]
Sent: Thursday, February 19, 2015 4:13
To: 'A mailing list for the Analytics Team at WMF and everybody who has an interest in
Wikipedia and analytics.'
Subject: RE: [Analytics] Monthly compressed traffic delay
Hi Michael,
Thanks for your offer, I appreciate it. I've been quite busy in recent weeks, but haven't
forgotten about these compressed dumps, and will look into it soon (less than a week).
Cheers,
Erik
From: analytics-bounces(a)lists.wikimedia.org
[mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Michael Hale
Sent: Wednesday, February 18, 2015 15:24
To: analytics(a)lists.wikimedia.org
Subject: [Analytics] Monthly compressed traffic delay
Hello,
I'm inquiring about the delay for publishing the January compressed Wikistats files
that are maintained by Erik Zachte. I'm guessing those processes are given a low
priority compared to the content backups that need to run. More generally, I'm
interested in finding new ways that I can help out. I'm an ex-Microsoftie who is now
on the fraud analytics team at TD Bank. I've been involved with the Wikimedia group in
Atlanta. I organize the picnic each summer, and helped get the rest of the historic
buildings photographed. I've dabbled in reverting vandalism, and I contribute to
articles when I actually have something to contribute. I don't feel like I've
settled into a contributor role that really fits me yet though.
I enjoy using a variety of the traffic data sets that Wikimedia publishes. It seems the
traffic servers get bogged down sometimes though. Can I help? Should I try to get the
Atlanta group to pool our donations this year for an extra computer?
Thanks,
Michael
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics