Thanks for the heads-up, Michael.
It prompted me to rewatch your initial demo with spoken commentary.
https://www.youtube.com/watch?v=f3QXwY-XR28
I don't have Mathematica, so I can't run your script, but it certainly seems
fun to play with!
Cheers,
Erik
From: analytics-bounces(a)lists.wikimedia.org
[mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Michael Hale
Sent: Thursday, March 19, 2015 16:53
To: A mailing list for the Analytics Team at WMF and everybody who has an
interest in Wikipedia and analytics.
Subject: Re: [Analytics] Monthly compressed traffic delay
Thanks again for fixing those, Erik. In case you or the others want to see
how much the monthly files improved the performance of my local category
browser, I've linked to a short GIF animation. The old version polled the
stats.grok.se server and would often only get a single page result about
every 3 seconds, so it's a huge speedup.
http://i.stack.imgur.com/9Yjjx.gif
_____
From: hale.michael.jr(a)live.com
To: analytics(a)lists.wikimedia.org
Date: Tue, 24 Feb 2015 17:28:51 -0500
Subject: Re: [Analytics] Monthly compressed traffic delay
Thanks, Erik. I actually noticed the empty title records in the hourly files
recently too. I didn't make the connection that it could have been the
culprit though. To give an example of one type of output I make, here are
the most popular articles for different media types over a 3-day span, as of
yesterday. Your compressed files will definitely open up some new scenarios
though.
https://docs.google.com/spreadsheets/d/19IoFHy-U0JInOzi32_iemTXcEmGudeK-jXUDpp5m0UE/edit?usp=sharing
_____
From: ezachte(a)wikimedia.org
To: analytics(a)lists.wikimedia.org
Date: Tue, 24 Feb 2015 23:09:53 +0100
Subject: Re: [Analytics] Monthly compressed traffic delay
Michael, a quick heads-up:
So I finally found the time to look into this.
Sorry that it took so long.
https://phabricator.wikimedia.org/T90230
Bug has been analyzed and fixed.
The underlying problem is a record in an hourly pageview dump with empty
title. My script now patches such records with title '-no-title-'.
I filed a separate bug for that:
https://phabricator.wikimedia.org/T90629
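
As a rough illustration of that kind of patch (a sketch only, not the actual
Wikistats script): the Python below assumes each hourly record is one line of
four space-separated fields (project, title, view count, bytes), so an empty
title shows up as two consecutive spaces. The function name patch_record and
that field layout are assumptions made for this example.

```python
NO_TITLE = "-no-title-"  # placeholder title named in the message above


def patch_record(line: str) -> str:
    """Return the record with an empty title replaced by the placeholder.

    Assumed layout (hypothetical): project, title, view count, bytes,
    separated by single spaces.
    """
    fields = line.rstrip("\n").split(" ")
    # An empty title leaves an empty string in the second position.
    if len(fields) == 4 and fields[1] == "":
        fields[1] = NO_TITLE
    return " ".join(fields)


if __name__ == "__main__":
    print(patch_record("en  5 1234"))
    print(patch_record("en Main_Page 5 1234"))
```

Records that already carry a title pass through unchanged, so the patch can be
applied to every line of a dump without a separate filtering step.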
Daily aggregation has been restarted and successfully processed data for Jan
27. Now it will take a day or two to catch up.
Cheers,
Erik
From: Erik Zachte [mailto:ezachte@wikimedia.org]
Sent: Thursday, February 19, 2015 4:13
To: 'A mailing list for the Analytics Team at WMF and everybody who has an
interest in Wikipedia and analytics.'
Subject: RE: [Analytics] Monthly compressed traffic delay
Hi Michael,
Thanks for your offer, I appreciate it.
I've been quite busy in recent weeks, but I haven't forgotten about these
compressed dumps, and will look into them soon (within a week).
Cheers,
Erik
From: analytics-bounces(a)lists.wikimedia.org
[mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Michael Hale
Sent: Wednesday, February 18, 2015 15:24
To: analytics(a)lists.wikimedia.org
Subject: [Analytics] Monthly compressed traffic delay
Hello,
I'm inquiring about the delay for publishing the January compressed
Wikistats files that are maintained by Erik Zachte. I'm guessing those
processes are given a low priority compared to the content backups that need
to run. More generally, I'm interested in finding new ways that I can help
out. I'm an ex-Microsoftie who is now on the fraud analytics team at TD
Bank. I've been involved with the Wikimedia group in Atlanta. I organize the
picnic each summer, and helped get the rest of the historic buildings
photographed. I've dabbled in reverting vandalism, and I contribute to
articles when I actually have something to contribute. I don't feel like
I've settled into a contributor role that really fits me yet though.
I enjoy using a variety of the traffic data sets that Wikimedia publishes.
It seems the traffic servers get bogged down sometimes though. Can I help?
Should I try to get the Atlanta group to pool our donations this year for an
extra computer?
Thanks,
Michael
_______________________________________________ Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics