Hi Erik,
No rush. I'm glad to establish communications with another branch of the Wikiverse. If
I'm just browsing arbitrary categories by traffic, I have some code I run that is
similar to the TreeView explorer hosted on the tool server. Mine just has a few extra
features I need for manually merging subcategories. For example, the cuisine categories
aren't as consistently structured as the film categories. That code polls each page
manually from the stats.groke.se API though, which can get pretty slow for large
subcategories (like films). So when I'm exploring trends in media like films, books,
albums, songs, video games, TV series, etc., I have some separate code. It grabs the raw
hourly files for the past 2-3 days, builds a hash table of all of the articles in the
large subcategories I check frequently (using the fast MediaWiki API), and then scans the
hourly files once, filling in the traffic info for each hash table entry. Then I saw your compressed
monthly summary files and figured that would be even faster than downloading 2-3 days of
the hourly files. I'll keep an eye out.
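
A rough sketch of the single-pass scan described above, for illustration only. This is
not the code actually used. It assumes each line of an hourly file has four
space-separated fields (project, page title, view count, bytes) and that the article
titles have already been fetched from the category; the function names and the "en"
default are made up for the example.

```python
import gzip


def scan_hourly_lines(lines, titles, project="en"):
    """Sum view counts for the given titles in a single pass over the lines."""
    # The hash table: one entry per article of interest, starting at zero.
    totals = dict.fromkeys(titles, 0)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines
        proj, title, count, _size = parts
        if proj != project or title not in totals:
            continue  # not a page we are tracking
        try:
            totals[title] += int(count)
        except ValueError:
            continue  # non-numeric count field
    return totals


def scan_hourly_files(paths, titles, project="en"):
    """Run the single-pass scan over several gzip-compressed hourly files."""
    totals = dict.fromkeys(titles, 0)
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            for title, views in scan_hourly_lines(fh, titles, project).items():
                totals[title] += views
    return totals
```

Each file is read once and each line costs one dictionary lookup, so the run time
depends on the size of the hourly files and not on the number of articles tracked.
Titles in the raw files may be percent-encoded, so a real version would probably need to
normalize them before the lookup.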
From: ezachte(a)wikimedia.org
To: analytics(a)lists.wikimedia.org
Date: Thu, 19 Feb 2015 04:12:55 +0100
Subject: Re: [Analytics] Monthly compressed traffic delay
Hi Michael,
Thanks for your offer, I appreciate it. I've been quite busy in recent weeks, but
haven't forgotten about these compressed dumps, and will look into it soon (less than
a week).
Cheers,
Erik
From: analytics-bounces(a)lists.wikimedia.org
[mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Michael Hale
Sent: Wednesday, February 18, 2015 15:24
To: analytics(a)lists.wikimedia.org
Subject: [Analytics] Monthly compressed traffic delay
Hello,
I'm inquiring about the delay for publishing the January compressed Wikistats files
that are maintained by Erik Zachte. I'm guessing those processes are given a low
priority compared to the content backups that need to run. More generally, I'm
interested in finding new ways that I can help out. I'm an ex-Microsoftie who is now
on the fraud analytics team at TD Bank. I've been involved with the Wikimedia group in
Atlanta. I organize the picnic each summer, and helped get the rest of the historic
buildings photographed. I've dabbled in reverting vandalism, and I contribute to
articles when I actually have something to contribute. I don't feel like I've
settled into a contributor role that really fits me yet though.
I enjoy using a variety of the traffic data sets that Wikimedia publishes. It seems the
traffic servers get bogged down sometimes though. Can I help? Should I try to get the
Atlanta group to pool our donations this year for an extra computer?
Thanks,
Michael
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics