Hi,
in the week from 2014-10-06 to 2014-10-12, Andrew, Jeff, and I worked on the following items around the Analytics Cluster and Analytics-related Ops:
* ULSFO outage affecting webrequest logs (Bug 71876, Bug 71879)
* Revoked default Push grant for Analytics on gerrit's analytics/* projects
* Wikimetrics showing many requests to internal files
* Counting pageviews for the pages “undefined” / “Undefined” (Bug 66532)
* Counting redirect pageviews for Webstatscollector (Bug 71790)
* Reworking webstatscollector's build system
* Puppetization of MaxMind's Connection Type databases
* Wikihadoop now available on the Analytics Cluster
* Analytics Mini-Hackathon in San Francisco
(details below)
Have fun, Christian
* ULSFO outage affecting webrequest logs (Bug 71876, Bug 71879)
It seems there were connection issues from ULSFO, which caused a minor hiccup in the webrequest logs on both udp2log and kafka [1]. Thanks to kafka's buffering, kafka could nicely bridge the shorter dropouts, and in total only a few minutes of data were lost on kafka, while udp2log was shaky for up to 2 hours.
* Revoked default Push grant for Analytics on gerrit's analytics/* projects
By default, all Analytics members had Push permission on all of gerrit's analytics/* projects. As accidental pushes caused pain again, we have now revoked the default Push grant, and made sure that our bots still have the permissions they need to do their duty.
* Wikimetrics showing many requests to internal files
A fix for the mis-redirection of those monitoring requests has been implemented (but it's not yet deployed).
* Counting pageviews for the pages “undefined” / “Undefined” (Bug 66532)
A brief increase in requests for the pages “undefined” and “Undefined” impacted pageview trend graphs. So after the initial push-back that bug 66532 received, it was picked up again, and we prepared patches for both the C and Hive implementations of webstatscollector's pageview definition so that such requests are no longer counted. Deployment of those patches is likely to happen around 2014-10-15.
* Counting redirect pageviews for Webstatscollector (Bug 71790)
The webstatscollector pageview definition has always counted redirects, and has hence been overcounting. Since we're about to deploy webstatscollector anyway, we prepared changes to fix this longstanding miscounting.
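To illustrate the two pageview-definition changes above (not counting “undefined” / “Undefined”, and not counting redirects), here is a minimal Python sketch. It is not the actual C or Hive patch: the Request record, its field names, and the assumption that redirects are recognized by their HTTP status code are all hypothetical and only serve the illustration.

```python
from typing import Iterable, NamedTuple


class Request(NamedTuple):
    """Hypothetical, simplified view of one webrequest log line."""
    title: str        # page title extracted from the URL (assumed field)
    http_status: int  # HTTP status code of the response (assumed field)


# Titles that should no longer be counted (Bug 66532).
EXCLUDED_TITLES = {"undefined", "Undefined"}

# Assumption: a redirect is identified by its HTTP status code.
REDIRECT_STATUSES = {301, 302, 303, 307}


def counts_as_pageview(req: Request) -> bool:
    """Return True if the request should be counted as a pageview."""
    if req.title in EXCLUDED_TITLES:
        return False
    if req.http_status in REDIRECT_STATUSES:
        return False
    return True


def count_pageviews(requests: Iterable[Request]) -> int:
    """Count the requests that pass the pageview filter."""
    return sum(1 for req in requests if counts_as_pageview(req))
```

For example, count_pageviews applied to a 200 response for “Main_Page”, a 200 response for “undefined”, and a 302 response for “Main_Page” would count only the first request.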
* Reworking webstatscollector's build system
Fresh compilations of webstatscollector's C implementation gave executables that segfaulted. So we fixed some NULL dereferences, fixed the build system, made it capable of compiling with optimization turned on, and built a rudimentary test suite for the collector process. As a result, we can again build the collector executable, and can automatically verify that it's working.
* Puppetization of MaxMind's Connection Type databases
MaxMind's Connection Type (NetSpeed) databases have been puppetized. They are available, for example, on stat1002 and stat1003 at
  /usr/share/GeoIP/GeoIPNetSpeedCell.dat
  /usr/share/GeoIP/GeoIPNetSpeed.dat
* Wikihadoop now available on the Analytics Cluster
This allows for easier parsing of Mediawiki xml revision dumps.
* Analytics Mini-Hackathon in San Francisco
During this week, the Analytics Mini-Hackathon took place. More prototyping happened around
** Scoop and Oozification
** Streaming data into HDFS
and some time was spent on hunting down the kafkatee issues.