Hi,
in the week of 2014-08-25 to 2014-08-31, Andrew, Jeff, and I worked on
the following items around the Analytics Cluster and Analytics-related
Ops:
* Analytics cluster feeding more logs into logstash
* More buffer for kafka brokers
* Life support for webstatscollector on udp2log
* Webstatscollector and kafka
* Webstatscollector counting https requests from ulsfo twice
(details below)
Have fun,
Christian
* Analytics cluster feeding more logs into logstash
The analytics cluster previously only fed logs from the worker nodes
into logstash, and now also feeds logs from namenodes into logstash.
* More buffer for kafka brokers
During partition leader re-elections, kafka brokers sometimes drop
a few log lines. The kafka broker buffers were too small to cover the
time a re-election might take, so the buffer size was increased. This
could help brokers get through a partition leader re-election without
dropping messages.
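As a rough illustration of the sizing argument: a buffer has to hold
everything that arrives while no leader is available. The sketch below
is only back-of-the-envelope arithmetic. The function names, the
headroom factor, and all numbers are hypothetical and are not taken
from the actual cluster configuration.

```python
import math


def required_buffer_bytes(msgs_per_sec, avg_msg_bytes, election_secs,
                          headroom=1.5):
    """Bytes needed to hold all messages arriving during a re-election.

    All parameters are illustrative assumptions, not measured values.
    `headroom` is a safety factor on top of the bare minimum.
    """
    if msgs_per_sec < 0 or avg_msg_bytes < 0 or election_secs < 0:
        raise ValueError("rates, sizes and durations must be non-negative")
    if headroom < 1:
        raise ValueError("headroom must be at least 1")
    return math.ceil(msgs_per_sec * avg_msg_bytes * election_secs * headroom)


def covers_election(buffer_bytes, msgs_per_sec, avg_msg_bytes, election_secs):
    """True if the buffer can absorb the traffic of one re-election."""
    needed = required_buffer_bytes(msgs_per_sec, avg_msg_bytes,
                                   election_secs, headroom=1.0)
    return buffer_bytes >= needed
```

With made-up numbers (1000 messages per second, 500 bytes each, a
2 second re-election), a 500 kB buffer would be too small, while a
2 MB buffer would suffice.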
* Life support for webstatscollector on udp2log
The production webstatscollector (the software that produces the
hourly pageview files that are used, for example, by
stats.wikimedia.org and stats.grok.se), which consumes from udp2log,
started to produce faulty files. Another service on the host that
runs part of webstatscollector was no longer needed and was greedy
with resources, so it was stopped to free them up. Strangely enough,
those additional resources made webstatscollector misbehave even
more. The disks could no longer handle the load. After moving the
service to writing to a RAM disk, the host could handle the write
load again. This switch not only brought webstatscollector back to
life, but also decreased packet loss on the collector by a bit more
than an order of magnitude.
* Webstatscollector and kafka
Last week we reported that we spun up a webstatscollector instance
that consumes from kafka instead of udp2log, and that the setup caused
some issues at first. We have now monitored the “webstatscollector on
kafka” setup for a week, and it produced the data extremely
reliably. So with this webstatscollector on kafka, we have a good
baseline to compare against when trying to scale up webstatscollector
to Hadoop.
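Comparing against such a baseline could look roughly like the sketch
below. It assumes that the per-page counts of two collectors have
already been loaded into dictionaries keyed by (project, page). The
function name, the tolerance, and the sample data are hypothetical.
This is not the tooling we actually used.

```python
def compare_counts(baseline, candidate, tolerance=0.05):
    """Return the keys whose counts deviate from the baseline.

    `baseline` and `candidate` map (project, page) to a request count.
    A key is flagged when candidate/baseline differs from 1 by more
    than `tolerance`. The result maps each flagged key to
    (baseline_count, candidate_count, ratio).
    """
    flagged = {}
    for key in baseline.keys() | candidate.keys():
        b = baseline.get(key, 0)
        c = candidate.get(key, 0)
        if b == 0 and c == 0:
            continue  # nothing counted on either side
        # A key missing from the baseline has no meaningful ratio.
        ratio = c / b if b else float("inf")
        if abs(ratio - 1.0) > tolerance:
            flagged[key] = (b, c, ratio)
    return flagged


# Hypothetical sample data: one page is counted twice by the candidate.
kafka_counts = {("en", "Main_Page"): 100, ("de", "Hauptseite"): 50}
udp2log_counts = {("en", "Main_Page"): 200, ("de", "Hauptseite"): 51}
suspicious = compare_counts(kafka_counts, udp2log_counts)
```

A ratio close to 2 for a whole class of requests is the kind of
pattern that points to double counting.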
* Webstatscollector counting https requests from ulsfo twice
While working on establishing the “webstatscollector on kafka”
baseline, we discovered that the udp2log webstatscollector
counts https requests from ulsfo twice. The corresponding fix was
merged the same day, but due to “no deploys on Fridays” the deploy
did not happen last week. (It has been deployed since, and the numbers
look good.)
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3 Email: christian(a)quelltextlich.at
4293 Gutau, Austria Phone: +43 7946 / 20 5 81
Fax: +43 7946 / 20 5 81
Homepage: http://quelltextlich.at/
---------------------------------------------------------------