Hi,
in the week from 2014-10-20 to 2014-10-26, Andrew, Jeff, and I worked
on the following items around the Analytics Cluster and
Analytics-related Ops:
* Research on columnar storage in the cluster
* Research on how to count accesses to media files
* Rolling out ACK tuning for varnishkafka
* More work towards getting application id into logstash
(details below)
Have fun,
Christian
* Research on columnar storage in the cluster
Columnar storage engines can help to speed up some of the queries we
are running and plan to run. So we did some more research on Parquet
and Avro, and on how xmldumps imports could benefit from them.
* Research on how to count accesses to media files
We have had many requests to make access counts for media files
public. Since the basic infrastructural ingredients are within reach,
we started to explore what would be feasible towards publishing such
data.
* Rolling out ACK tuning for varnishkafka
As reported for the previous week, the ACK tuning of varnishkafka
turned out to avoid message loss during leader elections. So we are
incrementally deploying the new ACK parameter to the caches, and 3 out
of 4 clusters are using it already. The deployment for the fourth
cluster is still pending.
* More work towards getting application id into logstash
Repackaging jars to inject the log4j configurations allowed us to get
more logs into logstash. We are also starting to extract application
ids from log messages, which will finally make it possible to go to
logstash and filter logs for the applications (like Hive queries) one
is running on the cluster.
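To illustrate the idea of extracting application ids from log
messages, here is a minimal Python sketch. It is not the actual
logstash configuration. It assumes YARN-style ids of the form
application_<cluster timestamp>_<sequence number>; the function names,
the "application_id" field name, and the sample log line are
hypothetical.

```python
import re

# Assumption: ids follow the YARN pattern
# application_<cluster timestamp>_<sequence number>.
APP_ID_RE = re.compile(r"\bapplication_\d+_\d+\b")


def extract_application_id(message):
    """Return the first application id found in a log message, or None."""
    match = APP_ID_RE.search(message)
    return match.group(0) if match else None


def tag_event(message):
    """Build a logstash-like event dict, adding a (hypothetical)
    application_id field when an id is present in the message."""
    event = {"message": message}
    app_id = extract_application_id(message)
    if app_id is not None:
        event["application_id"] = app_id
    return event


if __name__ == "__main__":
    # Hypothetical sample log line, for illustration only.
    line = "Submitted application application_1413929484512_0042"
    print(tag_event(line))
```

With such a field attached to each event, filtering for a single
application's logs becomes an exact match on that field instead of a
free-text search.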
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3 Email: christian(a)quelltextlich.at
4293 Gutau, Austria Phone: +43 7946 / 20 5 81
Fax: +43 7946 / 20 5 81
Homepage: http://quelltextlich.at/
---------------------------------------------------------------