Hi,
in the week from 2014-09-01–2014-09-07, Andrew, Jeff, and I worked on
the following items around the Analytics Cluster and Analytics-related
Ops:
* Investigating ways to allow queries across MediaWiki and Hadoop databases
* Deployment of webstatscollector's ulsfo https fix
* Re-run reports due to slave lag
* X-Analytics tag for used PHP engine
* Digging deeper into analytics1021 issues
(details below)
Have fun,
Christian
* Investigating ways to allow queries across MediaWiki and Hadoop databases
Currently, data from Hadoop is fully separated from our wikis'
databases, which makes it hard to query across the two different kinds
of databases, and hence makes researchers' lives harder. Of the
available solutions to overcome this issue, Sqoop seems like a suitable
approach. Sqoop can import data from MediaWiki databases into HDFS, so
that it can be queried from within Hadoop. We looked at how Sqoop
imports work, and started discussions with researchers on which
imports would be useful and which would not.
* Deployment of webstatscollector's ulsfo https fix
The fix that stops webstatscollector from counting ulsfo https requests
twice got deployed.
* Re-run reports due to slave lag
The announced schema changes caused more slave lag than some reports
could cope with, so we had to re-run a few reports by hand to make up
for it.
* X-Analytics tag for used PHP engine
Ops added a “php” tag to the X-Analytics header. This tag allows us to
identify which PHP implementation was used to serve a request.
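
For those who want to pick the tag out of a log line, here is a rough
sketch of how it could be read. It assumes the X-Analytics value is a
semicolon-separated list of key=value pairs; the function names and
the sample values ("hhvm", "mf-m=b") are only illustrative and not
taken from the actual deployment.

```python
def parse_x_analytics(header):
    """Split an X-Analytics value into a dict of tag -> value.

    Assumes the format "key=value;key=value"; pairs without "=" are
    skipped.
    """
    tags = {}
    for pair in header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, sep, value = pair.partition("=")
        if sep:
            tags[key.strip()] = value.strip()
    return tags


def php_engine(header):
    """Return the value of the "php" tag, or None if it is absent."""
    return parse_x_analytics(header).get("php")


if __name__ == "__main__":
    # Illustrative header value, not a real log sample.
    print(php_engine("php=hhvm;mf-m=b"))
```

Requests without a "php" tag simply yield None, so they can be
counted separately from the tagged ones.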
* Digging deeper into analytics1021 issues
Despite the recent buffer increases, analytics1021 still fails from
time to time to act as a proper partition leader. Since the failure is
not reproducible manually, debugging is tricky ... and time consuming.
We added some more monitoring, and waited for the issue to re-appear.
It seems that from time to time bursts of disk writes free up lots of
memory on analytics1021. During these write-out phases, the processes
on analytics1021 get starved. If the starvation takes too long,
analytics1021 gets (correctly) kicked out of the partition leader
role. We now need to find the source of those write bursts, to see if
they are the real issue, or just the symptom of a different issue.
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3 Email: christian(a)quelltextlich.at
4293 Gutau, Austria Phone: +43 7946 / 20 5 81
Fax: +43 7946 / 20 5 81
Homepage:
http://quelltextlich.at/
---------------------------------------------------------------