Hi,
In the week from 2014-12-08 to 2014-12-14, Andrew and I worked on the following items around the Analytics Cluster and Analytics-related Ops:
* Stat1001 behind misc-web
* Compression analysis for making xmldumps available in cluster
* EventLogging replication lag
(details below)
Have fun,
Christian
* Stat1001 behind misc-web
stat1001 (which handles stats.wikimedia.org and datasets.wikimedia.org) got moved behind misc-web. This makes stat1001 use the WMF standard SSL setup and removes certificate issues (like T74805 [1]).
So URLs like
https://datasets.wikimedia.org/public-datasets/
(note the s in https) should finally work without warnings/errors.
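A quick way to check this from a script is sketched below in Python. It is only an illustration: the helper names strict_context and fetch_status are made up for this example, and the check relies on Python's default certificate verification rather than on anything specific to the misc-web setup.

```python
import ssl
import urllib.request


def strict_context():
    """Return an SSL context that verifies certificates and hostnames."""
    # create_default_context() enables CERT_REQUIRED and hostname checking.
    return ssl.create_default_context()


def fetch_status(url, timeout=10):
    """Fetch url over HTTPS and return the HTTP status code.

    A certificate problem surfaces as urllib.error.URLError,
    so a normal return means the TLS handshake was accepted.
    """
    with urllib.request.urlopen(
        url, timeout=timeout, context=strict_context()
    ) as response:
        return response.status
```

Calling fetch_status("https://datasets.wikimedia.org/public-datasets/") should then return an HTTP status code instead of raising a certificate error.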
* Compression analysis for making xmldumps available in cluster
More research around making xmldumps available in the cluster has been done. The numbers can be found at
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/xmldumps#Results
* EventLogging replication lag
EventLogging replication got stuck, but only for some tables. The cause was a combination of EventLogging being liberal in which characters it allows in table names and the replication being very defensive about them. Sean made the blocked replication behave again (thanks!), and replication caught up. Restrictions on table naming have been set up and are still being tuned a bit.
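To illustrate the kind of restriction meant here, a minimal Python sketch of a conservative table-name check follows. The rule itself (ASCII letters, digits and underscores only) and the names SAFE_TABLE_NAME and is_safe_table_name are assumptions made up for this example, not the restrictions that actually got deployed. The only hard fact used is MySQL's 64-character limit on identifiers.

```python
import re

# Hypothetical rule: ASCII letters, digits and underscores only.
# The restrictions actually deployed may differ.
SAFE_TABLE_NAME = re.compile(r"[A-Za-z0-9_]+")

# MySQL limits table identifiers to 64 characters.
MAX_IDENTIFIER_LENGTH = 64


def is_safe_table_name(name):
    """Return True if name passes the conservative (assumed) rule."""
    if len(name) > MAX_IDENTIFIER_LENGTH:
        return False
    # fullmatch rejects the empty string and any character outside the set.
    return SAFE_TABLE_NAME.fullmatch(name) is not None
```

With this rule, a name such as "SomeSchema_123" passes, while names containing spaces, dashes or other punctuation get rejected before they can reach replication.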
[1] https://phabricator.wikimedia.org/T74805