Phew, ok, things did go wrong! We ran into a couple of bugs recently introduced in Yarn and in Hive, and it took us a while to find workarounds. Jobs are flowing through the cluster again. However, they are lagging behind, since they haven’t been able to run all day. They should eventually catch up. For now, the cluster is back open for business, but I’d appreciate it if no one ran any heavy jobs until tomorrow.
Also, it is still possible we may run into other issues we haven’t yet seen, so I can’t guarantee that I won’t have to restart things again.
Anyway, aside from those hiccups, CDH 5.4.0 is now installed, and Hive 1.1 and Spark 1.3.0 are now available, weeeeee!
-Ao
On May 4, 2015, at 11:05, Andrew Otto aotto@wikimedia.org wrote:
Hi all, as a reminder, I will be doing this upgrade today. Within the next hour I will turn off the Hadoop cluster. Please do not attempt to use it until I notify you.
Thanks! -AO
On Apr 29, 2015, at 14:57, Robert West west@cs.stanford.edu wrote:
All good!
On Wed, Apr 29, 2015 at 11:35 AM, Aaron Halfaker ahalfaker@wikimedia.org wrote:
- the right research list (Andrew, remove wmfresearch@ from your contact list :P )
All looks good to me. Thanks. :)
On Wed, Apr 29, 2015 at 1:11 PM, Leila Zia leila@wikimedia.org wrote:
FYI
Ashwin, Bob, Ellery, I don't anticipate this having a negative impact on our workflow. If you see possible issues, please communicate with Andrew (cc-ing me), or let me know and I will communicate with him. Thanks!
---------- Forwarded message ----------
From: Andrew Otto aotto@wikimedia.org
Date: Wed, Apr 29, 2015 at 11:05 AM
Subject: [wmfresearch] Hadoop Cluster Downtime
To: Operations Engineers ops@lists.wikimedia.org, "A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics." analytics@lists.wikimedia.org, "wmfresearch@lists.wikimedia.org Research" wmfresearch@lists.wikimedia.org
Hi all!
CDH 5.4 is out[1] and we’d like to upgrade. We are doing this now, rather than later, because an important Parquet/Hive related bug has been fixed in this version[2]. This upgrade will include Spark 1.3, which should make at least one researcher happy.
To do this upgrade, I need to schedule some downtime for Hadoop. I’d like to do this on Monday, May 4th. I expect the upgrade to take me no more than an hour or two, but just to be safe I’d like to schedule the downtime for the whole day.
If anyone has critical things that they absolutely have to run on Monday, let me know now and I will find another day.
Thanks! -Ao
[1] http://blog.cloudera.com/blog/2015/04/cloudera-enterprise-5-4-is-released/
[2] https://issues.apache.org/jira/browse/HIVE-9482
wmfresearch mailing list wmfresearch@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wmfresearch
Research-Internal mailing list Research-Internal@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/research-internal
-- Up for a little language game? -- http://www.unfun.me
Ops mailing list Ops@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ops