Hi,
around running jobs on the Analytics cluster, I've sometime seen
people say in IRC: “Let's run this heavy job. I'll keep an eye on it”.
But more often than not, this seems to have meant:
“Let's just run this heavy job and wait. If QChris joins IRC, let's
hope he doesn't ping us about having overloaded the cluster.”
That's not nice^Wscalable ;-)
So just in case someone is vague on how to “keep an eye on it”, I did
a short write-up at:
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Load
which details on detecting how the cluster is doing on a very high
level.
Especially, it allows you to detect if the cluster got stalled, and if
it did, it tells you what to do.
Have fun,
Christian
P.S.: The above URL has diagrams! Click the URL!
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3 Email: christian@quelltextlich.at
4293 Gutau, Austria Phone: +43 7946 / 20 5 81
Fax: +43 7946 / 20 5 81
Homepage: http://quelltextlich.at/
---------------------------------------------------------------
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics