Hey everybody! Things are better now! The cluster has caught up. We’ve also put in place some fancier queuing to ensure that production jobs don’t get bogged down.
ALSO: The webrequest table now has some new fields! client_ip, geocoded_data and record_version. WooT! This data will only be filled in for new partitions. It should be present for everything beginning at 2015-02-26T18:00. Anything before that will not have these fields. Also note that you can no longer use SELECT * on data older than this. This is a technical consequence of the way we import the new data.
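For anyone scripting queries across the cutoff, here is a minimal sketch of how one might list columns explicitly instead of using SELECT *. The cutoff and the three new field names come from the note above. Everything else is an assumption for illustration: the partition columns (year, month, day, hour), the BASE_FIELDS column names, and the helper names has_new_fields and build_query. The cutoff is treated as a naive timestamp in the same timezone as the partition hours.

```python
from datetime import datetime

# Cutoff from the announcement: partitions from this hour onward
# should carry the new fields.
NEW_FIELDS_CUTOFF = datetime(2015, 2, 26, 18, 0)

# New fields named in the announcement.
NEW_FIELDS = ["client_ip", "geocoded_data", "record_version"]

# Hypothetical subset of pre-existing columns, for illustration only.
BASE_FIELDS = ["hostname", "uri_path", "http_status"]


def has_new_fields(partition_hour: datetime) -> bool:
    """Return True if the hourly partition is at or after the cutoff."""
    return partition_hour >= NEW_FIELDS_CUTOFF


def build_query(partition_hour: datetime) -> str:
    """Build a query with an explicit column list (never SELECT *).

    The partition column names used in the WHERE clause are assumed.
    """
    columns = list(BASE_FIELDS)
    if has_new_fields(partition_hour):
        columns += NEW_FIELDS
    return (
        f"SELECT {', '.join(columns)} FROM wmf.webrequest "
        f"WHERE year = {partition_hour.year} AND month = {partition_hour.month} "
        f"AND day = {partition_hour.day} AND hour = {partition_hour.hour}"
    )
```

Calling build_query(datetime(2015, 2, 25, 12)) yields a query that leaves out the three new fields, which is the safe form for older partitions.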
Thanks so much Christian and Joseph!
-Ao
On Feb 26, 2015, at 12:13, Toby Negrin <tnegrin@wikimedia.org> wrote:
Thank you Christian!
On Wed, Feb 25, 2015 at 5:18 PM, Christian Aistleitner <christian@quelltextlich.at> wrote:
Hi,
just a quick heads-up that the Analytics cluster got stuck today: jobs deadlocked, each waiting for other jobs to free resources.
For the time being, to allow the cluster to catch up for the missed hours, I suspended the refining jobs.
This gives the cluster enough resources to catch up on importing the Kafka data that it missed during the day.
But this also means that the datasets pagecounts-all-sites, pagecounts-raw, and legacy_tsvs will fall behind a bit, and wmf.webrequest will not see new data while the cluster is catching up.
Tomorrow, in the European morning when the cluster has caught up, I'll enable refining again, and the datasets should catch up again.
Sorry for the inconvenience,
Christian
P.S.: Suspending refining looks a bit drastic. But if we only killed the resource-hungry jobs without stopping refining, refining would start while Camus is still catching up and produce faulty datasets. Hence, we suspended refining for now. Tomorrow, we'll resume the suspended jobs and have the datasets catch up again.
P.P.S.: If you have resource-hungry jobs for the Analytics cluster, please wait until tomorrow to run them if possible.
--
---- quelltextlich e.U. ---- \ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3
4293 Gutau, Austria
Email: christian@quelltextlich.at
Phone: +43 7946 / 20 5 81
Fax: +43 7946 / 20 5 81
Homepage: http://quelltextlich.at/
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics