On Mon, Jan 10, 2022 at 4:23 PM Roy Smith <roy(a)panix.com> wrote:
I'm starting on a project which will process every edit comment in the enwiki dumps
using k8s jobs on toolforge. There are 736
enwiki-20211201-pages-meta-history*.xml*bz2 files. Would kicking off 736 jobs (one per
file) be a reasonable thing to do from the resource consumption point of view, or would
that lead to a WTF response from the sysadmins?
The Toolforge Kubernetes cluster tries to protect itself from runaway
resource consumption via quota limits. You can see the quotas for your
tool by running `kubectl describe quota` from a bastion. The default
concurrent jobs quota is 15, but you are probably going to run out of
other quota-limited resources (CPU, memory, pods) long before you get
to the point of 15 concurrent jobs.
To your specific question: yes, if you managed to spawn 736
simultaneous dump-processing jobs it would almost certainly lead to
resource starvation across the entire Toolforge Kubernetes
cluster and make many SREs sad.
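A safer pattern is to keep all 736 files in a work list and only have a
bounded number of jobs in flight at once, well under the quota. A minimal
sketch, assuming you drive submission from a script; `submit_job` and
`wait_for_jobs` here are hypothetical placeholders for whatever submission
mechanism the tool actually uses (e.g. the Toolforge jobs framework or
`kubectl create job`):

```python
# Sketch: process the 736 dump files in bounded batches instead of
# launching one job per file all at once.

def batches(items, size):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def process_dumps(files, concurrency=10):
    """Submit jobs in groups of `concurrency`, waiting for each
    group to finish before starting the next."""
    for batch in batches(files, concurrency):
        jobs = [submit_job(f) for f in batch]  # hypothetical submit call
        wait_for_jobs(jobs)                    # hypothetical wait call
```

With `concurrency=10` the 736 files run as 73 full batches plus one
partial batch of 6, and the cluster never sees more than 10 of your
jobs at a time.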
Bryan
--
Bryan Davis Technical Engagement Wikimedia Foundation
Principal Software Engineer Boise, ID USA
[[m:User:BDavis_(WMF)]] irc: bd808