Hey Diego,
I added a section at the end of the page with the requested info. Let me know if anything is missing :)
Luca
On Tue, Feb 18, 2020 at 17:37, Diego Saez-Trumper <diego@wikimedia.org> wrote:
Thanks for this Luca.
I tend to use stat1007 because I know that machine has a lot of RAM/CPU and HDFS access. For the other statX hosts, I'm not sure which of them has what resources (I know at least one of them doesn't have HDFS access). Is there a table where I can look at a summary of resources per machine?
Thanks again.
On Tue, Feb 18, 2020 at 8:53 AM Luca Toscano <ltoscano@wikimedia.org> wrote:
Hi everybody!
I created the following doc: https://wikitech.wikimedia.org/wiki/Analytics/Tutorials/Analytics_Client_Nod...
It contains two FAQs:
- How do I ensure that there is enough space on disk before storing big datasets/files?
- How do I check the space used by my files/data on stat/notebook hosts?
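As a rough illustration of the two checks (a sketch only, not taken from the linked doc): the Python snippet below uses the standard library to see how much free space a filesystem has and how much space a directory tree takes. The function names and the 10% safety margin are assumptions made for the example, and the sizes are apparent file sizes, which may differ from the on-disk block usage that a tool like `du` reports.

```python
import os
import shutil


def free_bytes(path="."):
    """Free bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


def has_room_for(path, needed_bytes, safety_margin=0.10):
    """True if `needed_bytes` fits while leaving a reserve free.

    `safety_margin` is a hypothetical policy (fraction of the total
    filesystem size to keep free), not an Analytics team rule.
    """
    usage = shutil.disk_usage(path)
    reserve = int(usage.total * safety_margin)
    return usage.free - reserve >= needed_bytes


def dir_size_bytes(root):
    """Sum the apparent sizes of regular files under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            try:
                # Skip symlinks so linked data is not counted here.
                if not os.path.islink(file_path):
                    total += os.lstat(file_path).st_size
            except OSError:
                # File vanished or is unreadable; ignore it.
                pass
    return total
```

For example, `has_room_for(os.path.expanduser("~"), 50 * 1024**3)` would check whether a 50 GiB dataset fits in the filesystem holding your home directory, and `dir_size_bytes(os.path.expanduser("~"))` would report how much your own files already take.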
Please read them and let me know if anything is unclear or missing. We have plenty of space on the stat100X hosts, but for some reason we tend to cluster on single machines like stat1007 and end up fighting for resources.
On a related note, we are going to work on unifying stat/notebook puppet configs in https://phabricator.wikimedia.org/T243934, so eventually all Analytics clients will be exactly the same.
Thanks!
Luca (on behalf of the Analytics team)
Research-Internal mailing list Research-Internal@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/research-internal
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics