Hi everybody,
We have upgraded the AMD ROCm stack for our GPUs to 4.2 (it is not the latest upstream but close to it). There are two main things to know:
- If you are using tensorflow-rocm on stat100[5,8], please upgrade it to version 2.5.0 (now the only supported version; previously it was 2.3.1).
- A new package was added to support the ONNX framework (see https://phabricator.wikimedia.org/T287267).
All details have been added to https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/AMD_GPU as well.
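If you want a quick way to confirm that your environment has the supported tensorflow-rocm version, something like the following may help. It is only a minimal sketch: it assumes Python 3.8 or later and that the package is installed under the distribution name tensorflow-rocm. The helper names (installed_version, needs_upgrade) are made up for illustration and are not part of any existing tooling.

```python
from importlib import metadata

# Version named in this announcement as the only supported one.
SUPPORTED_VERSION = "2.5.0"


def installed_version(package="tensorflow-rocm"):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def needs_upgrade(version):
    """Return True when the package is missing or not at the supported version."""
    return version != SUPPORTED_VERSION


if __name__ == "__main__":
    version = installed_version()
    if needs_upgrade(version):
        print(f"tensorflow-rocm is {version}; expected {SUPPORTED_VERSION}")
    else:
        # Only import TensorFlow once the expected version is confirmed.
        import tensorflow as tf

        print("GPUs visible to TensorFlow:", tf.config.list_physical_devices("GPU"))
```

If the check reports an older version, upgrading inside your own virtual environment (for example with pip) should be enough; the wiki page above remains the reference for the exact procedure.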
Enjoy the new stack, and let me know in the aforementioned task if you encounter any issues or have any questions.
Luca
analytics-announce@lists.wikimedia.org