Hi Tilman!
On Tue, Aug 8, 2023 at 5:45 AM Tilman Bayer <haebwiki(a)gmail.com> wrote:
Hi Chris,
On Mon, Aug 7, 2023 at 11:51 AM Chris Albon <calbon(a)wikimedia.org> wrote:
Hi Tilman,
Most of the work is still very experimental. We have hosted a few LLMs on
Lift Wing already (StarCoder for example) but they were just running on
CPU, far too slow for real use cases. But it proves that we can easily host
LLMs on Lift Wing. We have been pretty quiet about it while we focus on the
ORES migration, but it is our next big project. More soon hopefully!
Understood. Looking forward to learning more later!
Where we are now is that we have budget for a big GPU purchase (~10-20
GPUs depending on cost), and the question we will try to answer after the
ORES migration is complete is: what GPUs should we purchase? We are trying
to balance our strong preference to stay open source (i.e. AMD ROCm)
against a world dominated by a single closed-source vendor (i.e. Nvidia).
In addition, do we go for a few expensive GPUs better suited to LLMs (A100,
H100, etc.) or a mix of big and small? We will need to figure all this out.
I see. On that matter, what do you folks make of the recent announcements
of AMD's partnerships with Hugging Face and PyTorch[5]? (Which, I
understand, came after the ML team had already launched the aforementioned
new AMD explorations.)
"Open-source AI: AMD looks to Hugging Face and Meta spinoff PyTorch to
take on Nvidia [...]
Both partnerships involve AMD’s ROCm AI software stack, the company’s
answer to Nvidia’s proprietary CUDA platform and application-programming
interface. AMD called ROCm an open and portable AI system with
out-of-the-box support that can port to existing AI models. [...B]oth AMD
and Hugging Face are dedicating engineering resources to each other and
sharing data to ensure that the constantly updated AI models from Hugging
Face, which might not otherwise run well on AMD hardware, would be
“guaranteed” to work on hardware like the MI300X. [...] AMD said PyTorch
will fully upstream the ROCm software stack and “provide immediate ‘day
zero’ support for PyTorch 2.0 with ROCm release 5.4.2 on all AMD Instinct
accelerators,” which is meant to appeal to those customers looking to
switch from Nvidia’s software ecosystem."
In their own announcement, Hugging Face offered further details, including
a pretty impressive list of models to be supported:[6]
"We intend to support state-of-the-art transformer architectures for
natural language processing, computer vision, and speech, such as BERT,
DistilBERT, ROBERTA, Vision Transformer, CLIP, and Wav2Vec2. Of course,
generative AI models will be available too (e.g., GPT2, GPT-NeoX, T5, OPT,
LLaMA), including our own BLOOM and StarCoder models. Lastly, we will also
support more traditional computer vision models, like ResNet and ResNext,
and deep learning recommendation models, a first for us. [..] We'll do our
best to test and validate these models for PyTorch, TensorFlow, and ONNX
Runtime for the above platforms. [...] We will integrate the AMD ROCm SDK
seamlessly in our open-source libraries, starting with the transformers
library."
Do you think this may promise too much, or could it point to a possible
solution of the Foundation's conundrum?
In
https://phabricator.wikimedia.org/T334583 we experimented with LLMs and
AMD GPUs on Lift Wing, and we confirmed the good results that PyTorch
announced. We were able to run bloom-3b, bloom-560m, nllb-200 and falcon-7b
on Lift Wing, running into issues only with the last one, since the GPU's
VRAM was not enough (16GB is low for Falcon-7b). So we can confirm that
AMD ROCm works really well with PyTorch :)
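A rough back-of-the-envelope sketch of why 16GB is tight for Falcon-7b (assuming fp16 weights, i.e. 2 bytes per parameter; the task above doesn't state the precision used): the weights alone of a 7B-parameter model come to about 13 GiB, leaving very little of a 16 GiB card for activations and the KV cache.

```python
def weights_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Rough size of a model's weights in GiB (fp16 -> 2 bytes/param)."""
    return n_params * bytes_per_param / 2**30

# falcon-7b: ~7e9 parameters at fp16 -> ~13 GiB of weights alone.
print(f"falcon-7b:  {weights_gib(7e9):.1f} GiB")
# The smaller models from the experiment fit comfortably by the same estimate.
print(f"bloom-3b:   {weights_gib(3e9):.1f} GiB")
print(f"bloom-560m: {weights_gib(560e6):.1f} GiB")
```

By this estimate only falcon-7b gets close to the 16 GiB ceiling, which matches the results above; quantizing to 8-bit or 4-bit would roughly halve or quarter the figure.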
In any case, this seems to be an interesting moment
where many in AI are
trying to move away from Nvidia's proprietary CUDA platform.
This is my own view, not my team's, so I can't speak for what the WMF
will decide, but I think we should keep going with AMD and avoid Nvidia as
much as possible. Our strong stance against proprietary software should
hold, even if it means more effort and work to advance in the ML field. I
completely get the frustration when common libraries and tools are harder
to run on AMD than on Nvidia, but our communities should (in my opinion)
align on the most open-source solution and contribute (where possible) so
that more and more people adopt it.
Adding proprietary software to the WMF infrastructure and practices is also
technically difficult for various reasons (from Linux kernel maintenance to
Debian package uploads), whereas we already have everything set up and
working for AMD (which integrates nicely with our infrastructure).
Moreover, Debian has recently created a team to maintain AMD ROCm packages
(https://lists.debian.org/debian-ai/), so it will be interesting to see
what direction they take (so far it seems aligned with ours).
Thanks!
Luca