On Thu, Aug 3, 2023 at 7:16 AM Chris Albon <calbon(a)wikimedia.org> wrote:
Hi everybody,
TL;DR We would like users of ORES models to migrate to our new open source
ML infrastructure, Lift Wing, within the next five months. We are available
to help you do that, from advice to making code commits. It is important to
note: All ML models currently accessible on ORES are also currently
accessible on Lift Wing.
As part of the Machine Learning Modernization Project (
https://www.mediawiki.org/wiki/Machine_Learning/Modernization), the
Machine Learning team has deployed Wikimedia’s new machine learning
inference infrastructure, called Lift Wing (
https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing). Lift Wing
brings many new features, such as support for GPU-based models, open
source LLM hosting, auto-scaling, improved stability, and the ability to
host a larger number of models.
This sounds quite exciting! What's the best place to read up on that
planned support for GPU-based models and open source LLMs? (I also saw in
the recent NYT article[1] that the team is "in the process of adapting A.I.
models that are 'off the shelf' — essentially models that have been made
available by researchers for anyone to freely customize — so that
Wikipedia’s editors can use them for their work.")
I'm aware of the history[2] of not being able to use NVIDIA GPUs due to
their CUDA drivers being proprietary. It was mentioned recently in the
Wikimedia AI Telegram group that this is still a serious limitation,
despite some new explorations with AMD GPUs[3] - to the point that e.g. the
WMF's Language team has resorted to using models without GPU support (CPU
only).[4]
It sounds like there is reasonable hope that this situation could change
fairly soon? Would it also mean both at the same time, i.e. open source
LLMs running with GPU support (considering that at least some
well-known ones appear to require torch.cuda.is_available() == True for
that)?
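For reference, the check mentioned above usually appears in model code as a device selection step. The following is a minimal sketch, not taken from any particular model; the helper name pick_device is hypothetical. PyTorch's ROCm builds are documented as reusing the torch.cuda interface, so the same check may also pass on AMD GPUs, though this sketch does not verify that for any given model.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so there is nothing to query
    import torch

    # On ROCm builds this same call is expected to report AMD GPUs as well.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

Code that falls back to "cpu" this way keeps running on hosts without GPU support, only more slowly.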
Regards, Tilman
[1]
https://www.nytimes.com/2023/07/18/magazine/wikipedia-ai-chatgpt.html
[2]
https://techblog.wikimedia.org/2020/04/06/saying-no-to-proprietary-code-in-…
[3]
https://phabricator.wikimedia.org/T334583 etc.
[4]
https://diff.wikimedia.org/2023/06/13/mint-supporting-underserved-languages…
or
https://thottingal.in/blog/2023/07/21/wikiqa/ (experimental but, I
understand, written to be deployable on WMF infrastructure)
With the creation of Lift Wing, the team is turning its attention to
deprecating the current machine learning infrastructure, ORES. ORES served
us really well over the years. It was a successful project, but it predates
radical changes in technology such as Docker, Kubernetes and, more
recently, MLOps. The servers that run ORES are at the end of their planned
lifespan, so to save cost we are going to shut them down in early 2024.
We have outlined a deprecation path on Wikitech (
https://wikitech.wikimedia.org/wiki/ORES). Please read the page if you
maintain a tool or code that uses the ORES endpoint (
https://ores.wikimedia.org/). If you have any doubts or need
assistance in migrating to Lift Wing, feel free to contact the ML team via:
- Email: ml(a)wikimedia.org
- Phabricator: #Machine-Learning-Team tag
- IRC (Libera): #wikimedia-ml
The Machine Learning team is available to help projects migrate, from
offering advice to making code commits. We want to make this as easy as
possible for folks.
High-level timeline:
*By September 30th 2023:* Infrastructure powering the ORES API endpoint
will be migrated from ORES to Lift Wing. For users, the API endpoint will
remain the same, and most users won’t notice any change. Only the
backend services powering the endpoint will change.
Details: We'd like to add a DNS CNAME that points ores.wikimedia.org to
ores-legacy.wikimedia.org, a new endpoint that offers an almost complete
replacement of the ORES API and calls Lift Wing behind the scenes. In an
ideal world we'd migrate all tools to Lift Wing before decommissioning the
infrastructure behind ores.wikimedia.org, but that turned out to be really
challenging, so to avoid disrupting users we chose to implement a
transition layer/API.
To summarize, if you don't have time to migrate to Lift Wing before
September, your code/tool should work just fine on
ores-legacy.wikimedia.org, and you will not have to change a line of your
code thanks to the DNS CNAME.
The ores-legacy endpoint is not a 100% replacement for ORES: we removed
some very old and unused features, so we highly recommend at least testing
the new endpoint for your use case to avoid surprises when we make the
switch. If you find anything weird, please report it to us using the
aforementioned channels.
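One way to follow the testing advice above before the CNAME switch is to take the URLs a tool already requests and point them at ores-legacy.wikimedia.org. The sketch below only rewrites the host and sends nothing. The helper name to_ores_legacy is hypothetical, and the example path and query parameters are illustrative assumptions, not a statement of what ores-legacy supports.

```python
from urllib.parse import urlsplit, urlunsplit

LEGACY_HOST = "ores-legacy.wikimedia.org"  # host named in the announcement


def to_ores_legacy(url: str) -> str:
    """Return the same URL with an ores.wikimedia.org host swapped for ores-legacy."""
    parts = urlsplit(url)
    if parts.hostname != "ores.wikimedia.org":
        return url  # leave unrelated URLs untouched
    return urlunsplit(parts._replace(netloc=LEGACY_HOST))


# Hypothetical request a tool might make today (path and parameters are illustrative).
current = "https://ores.wikimedia.org/v3/scores/enwiki/?models=damaging&revids=12345"
print(to_ores_legacy(current))
```

Comparing the responses from the two hosts for a handful of representative requests should surface any removed feature a tool still depends on.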
*September to January:* We will be reaching out to every user of ORES we
can identify and working with them to make the migration process as easy as
possible.
*By January 2024:* If all goes well, we would like zero traffic on the
ORES API endpoint so we can turn off the ores-legacy API.
If you want more information about Lift Wing, please check
https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing
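As a rough illustration of what a direct Lift Wing call might look like after migrating, here is a minimal sketch that builds a request without sending it. The base URL, the ":predict" route, the "rev_id" payload key, and the model name are assumptions based on our understanding of the public documentation, and the helper name build_liftwing_request is hypothetical. Please verify all of them against the Lift Wing page above before relying on them.

```python
import json
from urllib.request import Request

# Assumed base URL and ":predict" route; verify against the Lift Wing page before use.
LIFTWING_BASE = "https://api.wikimedia.org/service/lw/inference/v1/models"


def build_liftwing_request(model: str, rev_id: int) -> Request:
    """Build (but do not send) a POST request asking a model to score one revision."""
    body = json.dumps({"rev_id": rev_id}).encode("utf-8")
    return Request(
        url=f"{LIFTWING_BASE}/{model}:predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical model name and revision ID, for illustration only.
req = build_liftwing_request("enwiki-damaging", 12345)
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing req to urllib.request.urlopen; a real tool would likely also want a descriptive User-Agent header and error handling.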
Thanks in advance for your patience and help!
Regards,
The Machine Learning Team
_______________________________________________
Wikitech-l mailing list -- wikitech-l(a)lists.wikimedia.org