I'm curious what the actual question is. The basic concepts have been studied for about 60 years and have been in use for about 20 to 30 years. One particular detail the industry apparently needs to re-learn every time is how easily such vector spaces encode and reproduce any existing bias, racism, phobia, and so on, and how hard it is to raise awareness of this, let alone do something about it.
That said, Elasticsearch, the technology we currently run on Wikimedia infrastructure in version 7.10.x, is already responding to the current machine learning hype cycle.
https://www.elastic.co/de/blog/introducing-approximate-nearest-neighbor-sear... https://en.wikipedia.org/wiki/Special:Version
We certainly need to update some day, but I think nobody is actively working on this at the moment. However, the topic appears in the currently discussed annual plan. The responsible Search Platform team is also quite active and monitors a good selection of communication channels, including a separate mailing list.
https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/D... https://wikitech.wikimedia.org/wiki/Search_Platform/Contact#Office_Hours
Kind regards
Thiemo