On Sat, Apr 1, 2023 at 11:36 PM Erik Moeller eloquence@gmail.com wrote:
Openly licensed models for machine translation like Facebook's M2M (https://huggingface.co/facebook/m2m100_418M) or text generation like Cerebras-GPT-13B (https://huggingface.co/cerebras/Cerebras-GPT-13B) and GPT-NeoX-20B (https://huggingface.co/EleutherAI/gpt-neox-20b) seem like better targets for running on Wikimedia infrastructure, if there's any merit to be found in running them at this stage.
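For a sense of what running one of these locally involves, here is a minimal sketch using Hugging Face transformers (the model name comes from the M2M-100 link above; the example sentence and language pair are just illustrative assumptions, not anything Wikimedia-specific):

from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Load the openly licensed 418M-parameter M2M-100 translation model.
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

# Translate an English sentence into German; other supported languages
# can be selected via get_lang_id.
tokenizer.src_lang = "en"
encoded = tokenizer("Wikipedia is a free online encyclopedia.", return_tensors="pt")
generated = model.generate(
    **encoded, forced_bos_token_id=tokenizer.get_lang_id("de")
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))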
Note that Facebook's proprietary but widely circulated LLaMA model has triggered a lot of work on dramatically improving the performance of LLMs through more efficient implementations, to the point that you can run a decent-quality LLM (and combine it with OpenAI's openly licensed Whisper speech recognition model) on a consumer-grade laptop:
https://github.com/ggerganov/llama.cpp
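As a rough sketch of that laptop setup (this assumes the community llama-cpp-python bindings around llama.cpp and the openai-whisper package; the model paths, checkpoint sizes and prompt format are placeholders, not a recommendation):

import whisper               # pip install openai-whisper
from llama_cpp import Llama  # pip install llama-cpp-python

# Transcribe a spoken question with one of the smaller Whisper checkpoints.
stt = whisper.load_model("base")
question = stt.transcribe("question.wav")["text"]

# Answer it with a 4-bit quantized LLaMA-style model converted using
# llama.cpp's own tooling.
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")
answer = llm("Q: " + question + "\nA:", max_tokens=128, stop=["Q:"])
print(answer["choices"][0]["text"])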
While I'm not sure if the "hallucination" problem is tractable when all you have is an LLM, I am confident (based on, e.g., the recent results with Alpaca: https://crfm.stanford.edu/2023/03/13/alpaca.html) that the performance of smaller models will continue to increase as we find better ways to train, steer, align, modularize and extend them.
Hosting open models like the ones above would be really cool for multiple reasons, the most important being to bring openness back into the training process, alongside the many voices from the movement raising social considerations one would never think of otherwise.
rupert