On Sat, Apr 1, 2023 at 11:36 PM Erik Moeller eloquence@gmail.com wrote:
Openly licensed models for machine translation like Facebook's M2M (https://huggingface.co/facebook/m2m100_418M) or text generation like Cerebras-GPT-13B (https://huggingface.co/cerebras/Cerebras-GPT-13B) and GPT-NeoX-20B (https://huggingface.co/EleutherAI/gpt-neox-20b) seem like better targets for running on Wikimedia infrastructure, if there's any merit to be found in running them at this stage.
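For a sense of what running one of these locally involves, here is a minimal sketch using Hugging Face transformers (the model name comes from the M2M-100 link above; the example sentence and language pair are just illustrative assumptions, not anything Wikimedia-specific):

from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Load the openly licensed 418M-parameter M2M-100 translation model.
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

# Translate an English sentence into German; other supported languages
# can be selected via get_lang_id.
tokenizer.src_lang = "en"
encoded = tokenizer("Wikipedia is a free online encyclopedia.", return_tensors="pt")
generated = model.generate(
    **encoded, forced_bos_token_id=tokenizer.get_lang_id("de")
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))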
Note that Facebook's proprietary but widely circulated LLaMA model has triggered a lot of work on dramatically improving the performance of LLMs through more efficient implementations, to the point that you can run a decent-quality LLM (and combine it with OpenAI's openly licensed Whisper speech recognition model) on a consumer-grade laptop:
https://github.com/ggerganov/llama.cpp
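As a rough sketch of that laptop setup (this assumes the community llama-cpp-python bindings around llama.cpp and the openai-whisper package; the model paths, checkpoint sizes and prompt format are placeholders, not a recommendation):

import whisper               # pip install openai-whisper
from llama_cpp import Llama  # pip install llama-cpp-python

# Transcribe a spoken question with one of the smaller Whisper checkpoints.
stt = whisper.load_model("base")
question = stt.transcribe("question.wav")["text"]

# Answer it with a 4-bit quantized LLaMA-style model converted using
# llama.cpp's own tooling.
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")
answer = llm("Q: " + question + "\nA:", max_tokens=128, stop=["Q:"])
print(answer["choices"][0]["text"])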
While I'm not sure if the "hallucination" problem is tractable when all you have is an LLM, I am confident (based on, e.g., the recent results with Alpaca: https://crfm.stanford.edu/2023/03/13/alpaca.html) that the performance of smaller models will continue to increase as we find better ways to train, steer, align, modularize and extend them.
Hosting open models like the ones above would be really cool for multiple reasons, the most important being to bring openness back into the training process, alongside the many voices from the movement raising social considerations one would never think of otherwise.
rupert