On Thu, Mar 30, 2023 at 4:28 AM Erik Moeller <eloquence@gmail.com> wrote:
> If you want to impose _additional restrictions_ on a person for stuff they download from you, that actually requires proactive agreement from the user to those restrictions at the time they download the thing.
> If you don't obtain this agreement, you cannot meaningfully enforce the "license" because the downloader never agreed to it in the first place. Moreover, you'll have to make sure that _everyone else making copies of the file_ also obtains agreement from people getting those copies, or your whole house of cards falls down.
Isn't that exactly how we impose the attribution and share-alike requirements on CC-BY-SA content?
On Thu, Mar 30, 2023 at 4:25 AM Kimmo Virtanen <kimmo.virtanen@wikimedia.fi> wrote:
> "To generate or disseminate information or content, in any context (e.g. posts, articles, tweets, chatbots or other kinds of automated bots) without expressly and intelligibly disclaiming that the text is machine generated"
> This makes it useless in most content-related use cases, as it requires too much extra text to use the results.
I guess that the General Disclaimer could serve to fulfill that requirement.
> About FOSS-compatible LLMs: EleutherAI's GPT-J, NeoX, and Pythia, as well as Cerebras-GPT, are under Apache 2.0. The question is whether these models are good enough to be useful, but the same question applies to BLOOM too.
I have no particular affinity for BLOOM, but I have been able to personally verify that it can handle at least a dozen of the use cases that people have shown GPT-3 and ChatGPT serving on enwiki. I promote it for the strictly utilitarian purpose of providing infrastructure to work on the problems that pose the greatest risk to project content if left unaddressed.
I would prefer a more widely multilingual model trained on all of the Foundation content suitable for that purpose, but training such a model is a much more expensive proposition than merely using one.
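
For anyone who wants to judge the capability question for themselves, here is a minimal smoke-test sketch using the Hugging Face transformers library (my assumption; the model IDs are the publicly published Pythia and BLOOM checkpoints, and the prompt is just a placeholder):

    # Quick capability check: run the same prompt through an
    # Apache-2.0 model (Pythia) and a small BLOOM checkpoint.
    # Assumes: pip install transformers torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    PROMPT = "Summarize in one sentence: The quick brown fox jumps over the lazy dog."

    for model_id in ("EleutherAI/pythia-1.4b", "bigscience/bloom-560m"):
        tokenizer = AutoTokenizer.from_pretrained(model_id)
        model = AutoModelForCausalLM.from_pretrained(model_id)
        inputs = tokenizer(PROMPT, return_tensors="pt")
        output = model.generate(**inputs, max_new_tokens=60, do_sample=False)
        print(model_id, "->", tokenizer.decode(output[0], skip_special_tokens=True))

Swapping in the larger checkpoints is just a matter of changing the model IDs, memory permitting.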
-LW