Since proposals that don't fit into existing discussions elsewhere are on topic here, I want to boldly recommend the following while the annual planning process is still ongoing, because it is far beyond the scope of what could responsibly be accomplished at a hackathon or on WMCS:
First, the Foundation should host a fork of BLOOM [ https://huggingface.co/bigscience/bloom ], which, if I remember correctly, the Foundation's Machine Learning Director Chris Albon described as the only LLM at the scale of GPT-3 that adheres to the movement's FOSS criteria. This should be done under or alongside Toolforge on Wikimedia Cloud Services so that staff and volunteers alike may use its API and submit modification proposals for new instances. Presumably this would cost on the order of $100,000 per year per instance, according to https://huggingface.co/bigscience/bloom/discussions/161#63a33373b5fc9ab9f63d... but someone should double-check that math. I've tested BLOOM against a dozen of the uses shown around enwiki for GPT-3 and ChatGPT, and it seems to perform about as well. (You can use the Hosted Inference API version on Azure for free at the Hugging Face URL.)
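As a starting point for double-checking that math, here is a minimal sketch that converts the roughly $100,000 per year figure into the hourly rate it implies for one always-on instance. The function names are hypothetical, and the 24x7 uptime and 365-day year are assumptions of mine, not figures taken from the linked discussion.

```python
# Assumption: one instance runs continuously for a non-leap year.
HOURS_PER_YEAR = 24 * 365


def implied_hourly_rate(annual_cost_usd: float) -> float:
    """Hourly spend implied by an annual budget for one always-on instance."""
    return annual_cost_usd / HOURS_PER_YEAR


def annual_cost(hourly_rate_usd: float, instances: int = 1) -> float:
    """Annual spend for `instances` always-on instances at a given hourly rate."""
    return hourly_rate_usd * HOURS_PER_YEAR * instances


if __name__ == "__main__":
    rate = implied_hourly_rate(100_000)
    print(f"${rate:.2f} per hour")  # prints "$11.42 per hour"
```

Comparing that implied rate against a provider's published price for hardware large enough to hold the model would show whether the estimate is in the right range.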
Secondly, the Foundation should sponsor staff-, grant-, affiliate-, and volunteer-run projects to replicate and extend the work on:
A. RARR [ https://arxiv.org/abs/2210.08726 ] and other methods of attribution and verification with goals aspiring to Wikipedia's standards of summarizing and citing sources in ways that can be independently verified.
B. ROME [ https://rome.baulab.info/ / MEMIT: https://memit.baulab.info/ ] and other approaches to knowledge editing in language models, with the goal of producing simple interfaces that provide "language models that anyone can edit", ideally coupled to Wikidata updates.
C. EditEval [ https://eval.ai/web/challenges/challenge-page/1866/overview ], an ongoing challenge competition to produce systems capable of automatically improving text, including fluency edits, simplification, paraphrasing, neutralization, and information updates.
I apologize to those on Thursday's Zoom call who had proposals for expanding ORES to combat paid advocacy and to cover images, audio, speech, and video; I don't remember enough of the details, and there's not enough information at https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/D... to include them here. I hope the advocates will elucidate those proposals on list while the annual planning process is still in progress.
-LW
On Mon, Mar 27, 2023 at 1:04 PM Yael Weissburg rweissburg@wikimedia.org wrote:
Hello again everyone,
Thanks again to those who made it to the call last week - it felt like such a luxury to be able to drop deeply into this subject for an hour (plus) with all of you.
For those who were unable to join, we captured extensive notes on Meta https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/External_Trends/Community_call_notes. I hope we continue the vibrant discussion we started together on the Talk Page. Maybe someone can use that space to volunteer to host the next call? I know many folks are eager to continue the live discussion too.
I also wanted to share a few links / resources that might be useful (I'll add these to the Talk page as well):
- WMF's Legal team recently did a copyright analysis of ChatGPT. You can find that on Meta https://meta.wikimedia.org/wiki/Wikilegal/Copyright_Analysis_of_ChatGPT.
- There is a proposed session on ChatGPT / generative AI for the Wikimedia Hackathon in May. You can find that on Phabricator https://phabricator.wikimedia.org/T333127.
Finally, a huge thank you to @Maryana Pinchuk mpinchuk@wikimedia.org who took the extensive and detailed notes on the call and also did a lot of "wrangling" behind the scenes to help draft the External Trends in the first place and get us to a point where we could have this discussion. Thank you, Maryana!
Feel free to reach out anytime to connect about this or other topics. I'll be in Belgrade for the EduWiki conference in May and Singapore for Wikimania - if you're coming to either of those events or in the area, let me know - I'd love to meet in person!
Best,
Yael
*Yael Weissburg* (she/her) VP, Partnerships, Programs & Grantmaking Wikimedia Foundation https://wikimediafoundation.org/ M: (+1) 415.513.6643 I work from San Francisco. My time zone is UTC -7/-8.
On Fri, Mar 24, 2023 at 2:02 AM Paulo Santos Perneta < paulosperneta@gmail.com> wrote:
Yes, please, make this a regular event, at least for the time being. These discussions are incredibly useful, given the speed at which developments are happening in this area and the complexity of the challenges we face because of them. And thanks a lot for organizing the meeting yesterday!
Paulo
Samuel Klein meta.sj@gmail.com wrote on Thursday, 23/03/2023 at 21:11:
The Bau lab (that produced ROME) is great; see their follow-up, MEMIT https://memit.baulab.info, which scales that approach.
On Thu, Mar 23, 2023 at 3:43 PM Lauren Worden laurenworden89@gmail.com wrote:
On Thu, Mar 23, 2023 at 12:20 PM Samuel Klein meta.sj@gmail.com wrote:
Thanks Yael and all for hosting this! A great conversation which we should revisit regularly.
Yes, I hope that this can be a regularly occurring (perhaps monthly) event, given the very substantial recent advancements in the field.
I want to reiterate some links which I feel may be of considerable help to those trying to understand our situation:
RARR: https://arxiv.org/abs/2210.08726
ROME: https://rome.baulab.info/
-LW
_______________________________________________
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/...
To unsubscribe send an email to wikimedia-l-leave@lists.wikimedia.org
-- Samuel Klein @metasj w:user:sj +1 617 529 4266