Since proposals which don't fit into existing discussions elsewhere are on
topic here, I want to boldly recommend the following while the annual
planning process is still ongoing, because it's far beyond the scope of
what could be accomplished at a hackathon or on WMCS in a responsible
fashion:
First, the Foundation should host a fork of BLOOM [
https://huggingface.co/bigscience/bloom ], which if I remember correctly
was described by the Foundation's Machine Learning Director Chris Albon as
the only LLM at the scale of GPT-3 adhering to the movement's FOSS
criteria. This should be done under or alongside Toolforge on Wikimedia
Cloud Services so that staff and volunteers alike may use its API and
submit modification proposals for new instances. Presumably this would cost
on the order of $100,000 per year per instance, according to
https://huggingface.co/bigscience/bloom/discussions/161#63a33373b5fc9ab9f63…
but someone should double-check that math. I've tested BLOOM against a
dozen of the uses shown around enwiki for GPT-3 and ChatGPT, and it seems
to perform about as well. (You can use the Hosted Inference API version on
Azure for free at the Huggingface URL.)
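As a starting point for double-checking that math, here is a minimal sketch that converts the $100,000 per year figure into the hourly rate it implies for one always-on instance, so it can be compared against whatever hourly hosting price the linked discussion quotes. The function names are hypothetical, and the only assumptions are a 365-day year and continuous uptime; it does not confirm the estimate itself.

```python
# Sanity-check sketch: what hourly price does an annual budget imply?
# Assumes a 365-day year and an instance that runs around the clock.
HOURS_PER_YEAR = 365 * 24  # 8760 hours


def implied_hourly_rate(annual_cost_usd: float) -> float:
    """Hourly cost implied by an annual budget for one always-on instance."""
    return annual_cost_usd / HOURS_PER_YEAR


def annual_cost(hourly_rate_usd: float, instances: int = 1) -> float:
    """Annual cost of running `instances` always-on instances at an hourly rate."""
    return hourly_rate_usd * HOURS_PER_YEAR * instances


if __name__ == "__main__":
    rate = implied_hourly_rate(100_000)
    print(f"Implied rate: ${rate:.2f} per hour per instance")
```

If the hourly price in the linked discussion is well above or below that implied rate, the annual figure should be revised accordingly.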
Secondly, the Foundation should sponsor staff-, grant-, affiliate-, and
volunteer-run projects to replicate and extend the work on:
A. RARR [ https://arxiv.org/abs/2210.08726 ] and other methods of
attribution and verification with goals aspiring to Wikipedia's standards
of summarizing and citing sources in ways that can be independently
verified.
B. ROME [ https://rome.baulab.info/ ] / MEMIT [ https://memit.baulab.info/ ]
and other approaches to knowledge editing in language models with the goal
of producing simple interfaces to provide "language models that anyone can
edit" and ideally coupled to Wikidata updates.
C. EditEval [ https://eval.ai/web/challenges/challenge-page/1866/overview ],
an ongoing challenge competition to produce systems capable of
automatically improving text, including its fluency, simplification,
paraphrasing, neutralization, and updating information.
I apologize to those on Thursday's Zoom call who had proposals for ORES
expansion to combat paid advocacy, images, audio, speech and video, as I
don't remember enough of the details and there's not enough information at
https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/…
to include them here. I hope the advocates will elucidate those proposals
on list while the annual planning process is still in progress.
-LW
On Mon, Mar 27, 2023 at 1:04 PM Yael Weissburg <rweissburg(a)wikimedia.org>
wrote:
Hello again everyone,
Thanks again to those who made it to the call last week - it felt like
such a luxury to be able to drop deeply into this subject for an hour
(plus) with all of you.
For those who were unable to join, we captured extensive notes on Meta
<https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/External_Trends/Community_call_notes>.
I hope we continue the vibrant discussion we started together on the Talk
Page. Maybe someone can use that space to volunteer to host the next call?
I know many folks are eager to continue the live discussion too.
I also wanted to share a few links / resources that might be useful (I'll
add these to the Talk page as well):
- WMF's Legal team recently did a copyright analysis of ChatGPT. You
can find that on Meta
<https://meta.wikimedia.org/wiki/Wikilegal/Copyright_Analysis_of_ChatGPT>
.
- There is a proposed session on ChatGPT / generative AI for the
Wikimedia Hackathon in May. You can find that on Phabricator
<https://phabricator.wikimedia.org/T333127>.
Finally, a huge thank you to @Maryana Pinchuk <mpinchuk(a)wikimedia.org> who
took the extensive and detailed notes on the call and also did a lot of
"wrangling" behind the scenes to help draft the External Trends in the
first place and get us to a point where we could have this discussion.
Thank you, Maryana!
Feel free to reach out anytime to connect about this or other topics. I'll
be in Belgrade for the EduWiki conference in May and Singapore for
Wikimania - if you're coming to either of those events or in the area, let
me know - I'd love to meet in person!
Best,
Yael
*Yael Weissburg* (she/her)
VP, Partnerships, Programs & Grantmaking
Wikimedia Foundation <https://wikimediafoundation.org/>
M: (+1) 415.513.6643
I work from San Francisco. My time zone is UTC -7/-8.
On Fri, Mar 24, 2023 at 2:02 AM Paulo Santos Perneta <
paulosperneta(a)gmail.com> wrote:
Yes, please, make this a regular event, at least for the time being.
These discussions are incredibly useful, given the speed at which
developments are happening in this area, and the complexity of the
challenges we are facing because of them.
And thanks a lot for organizing the meeting yesterday!
Paulo
Samuel Klein <meta.sj(a)gmail.com> wrote on Thursday, 23/03/2023 at 21:11:
The Bau lab (which produced ROME) is great; see their update, MEMIT
[ https://memit.baulab.info ], which scales that approach.
On Thu, Mar 23, 2023 at 3:43 PM Lauren Worden <laurenworden89(a)gmail.com>
wrote:
On Thu, Mar 23, 2023 at 12:20 PM Samuel Klein <meta.sj(a)gmail.com> wrote:
> Thanks Yael and all for hosting this! A great conversation which we
> should revisit regularly.
>
Yes, I hope that this can be a regularly occurring (monthly?) event,
given how quickly the field is currently advancing.
I want to reiterate some links which I feel may be of considerable help
to those trying to understand our situation:
RARR: https://arxiv.org/abs/2210.08726
ROME: https://rome.baulab.info/
ROME:
-LW
_______________________________________________
Wikimedia-l mailing list -- wikimedia-l(a)lists.wikimedia.org,
guidelines at:
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
and
https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org…
To unsubscribe send an email to wikimedia-l-leave(a)lists.wikimedia.org
--
Samuel Klein @metasj w:user:sj +1 617 529 4266