Oh I don't know where to even start:

AI/ML done by me:
General research done by others:
Other projects I remember off the top of my head:
Hope that helps

On Tue, Oct 8, 2024 at 6:06 PM Kimmo Virtanen <kimmo.virtanen@wikimedia.fi> wrote:
From time to time I use dumps to parse data that I cannot get via SQL/API. For example, this summer I fetched the Wikimedia Commons page history to get the list of old categories of images, so that my bot would not re-insert categories that had been removed from a photo at least once. A rough sketch of that kind of scan is below.
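
This is not Kimmo's actual script, just a minimal sketch of the idea using only the standard library: stream a pages-meta-history XML dump and report categories that appeared in some earlier revision of a page but are gone from its latest one. The dump filename, the export schema namespace version, the assumption that revisions appear oldest-first, and the simple `[[Category:...]]` regex (which ignores categories added via templates) are all illustrative assumptions.

    # Sketch: find categories removed at some point from a page's history.
    import bz2
    import re
    import xml.etree.ElementTree as ET

    # Namespace of the MediaWiki XML export; adjust to the dump's schema version.
    NS = "{http://www.mediawiki.org/xml/export-0.11/}"
    CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)", re.IGNORECASE)

    def removed_categories(dump_path):
        """Yield (page_title, category) pairs for categories present in an
        earlier revision but absent from the latest revision of the page."""
        with bz2.open(dump_path, "rb") as fh:
            for _event, elem in ET.iterparse(fh):
                if elem.tag != NS + "page":
                    continue
                title = elem.findtext(NS + "title", default="")
                seen, current = set(), set()
                # Assumes revisions are listed oldest-first, as in history dumps.
                for rev in elem.iter(NS + "revision"):
                    text = rev.findtext(NS + "text", default="") or ""
                    current = {c.strip() for c in CATEGORY_RE.findall(text)}
                    seen |= current
                for cat in seen - current:
                    yield title, cat
                elem.clear()  # keep memory bounded while streaming

    if __name__ == "__main__":
        # Hypothetical dump filename for illustration.
        for title, cat in removed_categories("commonswiki-pages-meta-history.xml.bz2"):
            print(f"{title}\t{cat}")

A bot could then check new category suggestions against this list before editing, skipping any category a human had already removed.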

 Br,
-- Kimmo Virtanen, Zache

On Tue, Oct 8, 2024 at 6:59 PM Bryan Davis <bd808@wikimedia.org> wrote:
I was asked recently what I knew about the types of tools that use
data from the https://dumps.wikimedia.org/ project. I had to admit
that I really didn't know of many tools off the top of my head that
relied on dumps. Most of the use cases I have heard about are for
research topics like looking at word frequencies and sentence
complexity, or machine learning things that consume some or all of the
wiki corpus.

Do you run a tool that needs data from Dumps to do its job? I would
love to hear some stories about how this data helps folks advance the
work of the movement.

Bryan
--
Bryan Davis                                        Wikimedia Foundation
Principal Software Engineer                               Boise, ID USA
[[m:User:BDavis_(WMF)]]                                      irc: bd808
_______________________________________________
Cloud mailing list -- cloud@lists.wikimedia.org
List information: https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


--
Amir (he/him)