Without wanting to take away from this thread: folks who respond here, could you please also have a look at T365693 ("MediaWiki Dumps XML - Provide attribute to indicate that user is temporary account in exported content") and add your feedback? (Or just reply to this list and I'll make sure it's captured on the task.)

Thanks,
Kosta

On 8. Oct 2024 at 20:07:05, Sascha Brawer via Cloud <cloud@lists.wikimedia.org> wrote:
QRank uses dumps (plus access logs) to compute a ranking signal for Wikidata items.

— Sascha

On Tue, Oct 8, 2024 at 18:57, YiFei Zhu <zhuyifei1999@gmail.com> wrote:
On Tue, Oct 8, 2024 at 8:59 AM Bryan Davis <bd808@wikimedia.org> wrote:
>
> I was asked recently what I knew about the types of tools that use
> data from the https://dumps.wikimedia.org/ project. I had to admit
> that I really didn't know of many tools off the top of my head that
> relied on dumps. Most of the use cases I have heard about are for
> research topics like looking at word frequencies and sentence
> complexity, or machine learning things that consume some or all of the
> wiki corpus.
>
> Do you run a tool that needs data from Dumps to do its job? I would
> love to hear some stories about how this data helps folks advance the
> work of the movement.

YiFeiBot uses dumps to find a list of pages with interlanguage links,
for the interlanguage link removal task. It does this by processing
each page's wikitext through a regex.
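
For readers unfamiliar with this kind of dump processing, here is a minimal sketch of the general idea. It is not YiFeiBot's actual code: the regex, the `LANG_CODES` subset, the function names, and the in-memory `SAMPLE` document are all assumptions made for illustration. It streams a dump-style XML document and reports pages whose wikitext contains `[[xx:Title]]` style interlanguage links.

```python
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical subset of language prefixes; a real run would need the full list.
LANG_CODES = {"de", "fr", "ja", "zh-yue"}

# Matches [[xx:Title]] but not [[:xx:Title]], where the leading colon
# marks an ordinary inline link rather than an interlanguage link.
INTERLANG_RE = re.compile(
    r"\[\[\s*([a-z][a-z-]*)\s*:\s*([^\]\|]+?)\s*\]\]", re.IGNORECASE
)


def find_interlanguage_links(wikitext):
    """Return (language prefix, target title) pairs found in the wikitext."""
    links = []
    for match in INTERLANG_RE.finditer(wikitext):
        prefix = match.group(1).lower()
        if prefix in LANG_CODES:  # skips prefixes such as "Category"
            links.append((prefix, match.group(2)))
    return links


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' becomes 'page'."""
    return tag.rsplit("}", 1)[-1]


def pages_with_interlanguage_links(xml_stream):
    """Yield titles of pages whose text holds at least one interlanguage link."""
    title, found = None, False
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            found = found or bool(find_interlanguage_links(elem.text or ""))
        elif name == "page":
            if found:
                yield title
            title, found = None, False
            elem.clear()  # release the finished page to keep memory use low


# Tiny stand-in for a dump file, invented for this example.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/">
<page><title>Apple</title><revision><text>An apple. [[de:Apfel]] [[fr:Pomme]]</text></revision></page>
<page><title>Pear</title><revision><text>See [[:de:Birne]] and [[Category:Fruit]].</text></revision></page>
</mediawiki>"""

if __name__ == "__main__":
    print(list(pages_with_interlanguage_links(io.BytesIO(SAMPLE))))
```

On a real dump the stream would more likely come from something like `bz2.open(path, "rb")` than from an in-memory sample, and a regex pass like this can still be fooled by links inside comments or `<nowiki>` blocks.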

> Bryan
> --
> Bryan Davis                                        Wikimedia Foundation
> Principal Software Engineer                               Boise, ID USA
> [[m:User:BDavis_(WMF)]]                                      irc: bd808
> _______________________________________________
> Cloud mailing list -- cloud@lists.wikimedia.org
> List information: https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/