On Tue, Oct 8, 2024 at 6:57 PM YiFei Zhu <zhuyifei1999@gmail.com> wrote:
On Tue, Oct 8, 2024 at 8:59 AM Bryan Davis <bd808@wikimedia.org> wrote:
>
> I was asked recently what I knew about the types of tools that use
> data from the https://dumps.wikimedia.org/ project. I had to admit
> that I really didn't know of many tools off the top of my head that
> relied on dumps. Most of the use cases I have heard about are for
> research topics like looking at word frequencies and sentence
> complexity, or machine learning things that consume some or all of the
> wiki corpus.
>
> Do you run a tool that needs data from Dumps to do its job? I would
> love to hear some stories about how this data helps folks advance the
> work of the movement.
YiFeiBot uses dumps to find a list of pages with interlanguage links,
for the interlanguage link removal task. It does this by processing
each page's wikitext through a regex.
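
A minimal sketch of that kind of scan, for illustration only (this is not
YiFeiBot's actual code). It assumes a standard MediaWiki XML export with
<page>, <title> and <text> elements; the function name, the regex, and the
short list of language codes are all hypothetical placeholders.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, deliberately short list of language prefixes; a real run
# would need the full set of interlanguage codes for the target wiki.
LANG_CODES = ("de", "en", "es", "fr", "ja", "zh")

# Matches [[xx:Title]] but not [[:xx:Title]], since a leading colon turns
# the link into an ordinary inline link rather than an interlanguage link.
INTERLANG_RE = re.compile(
    r"\[\[\s*(?:%s)\s*:[^\[\]|]+\]\]" % "|".join(LANG_CODES),
    re.IGNORECASE,
)


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def pages_with_interlanguage_links(xml_source):
    """Yield titles of pages whose wikitext matches INTERLANG_RE.

    xml_source is a filename or a binary file object holding the dump.
    """
    title, found = None, False
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            found = found or bool(elem.text and INTERLANG_RE.search(elem.text))
        elif name == "page":
            if found:
                yield title
            title, found = None, False
            elem.clear()  # release the finished page's subtree
```

For a compressed dump, a file object from bz2.open(path, "rb") can be passed
as xml_source so the file is streamed rather than unpacked first.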
> Bryan
> --
> Bryan Davis Wikimedia Foundation
> Principal Software Engineer Boise, ID USA
> [[m:User:BDavis_(WMF)]] irc: bd808
> _______________________________________________
> Cloud mailing list -- cloud@lists.wikimedia.org
> List information: https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/