Hi,
If you don't mind, starting next time please insert commas into those
huge counts. Without commas they are VERY difficult to read.
Thanks!
Sincerely,
Todd Shandelman
Austin, TX
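P.S. For reference, a minimal sketch of the kind of formatting being requested (Python; the helper name is my own, and the figures are copied from the enwiki stats quoted below):

```python
def with_commas(n: int) -> str:
    # Python's format mini-language inserts a comma every three digits.
    return f"{n:,}"

# Byte counts taken from the enwiki section of the digest below.
for label, size in [
    ("articles, current content only", 78326324425),
    ("all pages, current content only", 173926604054),
    ("all pages, all revisions", 21045320844828),
]:
    print(f"{label}: {with_commas(size)}")
    # e.g. "articles, current content only: 78,326,324,425"
```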
On Sun, Aug 2, 2020, 07:01 <xmldatadumps-l-request(a)lists.wikimedia.org>
wrote:
> Send Xmldatadumps-l mailing list submissions to
> xmldatadumps-l(a)lists.wikimedia.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
> or, via email, send a message with subject or body 'help' to
> xmldatadumps-l-request(a)lists.wikimedia.org
>
> You can reach the person managing the list at
> xmldatadumps-l-owner(a)lists.wikimedia.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Xmldatadumps-l digest..."
>
>
> Today's Topics:
>
> 1. XML Dumps FAQ monthly update (noreply.xmldatadumps(a)wikimedia.org)
> 2. List of dumped wikis, discrepancy with Wikidata (Count Count)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 01 Aug 2020 16:07:36 +0000
> From: noreply.xmldatadumps(a)wikimedia.org
> To: xmldatadumps-l(a)lists.wikimedia.org
> Subject: [Xmldatadumps-l] XML Dumps FAQ monthly update
> Message-ID: <20200801160736.AneN_%noreply.xmldatadumps(a)wikimedia.org>
>
>
> Greetings XML Dump users and contributors!
>
> This is your automatic monthly Dumps FAQ update email. This update
> contains figures for the 20200701 full revision history content run.
>
> We are currently dumping 916 projects in total.
>
>
> ---------------------
> Stats for lmowiki on date 20200701
>
> Total size of page content dump files for articles, current content only:
> 151410097
>
> Total size of page content dump files for all pages, current content only:
> 179774126
>
> Total size of page content dump files for all pages, all revisions:
> 3555369968
> ---------------------
> Stats for enwiki on date 20200701
>
> Total size of page content dump files for articles, current content only:
> 78326324425
>
> Total size of page content dump files for all pages, current content only:
> 173926604054
>
> Total size of page content dump files for all pages, all revisions:
> 21045320844828
> ---------------------
>
>
> Sincerely,
>
> Your friendly Wikimedia Dump Info Collector
>
>
>
> ------------------------------
>
> Message: 2
> Date: Sun, 2 Aug 2020 00:04:22 +0200
> From: Count Count <countvoncount123456(a)gmail.com>
> To: xmldatadumps-l(a)lists.wikimedia.org
> Subject: [Xmldatadumps-l] List of dumped wikis, discrepancy with
> Wikidata
> Message-ID:
> <CAOHwkzAk6R+W4Xj673h=p44zxwX+22Pt+Zd3UBg_NbSUUTg+1w(a)mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi!
>
> I am currently working on a dump search and download tool for all
> Wikimedia wikis. To find out which Wikimedia wikis exist, I used Wikidata.
> While comparing the list of wikis from Wikidata with the list of dumped
> projects I found out that the following wikis are currently not being
> dumped:
>
> - alswikibooks (last dump 20180101)
> - alswikiquote (last dump 20180101)
> - alswiktionary (last dump 20180101)
> - ecwikimedia (never dumped; private, but not marked private in Wikidata?)
> - fixcopyrightwiki (last dump 20200220)
> - labswiki (never dumped?)
> - labtestwiki (never dumped?)
> - mowiki (last dump 20180101)
> - mowiktionary (last dump 20180101)
> - ru_sibwiki (last dump 20071011)
> - ukwikiversity (never dumped?)
>
> Is there an up-to-date machine-readable list of currently dumped wikis
> besides https://dumps.wikimedia.org/backup-index.html?
>
> (Off-topic) Spoiler for dump searching tool on my laptop:
> $ target/release/wdgrep "asdfdefased" /c/Users/xyz/wpdumps/dewiki-20200701-pages-articles-multistream.xml -v --ns 0
> Searched 21437.064 MiB in 8.467969 seconds (2531.5474 MiB/s).
>
> Best regards,
>
> Count Count
>