Greetings XML Dump users and contributors!
This is your automatic monthly Dumps FAQ update email. This update
contains figures for the 20190301 full revision history content run.
We are currently dumping 917 projects in total.
---------------------
Stats for eswikivoyage on date 20190301
Total size of page content dump files for articles, current content only:
29272320 bytes
Total size of page content dump files for all pages, current content only:
37816923 bytes
Total size of page content dump files for all pages, all revisions:
1591275166 bytes
---------------------
Stats for enwiki on date 20190301
Total size of page content dump files for articles, current content only:
71290510732 bytes
Total size of page content dump files for all pages, current content only:
159230276934 bytes
Total size of page content dump files for all pages, all revisions:
18730978850438 bytes
---------------------
Sincerely,
Your friendly Wikimedia Dump Info Collector
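The figures in these automated emails are raw byte counts for the compressed dump files. A minimal sketch for converting them into human-readable binary units, using the enwiki 20190301 figures from the stats above:

```python
def human(n_bytes: int) -> str:
    """Convert a raw byte count to a binary-prefixed string (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    size = float(n_bytes)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024

# enwiki figures from the 20190301 run above:
print(human(71290510732))      # articles, current only      -> 66.39 GiB
print(human(159230276934))     # all pages, current only     -> 148.29 GiB
print(human(18730978850438))   # all pages, all revisions    -> 17.04 TiB
```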
Samuel Hoover, 20/03/19 21:24:
> Does that mean Commons is currently culling its content? and that it
> makes most sense to wait for a post 2016 dump until after housecleaning
> is complete?
No, I just mean that it takes time to identify copyright violations and
so on. Most deletions happen for content uploaded in the last few
months, so I generally respected an "embargo" of at least six months
(the Internet Archive items are supposed to be durable).
>
> Even with fiber connection/torrents, downloading will take time. Does
> any organization sell terabyte drives containing the Commons dump? Or
> can one travel to a physical location and connect several terabyte
> drives to quickly copy over?
Your best chance is probably to find some machines connected to the
CENIC/Internet2 network or "nearby" and download the Internet Archive
torrents from there. Hopefully you will get 5-10 MiB/s per item, and if you
download all of them concurrently you should manage in a day or two. Internet
Archive also routinely provides researcher access, but I'm not sure
whether that's for private items only.
Wikimedia Foundation used to provide data feeds for some companies back
in the day. If there is a significant need I suppose they could arrange
for someone to have rsync access or something, but it's not going to
happen overnight.
Federico
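As a back-of-the-envelope check on the rates mentioned above (5-10 MiB/s per item, all items downloaded concurrently), here is a sketch of the wall-clock estimate. The total size and item count below are hypothetical placeholders, not actual Commons figures:

```python
MIB = 1024 * 1024

def transfer_days(total_bytes: int, items: int, mib_per_s_per_item: float) -> float:
    """Wall-clock days to fetch `items` equal-sized items in parallel,
    each downloading at a fixed per-item rate."""
    per_item_bytes = total_bytes / items
    seconds = per_item_bytes / (mib_per_s_per_item * MIB)
    return seconds / 86400  # seconds per day

# Hypothetical example: 60 TiB of media split across 400 Internet Archive
# items, each item fetched concurrently at 5 MiB/s.
total = 60 * 1024**4
print(round(transfer_days(total, items=400, mib_per_s_per_item=5.0), 1))  # -> 0.4
```

With those assumed numbers the whole set finishes in well under a day; a slower per-item rate or fewer concurrent items stretches this toward the "day or two" mentioned above.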
Hello everyone,
Are Wikipedia or Commons image dumps available anywhere? I found this
one below but it stops at 2012.
http://ftpmirror.your.org/pub/wikimedia/imagedumps/tarballs/fulls/20121201/
Hello!
Do you have wiki dumps from 2012 and 2014 available for download? I could
not find a download link on https://dumps.wikimedia.org/. If so, can you
provide a download link? Thank you very much.
Greetings XML Dump users and contributors!
This is your automatic monthly Dumps FAQ update email. This update
contains figures for the 20181101 full revision history content run.
We are currently dumping 916 projects in total.
---------------------
Stats for bnwikivoyage on date 20181101
Total size of page content dump files for articles, current content only:
6587638 bytes
Total size of page content dump files for all pages, current content only:
6825818 bytes
Total size of page content dump files for all pages, all revisions:
99741838 bytes
---------------------
Stats for enwiki on date 20181101
Total size of page content dump files for articles, current content only:
69650333728 bytes
Total size of page content dump files for all pages, current content only:
155512399552 bytes
Total size of page content dump files for all pages, all revisions:
18210409893326 bytes
---------------------
Sincerely,
Your friendly Wikimedia Dump Info Collector
Those of you watching the xml/sql dumps run this month may have noticed
some dump failures today. These were caused by depooling of the database
server for maintenance while the dump hosts were querying it. The jobs in
question should be rerun automatically over the next few days, and I'll be
keeping an eye on things.
Ariel
Greetings XML Dump users and contributors!
Looks like https://wikimedia.bytemark.co.uk/ has not been updated since
2017-11-26. I think somebody should either remove it from the mirror list
or contact Bytemark to notify them.
Best regards,
Mariusz "Nikow" Klinikowski.
Greetings XML Dump users and contributors!
This is your automatic monthly Dumps FAQ update email. This update
contains figures for the 20190201 full revision history content run.
We are currently dumping 917 projects in total.
---------------------
Stats for angwikibooks on date 20190201
Total size of page content dump files for articles, current content only:
1377805 bytes
Total size of page content dump files for all pages, current content only:
1929910 bytes
Total size of page content dump files for all pages, all revisions:
6492585 bytes
---------------------
Stats for enwiki on date 20190201
Total size of page content dump files for articles, current content only:
70922565540 bytes
Total size of page content dump files for all pages, current content only:
158336597712 bytes
Total size of page content dump files for all pages, all revisions:
18607622666061 bytes
---------------------
Sincerely,
Your friendly Wikimedia Dump Info Collector