Greetings XML Dump users and contributors!
This is your automatic monthly Dumps FAQ update email. This update
contains figures for the 20240201 full revision history content run.
We are currently dumping 982 projects in total.
---------------------
Stats for itwikisource on date 20240201
Total size of page content dump files for articles, current content only:
1,885,904,585
Total size of page content dump files for all pages, current content only:
1,959,855,466
Total size of page content dump files for all pages, all revisions:
20,508,962,333
---------------------
Stats for enwiki on date 20240201
Total size of page content dump files for articles, current content only:
98,402,833,845
Total size of page content dump files for all pages, current content only:
202,673,661,885
Total size of page content dump files for all pages, all revisions:
28,190,423,319,469
---------------------
Sincerely,
Your friendly Wikimedia Dump Info Collector
Hey data dump friends! Sorry for the pseudo off-topic post but just wanted to let you know that the 2021.01.2 data dump is now located on the South Pole of the Moon, for up to 5 billion years, in digital form, as nickel DVD masters. Landed by my foundation - www.archmission.org - on the Intuitive Machines IM-1 mission. So we now have a pretty good offsite backup in place :)
Nova Spivack
Hello M.Srilasya,
The XML data dumps of all the Wikipedias are free to download and use as
per the licensing discussed here <https://dumps.wikimedia.org/legal.html>.
So you can just download anything you'd like from the website here:
https://dumps.wikimedia.org/backup-index.html.
If you let me know a specific language you're interested in, I can point
you to the exact download link. But since you asked for a smaller download,
let me suggest simplewiki, a smaller English wiki written in
"Simplified English", yet big enough to be interesting for proofs of
concept:
- All pages with complete page edit history (.bz2), 2.9 GB:
  simplewiki-20240201-pages-meta-history.xml.bz2
  <https://dumps.wikimedia.org/simplewiki/20240201/simplewiki-20240201-pages-m…>
- All pages, current versions only, 356.7 MB:
  simplewiki-20240201-pages-meta-current.xml.bz2
  <https://dumps.wikimedia.org/simplewiki/20240201/simplewiki-20240201-pages-m…>
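
Since storage is the concern, here is a minimal Python sketch of one way to
read such a dump without decompressing it to disk first. It assumes the .bz2
file has already been downloaded, and that the dump wraps each article in a
<page> element containing <title> and <revision>/<text>. The helper names
local_name and iter_pages are made up for this illustration and are not part
of any Wikimedia tooling.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace prefix, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(path):
    """Yield (title, text) for each <page> in a bz2-compressed XML dump.

    The file is decompressed and parsed as a stream, so the full XML
    is never written to disk.
    """
    with bz2.open(path, "rb") as stream:
        title, text = None, None
        for _event, elem in ET.iterparse(stream, events=("end",)):
            name = local_name(elem.tag)
            if name == "title":
                title = elem.text
            elif name == "text":
                # An empty <text/> element has no text; treat it as "".
                text = elem.text or ""
            elif name == "page":
                yield title, text
                title, text = None, None
                # Drop the finished page's children to limit memory use.
                elem.clear()
```

A loop such as
`for title, text in iter_pages("simplewiki-20240201-pages-meta-current.xml.bz2"):`
could then pass each page to an indexer. With the pages-meta-history file the
same loop would yield only the last revision text encountered for each page.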
On Thu, Feb 22, 2024 at 1:10 AM 21131A0564 MANCHUKONDA SRILASYA <
21131a0564(a)gvpce.ac.in> wrote:
> Dear xmldatadumps owner,
> I'm a student working on a search engine project for which i
> need the xml data dumps. i do not have excess storage capabilities. so, I
> just need a small xml data dump. so that I can use it for my project.
> I will make sure that I will not misuse the data provided by
> you. please consider my request.
>
> Yours obediently,
> M.Srilasya
>
--
Xabriel J. Collazo Mojica (he/him, pronunciation
<https://commons.wikimedia.org/wiki/File:Xabriel_Collazo_Mojica_-_pronunciat…>)
Sr Software Engineer
Wikimedia Foundation