Greetings XML Dump users and contributors!
This is your automatic monthly Dumps FAQ update email. This update
contains figures for the 20190501 full revision history content run.
We are currently dumping 918 projects in total.
---------------------
Stats for kkwiki on date 20190501
Total size of page content dump files for articles, current content only:
1204342649
Total size of page content dump files for all pages, current content only:
1362466727
Total size of page content dump files for all pages, all revisions:
15360357338
---------------------
Stats for enwiki on date 20190501
Total size of page content dump files for articles, current content only:
72069178011
Total size of page content dump files for all pages, current content only:
160946760989
Total size of page content dump files for all pages, all revisions:
19000924441816
---------------------
Sincerely,
Your friendly Wikimedia Dump Info Collector
Hello,
I am doing my Master's thesis in Germany and I want the Wikipedia database
from March 26, 2006.
Could you please tell me where I can find that data?
Best regards,
Muhammad Ali
Hi,
I would be interested to know how many pages are in
enwiki-latest-pages-articles.xml. My own count gives 19.4 million pages.
Can this be, at least roughly, confirmed?
On the internet I only find these numbers:
5,861,178 - I guess these are all namespace 0 pages
47,826,337 - these are all pages in all namespaces
Sigbert
--
https://hu.berlin/sk
https://hu.berlin/mmstat3
Hi,
I've recently taken interest in the Wikipedia data dumps. I'd like to
download a subset of files when they are updated. On the Data dumps
page[1] a monitoring file[2] is mentioned, but the file doesn't contain
any data (except the "wiki" object).
I did some research and found the monitor.py script and some info in the
relevant README [3]. If I've understood it correctly, a server will
periodically run monitor.py, which will create the index.json file.
Is this deployed now? Since the file exists with (very little) content,
I'd guess that monitor.py has been run.
[1] https://meta.wikimedia.org/wiki/Data_dumps#Monitoring_dump_generation
[2] https://dumps.wikimedia.org/index.json
[3]
https://phabricator.wikimedia.org/source/operations-dumps/browse/master/xml…
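In case it helps frame the question, here is a rough sketch of how one might watch that index.json once it carries real data. The URL is the one from [2]; the JSON schema is unknown to me (the file currently holds only the "wiki" object), so this only detects that the document changed at all, not which dump files were updated.

```python
# Hypothetical sketch: poll the dumps index.json and report when it
# changes. The schema of the file is an assumption-free blob here;
# we compare whole documents rather than per-wiki fields.
import json
import time
import urllib.request

INDEX_URL = "https://dumps.wikimedia.org/index.json"

def fetch_index(url=INDEX_URL):
    """Download and parse the monitoring file."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def has_changed(previous, current):
    """True once we have a baseline and the document differs from it."""
    return previous is not None and current != previous

def poll_for_changes(interval=3600):
    """Check hourly; print a notice whenever the index changes."""
    previous = None
    while True:
        current = fetch_index()
        if has_changed(previous, current):
            print("index.json changed; check for new dump files")
        previous = current
        time.sleep(interval)
```

Once the real per-wiki status fields are documented, `has_changed` could be narrowed to compare only the jobs you care about instead of the whole document.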
Regards
Aron Bergman
Greetings XML Dump users and contributors!
This is your automatic monthly Dumps FAQ update email. This update
contains figures for the 20190401 full revision history content run.
We are currently dumping 918 projects in total.
---------------------
Stats for ruwiki on date 20190401
Total size of page content dump files for articles, current content only:
21223127433
Total size of page content dump files for all pages, current content only:
26717658778
Total size of page content dump files for all pages, all revisions:
2662395674167
---------------------
Stats for enwiki on date 20190401
Total size of page content dump files for articles, current content only:
71723968674
Total size of page content dump files for all pages, current content only:
160245260388
Total size of page content dump files for all pages, all revisions:
18880938139465
---------------------
Sincerely,
Your friendly Wikimedia Dump Info Collector