Hi, I want to download the English Wikipedia database dump, but the whole dump is 10.1 GB, which is too large for me. I only need part of the database, and any part will do. Is there a smaller dump I can download that is a subset of the full one?
Thanks for your time and help!
Best Wishes!
If you can download the whole file to your PC, you can import just a portion of it by stopping the import after some time. MWDumper reports its progress in increments of 1,000 pages.
If you do not have enough bandwidth to download the whole thing, you can use the Special:Export feature on the English Wikipedia (http://en.wikipedia.org/wiki/Special:Export) and select only the pages you need.
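As a rough illustration of the Special:Export route, here is a minimal Python sketch that posts a list of page titles to the export page and returns the resulting XML. The form fields `pages`, `curonly`, and `action=submit` are the ones documented for Special:Export; the function names, the User-Agent string, and the use of the https form of the URL are assumptions made for this example, not something from the thread.

```python
import urllib.parse
import urllib.request

# Assumption: the https form of the Special:Export URL mentioned above.
EXPORT_URL = "https://en.wikipedia.org/wiki/Special:Export"


def build_export_form(titles, current_only=True):
    """Encode the POST body for Special:Export (one title per line)."""
    fields = {"pages": "\n".join(titles), "action": "submit"}
    if current_only:
        # Ask for the latest revision only, not the full history.
        fields["curonly"] = "1"
    return urllib.parse.urlencode(fields).encode("utf-8")


def fetch_export(titles, current_only=True, timeout=60):
    """Download the export XML for the given titles (performs a network call)."""
    request = urllib.request.Request(
        EXPORT_URL,
        data=build_export_form(titles, current_only),
        # Hypothetical identifier; replace with your own contact details.
        headers={"User-Agent": "dump-subset-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `fetch_export(["Main Page", "Bzip2"])` should return a small XML document in the same format as the dump, so it can be fed to the same import tools. This is only practical for a modest number of pages.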
bilal -- Verily, with hardship comes ease.
OrzzrO wrote:
But the whole database dump is 10.1 GB, which is too large for me. [...] Can I download a smaller database that is a subset of the whole dump?
The "Articles, templates, image descriptions, and primary meta-pages" (pages-articles.xml.bz2) is only about half the size (5.3 GB) of the full dump. Also, you might try just downloading part of the dump file: I'd expect most tools to be able to process a truncated dump, even if they might complain about an unexpected end of file.
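To illustrate the truncated-download idea, here is a minimal Python sketch that streams pages out of a possibly incomplete .bz2 dump and stops after a fixed number of pages. It is not how MWDumper or any other tool from this thread works; the function names and parameters are made up for the example, and it assumes the file is a single bzip2 stream. It decompresses incrementally and never finalizes the XML parser, so a missing end of file does not raise an error; pages in the last incomplete compressed block are simply not returned.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(path, max_pages=1000, chunk_size=1 << 16):
    """Yield (title, text) pairs from a possibly truncated .xml.bz2 dump."""
    decompressor = bz2.BZ2Decompressor()
    parser = ET.XMLPullParser(events=("end",))
    yielded = 0
    with open(path, "rb") as raw:
        while yielded < max_pages:
            chunk = raw.read(chunk_size)
            # Stop at the end of the file, or at the end of the bzip2
            # stream (assumption: the dump is a single stream).
            if not chunk or decompressor.eof:
                break
            parser.feed(decompressor.decompress(chunk))
            for _event, elem in parser.read_events():
                if local_name(elem.tag) != "page":
                    continue
                title, text = None, None
                for child in elem.iter():
                    name = local_name(child.tag)
                    if name == "title":
                        title = child.text
                    elif name == "text":
                        text = child.text
                yield title, text or ""
                elem.clear()  # release the page's subtree
                yielded += 1
                if yielded >= max_pages:
                    return
```

For example, `for title, text in iter_pages("pages-articles.xml.bz2", max_pages=5000): ...` would process the first 5,000 pages of whatever portion of the file has been downloaded.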
OrzzrO wrote:
Can I download a smaller database that is a subset of the whole dump?
I answered you yesterday on mediawiki-l, explaining how to download just the articles of a category. http://lists.wikimedia.org/pipermail/mediawiki-l/2010-January/032983.html
wikitech-l@lists.wikimedia.org