Hi all,
I am helping the charity Volunteer Uganda set up an offline eLearning
computer system with 15 Raspberry Pis and a cheap desktop computer as the
server. Server specs:
- 2TB disk
- 8GB DDR3 ram
- 3 GHz quad-core i5
I am trying to import enwiki-20130403-pages-articles-multistream.xml.bz2
using mwdumper-1.16.jar, but I have a few questions.
1. I was originally using a GUI version of mwdumper-1.16.jar, but that
errored out a few times with duplicate pages, so I switched to the
pre-built one recommended on the MediaWiki page. Looking at the
stats on Wikipedia, I can see that there are roughly 30 million pages,
but this morning I found that mwdumper-1.16.jar has finished (no
errors) with roughly 13.3 million pages. Since there were no errors I
assumed it had finished, yet I appear to be about 17 million pages short?
2. The pages that have imported are missing templates. Is there another
XML file I can import that will add the missing templates? As the
screenshot below shows, the pages are almost unreadable without them.
Many thanks in advance for your help.
Kind regards,
Richard Ive
[attached screenshot: article rendered without templates]
--
Richard Ive • Metafour UK Ltd • 2 Berghem Mews, London W14 0HN •
registered in England: 01528556
tel: +4420 7912 2000 • direct: +4420 7912 2006 • mobile: +447854
569 205 • website:
www.metafour.com