Hi all,
I am helping the charity Volunteer Uganda set up an offline eLearning computer system with 15 Raspberry Pis and a cheap desktop computer as the server. Server specs:
- 2TB disk
- 8GB DDR3 RAM
- 3GHz quad-core i5
I am trying to import enwiki-20130403-pages-articles-multistream.xml.bz2 using mwdumper-1.16.jar, but I have a few questions.
- I was originally using a GUI version of mwdumper-1.16.jar, but it errored out a few times with duplicate pages, so I switched to the pre-built one recommended on the MediaWiki page. The statistics on Wikipedia show roughly 30 million pages, but this morning I found that mwdumper-1.16.jar had finished (no errors) with roughly 13.3 million pages. As there were no errors I assumed the import was complete, but I appear to be about 17 million pages short?
- The pages that have imported are missing templates. Is there another XML file that I can import which will add the missing templates? As the screenshot below shows, the pages are almost unreadable without them.
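One way to narrow down both questions is to count what the dump file itself contains, broken down by namespace, and compare that with what ended up in the database. The following is only a minimal sketch: it assumes the dump puts each <page> and <ns> tag on its own line (as the standard export format does), that namespace 0 holds articles and namespace 10 holds templates, and the function name count_pages_by_namespace is made up for illustration. If the 30 million figure counts every namespace while this dump leaves out talk and user pages, a lower total may be expected, but that is an assumption to check rather than a confirmed explanation.

```python
import bz2
import re
import sys
from collections import Counter

# Matches the namespace id line inside a <page> element, e.g. "<ns>10</ns>".
NS_RE = re.compile(r"<ns>(-?\d+)</ns>")


def count_pages_by_namespace(dump_path):
    """Stream a .xml.bz2 dump and count <page> entries per namespace id.

    Hypothetical helper for illustration; pages without an <ns> line
    are counted under the key None.
    """
    counts = Counter()
    in_page = False
    # bz2.open reads multistream archives transparently (Python 3.3+),
    # so the file is never fully decompressed to disk or memory.
    with bz2.open(dump_path, "rt", encoding="utf-8", errors="replace") as dump:
        for line in dump:
            stripped = line.strip()
            if stripped == "<page>":
                in_page = True
            elif in_page and stripped.startswith("<ns>"):
                match = NS_RE.search(stripped)
                if match:
                    counts[int(match.group(1))] += 1
                    in_page = False
            elif in_page and stripped == "</page>":
                # Page closed without a namespace line being seen.
                counts[None] += 1
                in_page = False
    return counts


if __name__ == "__main__" and len(sys.argv) == 2:
    totals = count_pages_by_namespace(sys.argv[1])
    for ns, n in sorted(totals.items(), key=lambda kv: (kv[0] is None, kv[0] or 0)):
        print(f"namespace {ns}: {n} pages")
    print(f"total: {sum(totals.values())} pages")
```

Run it with the dump path as the only argument. The per-namespace totals can then be compared with the result of SELECT page_namespace, COUNT(*) FROM page GROUP BY page_namespace; on the imported database. If the totals match, the import is complete for this dump, and a non-zero count for namespace 10 would suggest the template pages are already in the file.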
Many thanks in advance for your help.
Kind regards,
Richard Ive
--
Richard Ive Metafour UK Ltd 2 Berghem Mews, London W14 0HN registered in England: 01528556
tel: +4420 7912 2000 direct: +4420 7912 2006 mobile: +447854 569 205 website: www.metafour.com