Hi all,
I am helping the charity Volunteer Uganda set up an offline eLearning system with 15 Raspberry Pis and a cheap desktop computer as the server. Server specs:
- 2 TB disk
- 8 GB DDR3 RAM
- 3 GHz quad-core i5
I am trying to import enwiki-20130403-pages-articles-multistream.xml.bz2 using mwdumper-1.16.jar, but I have a few questions.
1. I was originally using a GUI version of mwdumper-1.16.jar, but it errored out a few times with duplicate-page errors, so I switched to the pre-built jar recommended on the MediaWiki page. The statistics on Wikipedia show roughly 30 million pages, but this morning mwdumper-1.16.jar finished (with no errors) after importing roughly 13.3 million pages. Since there were no errors I assumed it had completed, yet I appear to be about 17 million pages short. Am I missing something? (I have included, after this list, a rough script I am thinking of using to count the pages in the dump itself.)

2. The pages that have imported are missing templates. Is there another XML file I can import that will add the missing templates? As the screenshot below shows, the articles are almost unreadable without them.
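In case it is relevant, here is the rough Python sketch (my own, untested; the dump path is just assumed to be in the current directory) that I plan to run to count the <page> elements per namespace in the dump, so I can compare against what mwdumper reported:

import bz2
import xml.etree.ElementTree as ET
from collections import Counter

DUMP = "enwiki-20130403-pages-articles-multistream.xml.bz2"  # assumed local path

def localname(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]

counts = Counter()
with bz2.open(DUMP, "rb") as f:
    context = ET.iterparse(f, events=("start", "end"))
    _, root = next(context)                    # the <mediawiki> root element
    for event, elem in context:
        if event == "end" and localname(elem.tag) == "page":
            ns = "?"                           # MediaWiki namespace number of this page
            for child in elem:                 # <ns> should be present in recent dump schemas
                if localname(child.tag) == "ns":
                    ns = child.text
                    break
            counts[ns] += 1
            root.clear()                       # drop finished pages so memory stays flat

total = sum(counts.values())
print(f"total <page> elements in dump: {total}")
for ns, n in counts.most_common():
    print(f"  namespace {ns}: {n}")

If I understand the MediaWiki namespaces correctly, namespace 0 is articles and namespace 10 is templates, so this should also tell me whether the templates are even in the dump file I imported.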
Many thanks in advance for your help.
Kind regards, Richard Ive
[Screenshot: an imported article as rendered without its templates]