Hello again,

While trying to retrieve the images for the Hebrew Wikipedia ZIM I'm making, I ran Emmanuel's script *mirrorMediawikiPages.pl*. My command line was this:
*./mirrorMediawikiPages.pl --sourceHost=he.wikipedia.org --destinationHost=localhost --useIncompletePagesAsInput --sourcePath=w*
After working for more than 20 hours, and while still in the stage of populating the @pages array with incomplete pages, it aborted with an "out of memory" error. The machine has 4 GB of physical memory, and the last time I checked -- several hours before the abort -- the script was consuming 3.6 GB.
Is there a way to do this in several large chunks, without specifying each individual page? How do you do it?
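In case it helps frame the question, here is a minimal sketch of the kind of chunked workflow I have in mind: split a list of page titles into fixed-size pieces and run the mirror script once per piece, so each run's memory stays bounded. Note that pages.lst and a --pageList option are assumptions of mine for illustration, not documented options of mirrorMediawikiPages.pl; the script invocation is therefore only echoed here.

```shell
#!/bin/sh
# Sketch: bound memory by processing the page list in fixed-size chunks.
# pages.lst (one page title per line) and the --pageList flag are
# hypothetical; substitute whatever input mechanism the script supports.
printf 'Page_%s\n' $(seq 1 2500) > pages.lst   # stand-in for a real title list
split -l 1000 pages.lst chunk_                 # -> chunk_aa, chunk_ab, chunk_ac
for f in chunk_*; do
  # A real run might look like:
  # ./mirrorMediawikiPages.pl --sourceHost=he.wikipedia.org \
  #     --destinationHost=localhost --sourcePath=w --pageList="$f"
  echo "would mirror $(wc -l < "$f") pages from $f"
done
```

Each chunk run would then hold at most 1000 pages' worth of state in memory instead of the whole wiki's page list at once.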
Thanks in advance,
Asaf Bartov Wikimedia Israel
-- Asaf Bartov asaf@forum2.org