I don't know if this issue has come up already. If it has and was
dismissed, I beg your pardon. If it hasn't...
I hereby propose that pbzip2 (https://launchpad.net/pbzip2) be used
to compress the XML dumps instead of bzip2. Why? Because its sibling
(pbunzip2) has a bug that bunzip2 doesn't have. :-)
Strange? Read on.
A few hours ago, I filed a bug report for pbzip2 (see
https://bugs.launchpad.net/pbzip2/+bug/922804) together with some test
results obtained a few hours before that.
The results indicate that:
bzip2 and pbzip2 are mutually compatible: each one can read the
archives the other one creates. But when it comes to uncompressing, only
pbzip2-compressed archives are good for pbunzip2.
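To illustrate the compatibility side of this, here is a minimal Python sketch. It is not taken from the bug report and does not reproduce the reported bug. It assumes, as far as I understand pbzip2's design, that pbzip2 compresses blocks of the input as independent bzip2 streams and concatenates them, while bzip2 writes one single stream. The helper names compress_in_blocks and count_streams and the 50,000-byte block size are made up for this example.

```python
import bz2


def compress_in_blocks(data: bytes, block_size: int) -> bytes:
    """Compress each block as an independent bzip2 stream and concatenate them.

    Assumption: this mimics the layout of a pbzip2 archive; the real tool
    uses its own block size and worker threads.
    """
    streams = [
        bz2.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]
    return b"".join(streams)


def count_streams(blob: bytes) -> int:
    """Count concatenated bzip2 streams by decoding them one after another."""
    count = 0
    while blob:
        decompressor = bz2.BZ2Decompressor()
        decompressor.decompress(blob)      # stops at the end of one stream
        blob = decompressor.unused_data    # bytes belonging to later streams
        count += 1
    return count


sample = b"wiki dump line\n" * 10_000          # 150,000 bytes of dummy data

single_stream = bz2.compress(sample)           # bzip2-style: one stream
multi_stream = compress_in_blocks(sample, 50_000)  # pbzip2-style: several

# A standard multi-stream-aware decoder reads both layouts identically.
restored_single = bz2.decompress(single_stream)
restored_multi = bz2.decompress(multi_stream)
```

Under that assumption, a regular decoder sees no difference between the two archives, while only the multi-stream one offers independent pieces that could be handed to separate CPUs.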
I propose compressing the archives with pbzip2 for the following
reasons:
1) If your archiving machines are SMP systems, this could lead to
better use of system resources (i.e. faster compression).
2) Compression with pbzip2 is harmless for regular users of bunzip2,
so everything should run for these people as usual.
3) pbzip2-compressed archives can be uncompressed with pbunzip2 with a
speedup that scales nearly linearly with the number of CPUs in the
system.
So to sum up: It's a no-lose, two-win situation if you migrate to
pbzip2. And that just because pbunzip2 is slightly buggy. Isn't that
Dipl.-Inf. Univ. Richard C. Jelinek
PetaMem GmbH - www.petamem.com Geschäftsführer: Richard Jelinek
Human Language Technology Experts Sitz der Gesellschaft: Fürth
69216618 Mind Units Registergericht: AG Fürth, HRB-9201
When I compare the following two commands
(shell)$ curl --ipv4 --verbose
(shell)$ curl --ipv6 --verbose
The first reports "HTTP/1.1 301 Moved Permanently"
The second reports "HTTP/1.1 200 OK"
Has anyone else noticed this?
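For anyone who wants to repeat the comparison without curl, here is a rough Python sketch. The URL from the commands above is not reproduced here, so the host has to be given on the command line. The function names fetch_status and parse_status_line are made up for this example, and it assumes plain HTTP on port 80 with a HEAD request, which may not match what the curl commands actually requested.

```python
import socket
import sys


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 200 OK' into (code, reason)."""
    parts = line.strip().split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    return int(parts[1]), reason


def fetch_status(host, family, port=80, path="/", timeout=10.0):
    """Send a HEAD request over the given address family; return the status line."""
    # Restricting getaddrinfo to one family is the rough equivalent of
    # curl's --ipv4 / --ipv6 switches.
    infos = socket.getaddrinfo(host, port, family=family, type=socket.SOCK_STREAM)
    af, socktype, proto, _canonname, sockaddr = infos[0]
    with socket.socket(af, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
        request = (
            f"HEAD {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n\r\n"
        )
        sock.sendall(request.encode("ascii"))
        data = b""
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("iso-8859-1")


# Only touches the network when a host name is passed as an argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    host = sys.argv[1]
    for label, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        try:
            print(label, parse_status_line(fetch_status(host, family)))
        except (OSError, ValueError, IndexError) as exc:
            print(label, "failed:", exc)
```

If the two address families really are served differently, the two printed status codes should differ in the same way as in the curl output.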
pub 1024D/359E5142 2008-09-01 GPG key available on pgpkeys.mit.edu
Key fingerprint = 8D4F 4485 7F7D 5406 230C 9749 B821 2572 359E 5142
uid Dr. Kent L. Miller <kent.l.miller(a)alumni.cmu.edu>
Yannick Guigui, 08/11/2013 12:22:
> I'm Cameroonian. I built a webapp that allows students to consult
> Wikipedia articles
So you don't need the originals, only thumbs.
> without internet connectivity. Many schools have accepted
> the application; it is hosted on a server and shared over
> Wi-Fi at each school.
Nice! It seems you may want to use this existing software solution:
That way, you can use the available ZIM files with no need to generate
or download (and compress) the thumbnails yourself.
Your help developing the software would be very useful and you could
avoid doing yourself what you don't have the resources (bandwidth) to do.
> I have all the other dumps of Wikipedia articles in
> French and English, but I don't have any images because they are too heavy
> for me to download (3 TB) and I have low bandwidth (40
> kB/s when it's fast).
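To put these numbers in perspective, here is a small back-of-the-envelope calculation in Python. It takes the quoted rate as 40 kilobytes (kilo-octets) per second, uses decimal units, and assumes the link runs at full speed the whole time with no overhead; the 14 GB figure is the size given for the fr.wiki ZIM file elsewhere in this thread. The function name download_days is made up for this example.

```python
SECONDS_PER_DAY = 86_400

# Decimal units (assumption): 1 kB = 10**3 bytes, 1 GB = 10**9, 1 TB = 10**12.
KB = 10**3
GB = 10**9
TB = 10**12


def download_days(size_bytes, rate_bytes_per_s):
    """Days needed to transfer size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s / SECONDS_PER_DAY


rate = 40 * KB                                  # best-case rate quoted above

images_days = download_days(3 * TB, rate)       # all original images
zim_days = download_days(14 * GB, rate)         # fr.wiki ZIM with thumbnails

print(f"3 TB of originals: about {images_days:.0f} days")
print(f"14 GB ZIM file:    about {zim_days:.1f} days")
```

At that rate the originals would take roughly 868 days of uninterrupted downloading, well over two years, while the ZIM file would take about four days.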
> The webapp runs in a browser, and I don't know if the ZIM format can be
> decompressed to get small images (JPEG, PNG, SVG...).
Kiwix is a browser, you can save anything you want AFAIK.
> This is the video demo (3 min, in French) of the webapp
> If I can get small images in French and English to download for the app, my
> problem will be solved.
> Thanks a lot, Federico
> On Friday, 8 November 2013, Federico Leva (Nemo) wrote:
> Yannick Guigui, 08/11/2013 10:11:
> Please, I want to get all the images of French and English Wikipedia.
> How much would it cost to have them put on a hard disk? I can't download
> them because I don't have enough bandwidth in my country.
> What do you need them for?
> Originals would be about 2+1 TB and anyone can download and ship
> them for you:
> Otherwise there are the ZIM files with thumbnails compressed,
> fr.wiki is 14 GB but en.wiki is not available yet.
Please, I want to get all the images of French and English Wikipedia. How much
would it cost to have them put on a hard disk? I can't download them because I
don't have enough bandwidth in my country.
Thanks, I really need them.