I don't know whether this issue has come up already. If it has and was
dismissed, I beg your pardon. If it hasn't...
I hereby propose that pbzip2 (https://launchpad.net/pbzip2) be used
to compress the XML dumps instead of bzip2. Why? Because its sibling
(pbunzip2) has a bug that bunzip2 doesn't have. :-)
Strange? Read on.
A few hours ago, I filed a bug report for pbzip2 (see
https://bugs.launchpad.net/pbzip2/+bug/922804), together with some test
results obtained a few hours before that.
The results indicate that bzip2 and pbzip2 are mutually compatible:
each one can read the archives the other one creates. When it comes to
uncompressing, however, only pbzip2-compressed archives work well with
pbunzip2.
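The compatibility described above can be illustrated with a small Python sketch. It does not call pbzip2 itself. It assumes (this is not stated in the message) that a parallel compressor works by compressing fixed-size chunks independently and concatenating the resulting bzip2 streams. The chunk size and the helper names `compress_in_chunks` and `decompress_streams` are hypothetical and chosen for illustration only.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# Assumed chunk size for illustration; not taken from the message.
CHUNK_SIZE = 900_000


def compress_in_chunks(data, chunk_size=CHUNK_SIZE):
    """Compress each chunk as its own independent bzip2 stream."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor() as pool:
        # pool.map preserves the order of the input chunks.
        return list(pool.map(bz2.compress, chunks))


def decompress_streams(streams):
    """Decompress independent streams concurrently and rejoin them in order."""
    with ThreadPoolExecutor() as pool:
        return b"".join(pool.map(bz2.decompress, streams))


# Stand-in for dump content; real dumps are far larger.
data = b"<page>example</page>\n" * 200_000

streams = compress_in_chunks(data)
archive = b"".join(streams)  # concatenated streams form one archive

# A plain, serial bzip2 reader accepts the concatenated streams.
assert bz2.decompress(archive) == data

# A reader that knows the stream boundaries can work on them in parallel.
assert decompress_streams(streams) == data
```

An archive written as a single stream offers no such independent pieces, which is one plausible reason a parallel decompressor would gain nothing on it.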
I propose compressing the archives with pbzip2 for the following
reasons:
1) If your archiving machines are SMP systems, this could lead to
better use of system resources (i.e. faster compression).
2) Compression with pbzip2 is harmless for regular users of bunzip2,
so everything should run for these people as usual.
3) pbzip2-compressed archives can be uncompressed with pbunzip2 with a
speedup that scales nearly linearly with the number of CPUs in the
system.
So to sum up: it's a no-lose, two-win situation if you migrate to
pbzip2. And that just because pbunzip2 is slightly buggy. Isn't that
Dipl.-Inf. Univ. Richard C. Jelinek
PetaMem GmbH - www.petamem.com           Managing Director: Richard Jelinek
Human Language Technology Experts        Registered office: Fürth
69216618 Mind Units                      Court of registration: AG Fürth, HRB-9201
To whom it may concern,
I am trying to download image dump tarballs from <ftpmirror.your.org>. This
site seems to be down. I can reach it with `ping', but not with `wget'. Has
anyone else noticed this?
On Sat, May 9, 2015 at 3:51 PM, Bryan White <bgwhite(a)gmail.com> wrote:
> Thank you for looking into this. Could somebody please report back on the
> dumps mailing list so others will know what is going on. Knowing that
> somebody is looking into the problem helps out a lot. Also helps me inform
> the people complaining to me.
Understood. I'm not the person working on this, but let's try to
simply CC the dumps list on this thread for now.
> Hmmm, People complaining to me, I complain to labs, labs reports to
> Ariel.... Is that a case of $*@# rolling uphill or downhill? :)
To correct the timeline (apologies if my summary was too brief for
those who didn't follow the link to Phabricator): As mentioned in the
ticket, Ariel had already been aware of this issue since Wednesday,
and filed the ticket before it was mentioned on either the dumps
mailing list or this list.
> On Sat, May 9, 2015 at 4:35 PM, Tilman Bayer <tbayer(a)wikimedia.org> wrote:
>> Ariel is working on it, see https://phabricator.wikimedia.org/T98585 .
>> On Sat, May 9, 2015 at 3:23 PM, Yuvi Panda <yuvipanda(a)gmail.com> wrote:
>> > apergos is the person to poke (cc'd)
>> > On May 9, 2015 3:10 PM, "Bryan White" <bgwhite(a)gmail.com> wrote:
>> >> No new dumps have been started in a week and only one dump has
>> >> started in May. See https://dumps.wikimedia.org/backup-index.html
>> >> Messages have been sent to the dumps mailing list, but there has
>> >> been no response. Does anybody know who to contact about getting
>> >> the dumps moving again?
>> >> Bryan
>> >> _______________________________________________
>> >> Labs-l mailing list
>> >> Labs-l(a)lists.wikimedia.org
>> >> https://lists.wikimedia.org/mailman/listinfo/labs-l
>> Tilman Bayer
>> Senior Analyst
>> Wikimedia Foundation
>> IRC (Freenode): HaeB
IRC (Freenode): HaeB
-------- Original message --------
Date: 09/05/2015 8:00 PM (GMT+08:00)
Subject: Xmldatadumps-l Digest, Vol 61, Issue 2
1. May dumps dead (Bryan White)
Date: Fri, 8 May 2015 17:17:49 -0600
From: Bryan White <bgwhite(a)gmail.com>
Subject: [Xmldatadumps-l] May dumps dead
Is there any reason only a couple of dumps have fired off in May and only
two dumps are currently running?
It seems that dump creation is very slow this month. As of 2015-05-08
08:08:40, only 2 Wikipedia dumps have completed and 2 are in progress.
Could somebody please check?
Alex Druk, PhD
(775) 237-8550 Google voice