If you use recompressxml from the mwbzutils package: as of version 0.0.9 (just deployed), it no longer writes bz2-compressed data to stdout by default. Instead, it chooses the output format from the extension of the output file and writes gzip-compressed, bz2-compressed, or plain text output accordingly. This means that when it is directed to write to stdout, the output will be uncompressed.
You can work around this in your scripts by piping the stdout of recompressxml directly to bzip2.
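If your script is in Python rather than shell, a minimal sketch of the same workaround might look like this. The helper name run_and_bzip2 is made up for illustration, and cmd is assumed to be whatever recompressxml argument list your script already uses to write to stdout; nothing here is part of mwbzutils itself.

```python
import bz2
import shutil
import subprocess


def run_and_bzip2(cmd, out_path):
    """Run cmd, bz2-compress whatever it writes to stdout, and save it to out_path.

    cmd is a list such as ["recompressxml", ...] with your existing arguments
    (hypothetical here; this helper does not know or set any recompressxml options).
    """
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc, \
            bz2.open(out_path, "wb") as out:
        # Stream the uncompressed stdout into the bz2 file in chunks,
        # so the whole dump never has to fit in memory.
        shutil.copyfileobj(proc.stdout, out)
    # Leaving the with-block waits for the process, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
```

Call it with the same argument list you already pass to recompressxml and the path of the .bz2 file you want. In a shell script, the equivalent is simply adding "| bzip2" after the existing recompressxml command.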
This change came as part of some speedup work. I won't discuss that more until we see how the next couple of runs go.
Thanks for your understanding.
Ariel
xmldatadumps-l@lists.wikimedia.org