The output of these jobs is currently written uncompressed. Starting with the next run, I plan to write it as gzip-compressed files. This will save a lot of space for the larger abstracts dumps. Additionally, only the status and html files will remain uncompressed, which is convenient for maintenance reasons.
If anyone has a strong objection to this, please raise it now. There's a ticket open for it: https://phabricator.wikimedia.org/T178046
Thanks!
Ariel
xmldatadumps-l@lists.wikimedia.org