While chatting with various people about data retention, the question came up of whether to keep the bz2-compressed files of pages-meta-history.xml alongside their 7z equivalents.
I'm curious about the usage of bz2 vs. 7z for the full page history. If we can keep 7za from being a bottleneck for the build, would anyone be crushed if we dropped support for the bz2 version?
Dropping it would save a significant amount of space.
I know the initial decision to serve both was made at a time when the availability of 7zip for multiple OSes was questionable at best. Today there are supported releases for Windows and Linux (src) and a fragmented but active set of OS X ports.
Thoughts?
--tomasz
xmldatadumps-l@lists.wikimedia.org