Our pageview dumps were in the middle of a refactor when our team went through a lot of changes. We haven't been able to finish it, but we do have a well-compressed version that we just haven't properly launched as a new dataset. I'm working on prioritizing that.

On Sun, Sep 4, 2022 at 02:58 Gergő Tisza <gtisza@gmail.com> wrote:
I'd imagine the current format is optimized for being able to output hourly dumps (thus reducing data latency and data processing costs), not so much for storage space.
_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org