Our pageview dumps were in the middle of a refactor when our team went
through a lot of changes. We haven't been able to finish the refactor, but
we do have a well-compressed version that we just haven't properly launched
as a new dataset. I'm working on prioritizing that.
On Sun, Sep 4, 2022 at 02:58 Gergő Tisza <gtisza(a)gmail.com> wrote:
I'd imagine the current format is optimized for being able to output
hourly dumps (and thus reducing data latency and data processing costs),
not so much for storage space
_______________________________________________
Wikitech-l mailing list -- wikitech-l(a)lists.wikimedia.org