Dan, thanks for reaching out.

18 months is enough for my use cases as long as the dumps capture the exact data structure.

Best,
Leila

--
Leila Zia
Senior Research Scientist
Wikimedia Foundation

On Fri, Jul 29, 2016 at 11:51 AM, Amir E. Aharoni <amir.aharoni@mail.huji.ac.il> wrote:

I am now checking traffic data every day to see whether Compact Language Links affect it. It makes sense to compare it not only to the previous week, but also to the same month of the previous year. So one year is hardly enough. 18 months is better, and three years is much better because I'll also be able to check the same month in earlier years.

I imagine that this may be useful to all product managers that work on features that can affect traffic.


On Jul 29, 2016 at 15:41, "Dan Andreescu" <dandreescu@wikimedia.org> wrote:
Dear Pageview API consumers,

We would like to plan storage capacity for our pageview API cluster.  Right now, with a reliable RAID setup, we can keep 18 months of data.  If you'd like to query further back than that, you can download dump files (which we'll make easier to use with Python utilities).

What do you think?  Will you need more than 18 months of data?  If so, we need to add more nodes when we get to that point, and that costs money, so we want to check if there is a real need for it.

Another option is to start degrading the resolution for older data (only keep weekly or monthly for data older than 1 year for example).  If you need more than 18 months, we'd love to hear your use case and something in the form of:

need daily resolution for 1 year
need weekly resolution for 2 years
need monthly resolution for 3 years
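
To make the tiers in the example above concrete, here is a minimal sketch of how daily counts could be rolled up by age. Everything in it is an assumption for illustration: the names `TIERS`, `resolution_for`, `bucket_key` and `downsample` are hypothetical, the 365/730/1095-day cutoffs only approximate 1, 2 and 3 years, and it says nothing about how the pageview API cluster actually stores data.

```python
from datetime import date, timedelta

# Hypothetical retention tiers: (maximum age in days, resolution kept).
# The cutoffs approximate 1, 2 and 3 years from the example above.
TIERS = [(365, "daily"), (730, "weekly"), (1095, "monthly")]


def resolution_for(day, today):
    """Return the resolution kept for `day`, or None if it is past every tier."""
    age = (today - day).days
    for max_age, resolution in TIERS:
        if age <= max_age:
            return resolution
    return None


def bucket_key(day, resolution):
    """Map a day to the start date of the bucket it is aggregated into."""
    if resolution == "daily":
        return day
    if resolution == "weekly":
        return day - timedelta(days=day.weekday())  # Monday of that week
    return day.replace(day=1)  # first day of that month


def downsample(daily_views, today):
    """Sum daily view counts into (resolution, bucket start) buckets."""
    buckets = {}
    for day, views in daily_views.items():
        resolution = resolution_for(day, today)
        if resolution is None:
            continue  # older than the last tier, so not kept
        key = (resolution, bucket_key(day, resolution))
        buckets[key] = buckets.get(key, 0) + views
    return buckets
```

Calling `downsample` with a dict of `date` to view count and a reference `today` returns one summed count per bucket, so the older the data, the fewer rows it takes to store.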

Thank you!

Dan

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics

