On Wed, May 28, 2014 at 10:50 AM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
I just announced this potential change in Scrum of Scrums and the Mobile team said they also would like to keep old data, but not for all of their schemas.  They're cleaning up their graphs and we should check with them when we start deleting.

Following up on this from the Growth perspective...

My main question is what the rationale is. Is it to improve query performance on the analytics dbs?

I do know there are many older schemas for Growth-related experiments that are only really useful for historical analysis, which is kind of hard to reconstruct anyway. If there are sound technical reasons to chuck stuff from the relational dbs and retain it only in the raw JSON logs, then I'm potentially okay with helping figure out a list of schemas to retain and schemas to purge. Aaron, thoughts?

Steven Walling,
Product Manager