Just spoke with Jaime Crespo and he confirmed that:
- m4-master (master EL database) only holds events for the last 45
days to avoid space problems. That's for all tables including Echo.
- analytics-storage is the replica that keeps the historical data and
is meant to apply the specific purging strategy agreed in the schema's talk
page. This database does not have space problems (yet).
Sure, it doesn't have space problems, but the problem remains that with a
table this large, it's impossible to query and get results in our
lifetime. So we need to come up with better solutions for cases where we
have these huge volumes of valuable data. I think in this case moving all of
the data to Hadoop and blacklisting it from the mysql inserter seems like
the right thing to do. The only reason for data to exist in mysql should
be that we're querying it on a regular basis and taking actions
based on the results of those queries. Otherwise it's a waste of resources
and we should allocate that disk space to something else.
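
To make the blacklisting idea concrete, here is a rough sketch of what I
have in mind. It is not the actual inserter code: route_event, the sink
names and the MYSQL_BLACKLIST set are all made up for illustration, and I'm
assuming each event carries its schema name under a "schema" key.

```python
# Hypothetical sketch only: these names are illustrative and are not
# taken from the real mysql inserter.
MYSQL_BLACKLIST = {"Echo"}  # schemas we would keep out of mysql (assumed)


def route_event(event, blacklist=MYSQL_BLACKLIST):
    """Return the list of sinks an event should be written to."""
    sinks = ["hadoop"]  # every event is kept in Hadoop
    if event["schema"] not in blacklist:
        sinks.append("mysql")  # only non-blacklisted schemas reach mysql
    return sinks
```

With something like this, an Echo event would only go to Hadoop, while
schemas we actually query regularly would still land in both places.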