Re: [Analytics] purging old data from eventlogging db

30 May 2014

On Thu, May 29, 2014 at 11:03 PM, Sean Pringle <springle(a)wikimedia.org> wrote:

...
  On Fri, May 30, 2014 at 3:28 PM, Ori Livneh <ori(a)wikimedia.org> wrote:

  On Wed, May 28, 2014 at 11:26 PM, Steven Walling <swalling(a)wikimedia.org> wrote:

  My main question is what the rationale is. Is it to improve query
  performance on analytics dbs?

 I imagine it will help, but it's probably not the primary reason. I
 imagine Sean would like to have the database in a state of equilibrium such
 that there are no looming dangers, and no reason in principle why things
 couldn't just keep running. At the moment the clip of incoming events is
 prone to sharp fluctuations and there is no protocol in place for handling
 exhausted server capacity.

 Correct.

 It's not really about performance since the dataset will be larger than
 $memory regardless.

 Of course, if you guys decide that specific data needs to stay around for
 ever, that's fine; it helps with capacity planning and we just bite the
 bullet and ensure sufficient storage space is available. Having a default
 purge-after-X-months policy for new tables would be the baseline.

Thanks for the explanation guys. This makes perfect sense to me. I'd much
rather have old data be something we have to dig a little harder for, than
worry if current schemas are going to be accessible or not.

-- 
Steven Walling,
Product Manager
https://wikimediafoundation.org/
