I'd much rather we scrub the data completely, as that's the plan we put forward. In general, scrubbing the data from the DB but not from the raw logs serves no useful purpose.

-Aaron


On Wed, Jul 30, 2014 at 3:55 AM, Christian Aistleitner <christian@quelltextlich.at> wrote:
Hi Steven,

On Tue, Jul 29, 2014 at 11:56:31AM -0700, Steven Walling wrote:
> Growth team needs some data removed from [...] the raw logs [...]

Thanks for taking care to clean up data that is no longer needed.
It's greatly appreciated.

However, we typically do not scrub or clean the raw logs [1].

People are using those files for debugging and pushed back when we
asked about whether we should clean them up.

Those raw files are only available to a limited set of people, so it
is typically less of an issue.

Is it ok to just remove the data from the databases (thanks to Sean!)
and let the data sit in the raw logs, or is there a hard requirement
to scrub the raw logs clean too?

Have fun,
Christian

[1] See 'Raw client and server side log files' item in
http://lists.wikimedia.org/pipermail/analytics/2014-June/002256.html


