I'd much rather we scrub the data completely, as that's the plan we put forward. In general, scrubbing the data from the DB but not from the raw logs serves no useful purpose.
On Wed, Jul 30, 2014 at 3:55 AM, Christian Aistleitner <christian@quelltextlich.at> wrote:
Hi Steven,
On Tue, Jul 29, 2014 at 11:56:31AM -0700, Steven Walling wrote:
Growth team needs some data removed from [...] the raw logs [...]
Thanks for taking care to clean up data that is no longer needed. It's greatly appreciated.
However, we typically do not scrub or clean the raw logs [1].
People are using those files for debugging, and they pushed back when we asked whether we should clean them up.
Those raw files are only available to a limited set of people, so it is typically less of an issue.
Is it OK to just remove the data from the databases (thanks to Sean!) and let the data sit in the raw logs, or is there a hard requirement to scrub the raw logs clean too?
Have fun, Christian
[1] See 'Raw client and server side log files' item in http://lists.wikimedia.org/pipermail/analytics/2014-June/002256.html
--
---- quelltextlich e.U. ---- \ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3
4293 Gutau, Austria
Email: christian@quelltextlich.at
Phone: +43 7946 / 20 5 81
Fax: +43 7946 / 20 5 81
Homepage: http://quelltextlich.at/
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics