I have the feeling there’s no need to keep 114 GB of raw client-side instrumentation data for tofu detection.
Copying Amir, Gilles and Jon, who are the respective owners of the schemas in Sean’s list.
On Jul 2, 2014, at 7:44 PM, Oliver Keyes <okeyes@wikimedia.org> wrote:
The odd name is frustrating to me too :/. I'd be interested to see if we need the MV tables (or the really old data in them): as I understand it, those are aggregated for public consumption fairly regularly.
_______________________________________________
On 2 July 2014 22:21, Sean Pringle <springle@wikimedia.org> wrote:
Hi :)
The following table is easily the largest in eventlogging and growing fastest:
114G UniversalLanguageSelector-tofu_7629564
Is there a plan for purging old data from this one? I realize it's mostly new data; just wondering if growth will be unbounded.
Why does it have an odd name "-tofu"? Is it intended?
There is a duplicate table called UniversalLanguageSelecTor-tofu_7629564 -- note the uppercase T -- with a single row. Is that needed?
The next biggest are:
67G PageContentSaveComplete_5588433.ibd
61G MediaViewer_8572637.ibd
57G MediaViewer_8245578.ibd
33G MobileWebClickTracking_5929948.ibd
BR
Sean
---
DBA @ WMF
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Oliver Keyes
Research Analyst
Wikimedia Foundation