The tofu logging is being stopped: I have already committed the code that disables it, and it will go out on the usual deployment train. I plan to run some analysis on the data this week, and after that it can be discarded. I'll need a bit of help from Nuria with the analysis.
-- Amir Elisha Aharoni ። אָמִיר אֱלִישָׁע אַהֲרוֹנִי Language Engineering ። הַנְדָּסָה לְשׁוֹנִית Wikimedia Foundation ። קֶרֶן וִיקִימֶדְיָה
2014-07-03 7:57 GMT+03:00 Dario Taraborelli dtaraborelli@wikimedia.org:
I have the feeling there’s no need to keep 114 GB of raw client-side instrumentation data for tofu detection. Copying Amir, Gilles and Jon, who are the respective owners of the schemas in Sean’s list.
On Jul 2, 2014, at 7:44 PM, Oliver Keyes okeyes@wikimedia.org wrote:
The odd name is frustrating to me too :/. I'd be interested to see if we need the MV tables (or the really old data in them): as I understand it, those are aggregated for public consumption fairly regularly.
On 2 July 2014 22:21, Sean Pringle springle@wikimedia.org wrote:
Hi :)
The following table is easily the largest in eventlogging and growing fastest:
114G UniversalLanguageSelector-tofu_7629564
Is there a plan for purging old data from this one? I realize it's mostly new data; just wondering if growth will be unbounded.
Why does it have an odd name "-tofu"? Is it intended?
There is a duplicate table called UniversalLanguageSelecTor-tofu_7629564 -- note the uppercase T -- with a single row. Is that needed?
The next biggest are:
67G PageContentSaveComplete_5588433.ibd
61G MediaViewer_8572637.ibd
57G MediaViewer_8245578.ibd
33G MobileWebClickTracking_5929948.ibd
BR Sean
DBA @ WMF
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
-- Oliver Keyes Research Analyst Wikimedia Foundation