Hi Sean,
On Thu, Jul 03, 2014 at 12:21:34PM +1000, Sean Pringle wrote:
> The following table is easily the largest in eventlogging and growing fastest:
> 114G UniversalLanguageSelector-tofu_7629564
Thanks for the heads-up!
We are aware that UniversalLanguageSelector-tofu has been producing too much data since 2014-06-25 ([1], [2]), and Nuria is on it.
As I could not find a corresponding bug, I created one to track the issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=67463
> Is there a plan for purging old data from this one?
Just to make expectations explicit: since elsewhere in this thread you are asking mainly for expected growth bounds, I assume that the table can stay at its current size until the discussion with Language about the way forward has produced concrete next steps, and that you do not expect us to prune data right away.
> There is a duplicate table called UniversalLanguageSelecTor-tofu_7629564 -- note the uppercase T -- with a single row. Is that needed?
I noticed that too when looking at the issue last week, but decided against calling it out, since it is just a single small table. I expect we will see such artifacts from time to time. Do they get in the way somehow, or is it OK to just keep them around?
Thanks,
Christian
[1] http://lists.wikimedia.org/pipermail/analytics/2014-June/002260.html
[2] search for “tofu” on http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-analytics/20140625.txt