Hi :-)
These are the largest Eventlogging tables on m2-master:
145G MobileWebClickTracking_5929948.ibd
94G PageContentSaveComplete_5588433.ibd
61G MediaViewer_8572637.ibd
57G MediaViewer_8245578.ibd
30G MultimediaViewerNetworkPerformance_7917896.ibd
29G MediaViewer_8935662.ibd
24G MobileWikiAppToCInteraction_8461467.ibd
Are these sizes roughly expected?
Anything we can discard or reduce?
Where did the discussion on purging data end up?
No immediate problems here, just rattling cages :-)
BR /s
I'm not surprised that PageContentSaveComplete is big. That's a very useful table and it sees a lot of rows for good reason (every revision saved on every wiki).
As for the Multimedia/Mediaviewer tables, we should probably ping someone on that team to discuss them.
Dario, can you speak for the MobileWebClickTracking and MobileWikiAppToCInteraction schemas?
-Aaron
On Sat, Sep 27, 2014 at 2:02 PM, Sean Pringle springle@wikimedia.org wrote:
Hi :-)
These are the largest Eventlogging tables on m2-master:
145G MobileWebClickTracking_5929948.ibd
94G PageContentSaveComplete_5588433.ibd
61G MediaViewer_8572637.ibd
57G MediaViewer_8245578.ibd
30G MultimediaViewerNetworkPerformance_7917896.ibd
29G MediaViewer_8935662.ibd
24G MobileWikiAppToCInteraction_8461467.ibd
Are these sizes roughly expected?
Anything we can discard or reduce?
Where did the discussion on purging data end up?
No immediate problems here, just rattling cages :-)
BR /s
-- DBA @ WMF
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
On Sep 27, 2014, at 11:42 AM, Aaron Halfaker ahalfaker@wikimedia.org wrote:
I'm not surprised that PageContentSaveComplete is big. That's a very useful table and it sees a lot of rows for good reason (every revision saved on every wiki).
As for the Multimedia/Mediaviewer tables, we should probably ping someone on that team to discuss them.
Dario, can you speak for the MobileWebClickTracking and MobileWikiAppToCInteraction schemas?
Neither I nor Oliver is using this data, but it’s used for some Limn dashboards by the Mobile team. Copying Maryana and Kaldari so they can chime in.
D
On Mon, Sep 29, 2014 at 3:10 PM, Dario Taraborelli < dtaraborelli@wikimedia.org> wrote:
On Sep 27, 2014, at 11:42 AM, Aaron Halfaker ahalfaker@wikimedia.org wrote:
I'm not surprised that PageContentSaveComplete is big. That's a very useful table and it sees a lot of rows for good reason (every revision saved on every wiki).
As for the Multimedia/Mediaviewer tables, we should probably ping someone on that team to discuss them.
Dario, can you speak for the MobileWebClickTracking and MobileWikiAppToCInteraction schemas?
The mobile web team uses MobileWebClickTracking to get a rough heatmap of taps on prominent UI elements, and the apps team uses MobileWikiAppToCInteraction to measure engagement with the table of contents on the Wikipedia app. Neither is a primary metric we're tracking, but both are useful to check in on every once in a while. Does that answer your question?
Maryana, would it be OK if we delete the MobileWebClickTracking records from before 2014? Would we still need those for any reason?
On Tue, Sep 30, 2014 at 10:32 AM, Maryana Pinchuk mpinchuk@wikimedia.org wrote:
On Mon, Sep 29, 2014 at 3:10 PM, Dario Taraborelli < dtaraborelli@wikimedia.org> wrote:
On Sep 27, 2014, at 11:42 AM, Aaron Halfaker ahalfaker@wikimedia.org wrote:
I'm not surprised that PageContentSaveComplete is big. That's a very useful table and it sees a lot of rows for good reason (every revision saved on every wiki).
As for the Multimedia/Mediaviewer tables, we should probably ping someone on that team to discuss them.
Dario, can you speak for the MobileWebClickTracking and MobileWikiAppToCInteraction schemas?
The mobile web team uses MobileWebClickTracking to get a rough heatmap of taps on prominent UI elements, and the apps team uses MobileWikiAppToCInteraction to measure engagement with the table of contents on the Wikipedia app. Neither is a primary metric we're tracking, but both are useful to check in on every once in a while. Does that answer your question?
-- Maryana Pinchuk Product Manager, Wikimedia Foundation wikimediafoundation.org
Oh yeah, that'd be fine :)
On Tue, Sep 30, 2014 at 10:38 AM, Ryan Kaldari rkaldari@wikimedia.org wrote:
Maryana, would it be OK if we delete the MobileWebClickTracking records from before 2014? Would we still need those for any reason?
We can trim down our team (multimedia)'s tables considerably by getting rid of data older than 30 days. This could even be done by a daily cron. How would we go about doing that? Should we be the ones taking care of it? I'm not sure that the DB credentials I currently have can delete content.
On Tue, Sep 30, 2014 at 7:45 PM, Maryana Pinchuk mpinchuk@wikimedia.org wrote:
Oh yeah, that'd be fine :)
Should we be the ones taking care of it? I'm not sure that the DB credentials I currently have can delete content.
Neither can the ones we have. In the absence of a regular cleanup process (which is on our team to do), I think we just have to ask Sean Pringle to delete the data.
If anyone knows better, please correct me.
On Thu, Oct 2, 2014 at 9:28 AM, Gilles Dubuc gilles@wikimedia.org wrote:
We can trim down our team (multimedia)'s tables considerably by getting rid of data older than 30 days. This could even be done by a daily cron. How would we go about doing that? Should we be the ones taking care of it? I'm not sure that the DB credentials I currently have can delete content.
Hi Nuria,
[ re-formatting to make quotation marks match ]
On Thu, Oct 02, 2014 at 01:20:45PM -0700, Nuria Ruiz wrote:
On Thu, Oct 2, 2014 at 9:28 AM, Gilles Dubuc gilles@wikimedia.org wrote:
Should we be the ones taking care of it? I'm not sure that the DB credentials I currently have can delete content.
Neither the ones we have.
I don't think that this holds true.
You can leverage your vanadium powers :-)
On vanadium, running
grep ^mysql /etc/eventlogging.d/consumers/mysql-m2-master
should give you all you need.
Have fun, Christian
On Fri, Oct 3, 2014 at 2:28 AM, Gilles Dubuc gilles@wikimedia.org wrote:
We can trim down our team (multimedia)'s tables considerably by getting rid of data older than 30 days. This could even be done by a daily cron. How would we go about doing that? Should we be the ones taking care of it? I'm not sure that the DB credentials I currently have can delete content.
We can automate purging in MariaDB using the Event Scheduler[1] if you guys want a once-off, set-and-forget solution. E.g.:
CREATE TABLE purge_schedule (
    table_name varchar(100) NOT NULL,
    days tinyint(3) unsigned NOT NULL
);
Then for each EL table you would do:
INSERT INTO purge_schedule VALUES ('MultimediaTiming_7193302', 30);
The rest would be left to me, or rather, to a couple of stored procedures :-)
[1] Basically a cron that runs stored procedures: https://mariadb.com/kb/en/mariadb/documentation/stored-programs-and-views/st...
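The stored procedures themselves are not shown in the thread. As a rough illustration only, the purge loop they would presumably perform can be sketched in Python against an in-memory SQLite database (a stand-in for MariaDB and its Event Scheduler, not the actual setup). The `timestamp` column and its 14-digit YYYYMMDDHHMMSS format are assumptions about the EventLogging tables, and `purge_expired` and `build_demo_db` are hypothetical names introduced for this sketch.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumption: each EventLogging table has a `timestamp` column holding
# 14-digit YYYYMMDDHHMMSS strings, so string order equals time order.
TS_FORMAT = "%Y%m%d%H%M%S"


def purge_expired(conn, now):
    """Delete rows older than each table's retention window.

    Reads purge_schedule on every call, so newly added tables are picked
    up on the next run. Returns {table_name: rows_deleted}.
    """
    deleted = {}
    schedule = conn.execute(
        "SELECT table_name, days FROM purge_schedule"
    ).fetchall()
    for table_name, days in schedule:
        cutoff = (now - timedelta(days=days)).strftime(TS_FORMAT)
        # Identifiers cannot be bound as parameters, so quote the name.
        quoted = '"' + table_name.replace('"', '""') + '"'
        cur = conn.execute(
            f"DELETE FROM {quoted} WHERE timestamp < ?", (cutoff,)
        )
        deleted[table_name] = cur.rowcount
    conn.commit()
    return deleted


def build_demo_db(now):
    """Hypothetical fixture: one scheduled table with rows of varying age."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE purge_schedule ("
        " table_name TEXT NOT NULL, days INTEGER NOT NULL)"
    )
    conn.execute(
        'CREATE TABLE "MultimediaTiming_7193302" ('
        " id INTEGER PRIMARY KEY, timestamp TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO purge_schedule VALUES ('MultimediaTiming_7193302', 30)"
    )
    for age_days in (45, 31, 29, 1):
        ts = (now - timedelta(days=age_days)).strftime(TS_FORMAT)
        conn.execute(
            'INSERT INTO "MultimediaTiming_7193302" (timestamp) VALUES (?)',
            (ts,),
        )
    conn.commit()
    return conn


if __name__ == "__main__":
    now = datetime(2014, 10, 6, 17, 22)
    conn = build_demo_db(now)
    # Rows aged 45 and 31 days fall outside the 30-day window.
    print(purge_expired(conn, now))
```

Because the schedule is read on each run, putting another table under retention in this sketch is just one more row in `purge_schedule`. A real MariaDB version would likely build the DELETE with dynamic SQL inside a procedure and call it from a scheduled event.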
Sounds great, Sean! The following tables can be set to keep 40 days of data:
MediaViewer_6054199
MediaViewer_6055641
MediaViewer_6066908
MediaViewer_6636420
MediaViewer_7670440
MediaViewer_8245578
MediaViewer_8572637
MediaViewer_8935662
MediaViewer_9792855
MediaViewer_9989959
MultimediaViewerAttribution_9758179
MultimediaViewerDimensions_10014238
MultimediaViewerDuration_8318615
MultimediaViewerDuration_8572641
MultimediaViewerNetworkPerformance_7393226
MultimediaViewerNetworkPerformance_7488625
MultimediaViewerNetworkPerformance_7917896
There's a good chance that some of the older ones will end up being empty, in which case they can be safely dropped.
On Mon, Oct 6, 2014 at 5:22 PM, Sean Pringle springle@wikimedia.org wrote:
On Fri, Oct 3, 2014 at 2:28 AM, Gilles Dubuc gilles@wikimedia.org wrote:
We can trim down our team (multimedia)'s tables considerably by getting rid of data older than 30 days. This could even be done by a daily cron. How would we go about doing that? Should we be the ones taking care of it? I'm not sure that the DB credentials I currently have can delete content.
We can automate purging in MariaDB using the Event Scheduler[1] if you guys want a once-off, set-and-forget solution. E.g.:
CREATE TABLE purge_schedule ( table_name varchar(100) NOT NULL, days tinyint(3) unsigned NOT NULL );
Then for each EL table you would do:
INSERT INTO purge_schedule VALUES ('MultimediaTiming_7193302', 30);
The rest would be left to me, or rather, to a couple of stored procedures :-)
[1] Basically a cron that runs stored procedures: https://mariadb.com/kb/en/mariadb/documentation/stored-programs-and-views/st...
Sean,
I made a spreadsheet to help track what has been requested. https://docs.google.com/a/wikimedia.org/spreadsheets/d/1RAhDbppfWDQsUXXr7r_5...
Let us know if you need more information before you can start deleting old records.
On Wed, Oct 8, 2014 at 6:49 AM, Gilles Dubuc gilles@wikimedia.org wrote:
Sounds great, Sean! The following tables can be set to keep 40 days of data:
MediaViewer_6054199
MediaViewer_6055641
MediaViewer_6066908
MediaViewer_6636420
MediaViewer_7670440
MediaViewer_8245578
MediaViewer_8572637
MediaViewer_8935662
MediaViewer_9792855
MediaViewer_9989959
MultimediaViewerAttribution_9758179
MultimediaViewerDimensions_10014238
MultimediaViewerDuration_8318615
MultimediaViewerDuration_8572641
MultimediaViewerNetworkPerformance_7393226
MultimediaViewerNetworkPerformance_7488625
MultimediaViewerNetworkPerformance_7917896
There's a good chance that some of the older ones will end up being empty, in which case they can be safely dropped.
On Thu, Oct 9, 2014 at 2:23 AM, Kevin Leduc kevin@wikimedia.org wrote:
Sean,
I made a spreadsheet to help track what has been requested.
https://docs.google.com/a/wikimedia.org/spreadsheets/d/1RAhDbppfWDQsUXXr7r_5...
Let us know if you need more information before you can start deleting old records.
Great, thanks. That is exactly the info I need. Will start the process and report back.
We can automate purging in MariaDB using the Event Scheduler[1] if you guys want a once-off, set-and-forget solution. E.g.
This sounds great for all the tables discussed on the thread. Is it easy to add tables to that procedure?
On Thu, Oct 9, 2014 at 8:25 AM, Nuria Ruiz nuria@wikimedia.org wrote:
We can automate purging in MariaDB using the Event Scheduler[1] if you guys want a once-off, set-and-forget solution. E.g.
This sounds great for all the tables discussed on the thread. Is it easy to add tables to that procedure?
Yes, just add them to the spreadsheet Kevin started, and/or insert them into the purge_schedule table. The procedures will not need to be reloaded each time a table is added (or removed).
Maryana,
What about deleting old records from MobileWikiAppToCInteraction? Do we want to treat it the same as MobileWebClickTracking (i.e. delete records from before 2014)?
On Tue, Sep 30, 2014 at 10:45 AM, Maryana Pinchuk mpinchuk@wikimedia.org wrote:
Oh yeah, that'd be fine :)
Hi Kevin,
some 10 days ago, Sean sent a nice heads up with sizes of EventLogging tables and a question around purging of data (see below, and [1]).
I am reluctant to chime in there, as there was some talk about adding EventLogging data purging to our Goals, and I do not want to interfere with resourcing, or communication of Analytics Goals.
However, I cannot see you chiming in there.
If there is no buy-in or push-back from you by Friday, I will consider it buy-in and start working towards at least trimming the biggest few tables and having the necessary conversations around that.
Have fun, Christian
[1] September part at: https://lists.wikimedia.org/pipermail/analytics/2014-September/002519.html
October part at: https://lists.wikimedia.org/pipermail/analytics/2014-October/002535.html
On Sun, Sep 28, 2014 at 04:02:39AM +1000, Sean Pringle wrote:
Hi :-)
These are the largest Eventlogging tables on m2-master:
145G MobileWebClickTracking_5929948.ibd
94G PageContentSaveComplete_5588433.ibd
61G MediaViewer_8572637.ibd
57G MediaViewer_8245578.ibd
30G MultimediaViewerNetworkPerformance_7917896.ibd
29G MediaViewer_8935662.ibd
24G MobileWikiAppToCInteraction_8461467.ibd
Are these sizes roughly expected?
Anything we can discard or reduce?
Where did the discussion on purging data end up?
No immediate problems here, just rattling cages :-)
BR /s
Hi Kevin,
thanks for chiming in!
Have fun, Christian
On Tue, Oct 07, 2014 at 05:05:25PM +0200, quelltextlich e.U. - Christian Aistleitner wrote:
Hi Kevin,
some 10 days ago, Sean sent a nice heads up with sizes of EventLogging tables and a question around purging of data (see below, and [1]).
I am reluctant to chime in there, as there was some talk about adding EventLogging data purging to our Goals, and I do not want to interfere with resourcing, or communication of Analytics Goals.
However, I cannot see you chiming in there.
If there is no buy-in or push-back from you by Friday, I consider it a buy in, and start working towards at least trimming the biggest few tables and having the necessary conversations around that.
Have fun, Christian
[1] September part at: https://lists.wikimedia.org/pipermail/analytics/2014-September/002519.html
October part at: https://lists.wikimedia.org/pipermail/analytics/2014-October/002535.html
-- ---- quelltextlich e.U. ---- \ ---- Christian Aistleitner ---- Companies' registry: 360296y in Linz Christian Aistleitner Kefermarkterstrasze 6a/3 Email: christian@quelltextlich.at 4293 Gutau, Austria Phone: +43 7946 / 20 5 81 Fax: +43 7946 / 20 5 81 Homepage: http://quelltextlich.at/