[Analytics] s1-analytics-slave impressively slow queries

10 Nov 2014


      Three identical queries from the 'research_prod' user have just passed one
month execution time on s1-anlytics-slave:
select count(*)
from staging.ourvision r
where exists (
  select *
  from staging.ourvision r1
          inner join
       staging.ourvision r2
          on r2.sha1 = r1.sha1
    where r1.page_id = r.page_id
      and r2.page_id = r.page_id
      and DATE_ADD(r.timestamp, INTERVAL 1 HOUR)
      and r2.timestamp between r.timestamp and DATE_SUB(r.timestamp ,
INTERVAL 1 HOUR)
      and  r1.sha1!= r.sha1
);
I havn't checked to see if the queries are just that amazingly slow, or if
they're part of a larger ongoing transaction. In any case, three month-long
transactions is pushing the resource limits of the slave and will soon
result in either mass replication lag or some other interesting lockup that
may in turn take days to rollback :-)
Can we kill these? Can we optimize and/or redesign the jobs? Happy to
help...

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

[Analytics] s1-analytics-slave impressively slow queries