Sean, fully agreed this is a problem.  The way I see it, I don't think there's an easy solution.  This query is made with a template that fills in different versions of an Event Logging schema.  This is bad for more than just this performance reason:

* when a new schema revision is implemented, either dozens of queries have to be updated or we have to live with this crappy performance
* when an old schema revision table is deleted, queries break regardless of approach
* when querying ad-hoc, this is yet another stumbling block to getting at the data

We are currently focused on moving Event Logging to Kafka, and I think that the way we handle Event Logging analytics will change too much too soon for us to address this.  If it's a problem for the database, I'd rather just disable the queries for now.  Let me know and I'll make a judgement call to either go through and fix them all or disable them all.

On Wed, Jun 24, 2015 at 8:43 AM, Sean Pringle <springle@wikimedia.org> wrote:
Below are a couple examples of recent slow queries running on
analytics-store. They both use subqueries and UNION. Neither push the
WHERE conditions down to the subqueries, which means indexes are not
used. If you include WHERE clauses in both inner and outer queries you
will achieve index usage, especially for these sorts of eventlogging
queries on event_action or wiki.

MySQL / MariaDB query optimizer does do constant propagation, but
UNION tends to trip it up in all sorts of exciting ways. Be verbose,
abandon DRY principle, and make your life easier :-)

Current query:

SELECT Month.Date, COALESCE(Web.Web, 0) AS Web FROM (
    SELECT DATE_FORMAT( ADDDATE(CURDATE() - INTERVAL 30 - 1 DAY,
@num:=@num+1), '%Y-%m-%d' ) AS Date
        FROM seq_1_to_100, (SELECT @num:=-1) num LIMIT 30 ) AS Month LEFT JOIN (
        SELECT DATE(timestamp) AS Date,
        SUM(1) AS Web FROM (
            SELECT timestamp, wiki, event_username, event_action,
event_namespace, event_userEditCount FROM MobileWebEditing_5644223
            UNION SELECT timestamp, wiki, event_username,
event_action, event_namespace, event_userEditCount
                FROM MobileWebEditing_6077315
            UNION SELECT timestamp, wiki, event_username,
event_action, event_namespace, event_userEditCount
                from MobileWebEditing_6637866
            UNION SELECT timestamp, wiki, event_username,
event_action, event_namespace, event_userEditCount
                from MobileWebEditing_7675117
            UNION SELECT timestamp, wiki, event_username,
event_action, event_namespace, event_userEditCount
                from MobileWebEditing_8599025
        ) as MobileWebEditing
    WHERE event_namespace != 0 AND event_action = 'success' AND wiki
!= 'testwiki'
    GROUP BY Date
) AS Web ON Month.Date = Web.Date;

Version with WHERE clauses pushed down to subqueries:

SELECT Month.Date, COALESCE(Web.Web, 0) AS Web FROM (
    SELECT DATE_FORMAT( ADDDATE(CURDATE() - INTERVAL 30 - 1 DAY,
@num:=@num+1), '%Y-%m-%d' ) AS Date
        FROM seq_1_to_100, (SELECT @num:=-1) num LIMIT 30 ) AS Month LEFT JOIN (
        SELECT DATE(timestamp) AS Date,
        SUM(1) AS Web FROM (
            SELECT timestamp, wiki, event_username, event_action,
event_namespace, event_userEditCount FROM MobileWebEditing_5644223
            UNION SELECT timestamp, wiki, event_username,
event_action, event_namespace, event_userEditCount
                FROM MobileWebEditing_6077315 WHERE event_namespace !=
0 AND event_action = 'success' AND wiki != 'testwiki'
            UNION SELECT timestamp, wiki, event_username,
event_action, event_namespace, event_userEditCount
                from MobileWebEditing_6637866 WHERE event_namespace !=
0 AND event_action = 'success' AND wiki != 'testwiki'
            UNION SELECT timestamp, wiki, event_username,
event_action, event_namespace, event_userEditCount
                from MobileWebEditing_7675117 WHERE event_namespace !=
0 AND event_action = 'success' AND wiki != 'testwiki'
            UNION SELECT timestamp, wiki, event_username,
event_action, event_namespace, event_userEditCount
                from MobileWebEditing_8599025 WHERE event_namespace !=
0 AND event_action = 'success' AND wiki != 'testwiki'
        ) as MobileWebEditing
    WHERE event_namespace != 0 AND event_action = 'success' AND wiki
!= 'testwiki'
    GROUP BY Date
) AS Web ON Month.Date = Web.Date;

Current query (hours):

SELECT day,
`Attempted saves` / `Loads` AS "All wikis",
`Attempted saves - dewiki` / `Loads - dewiki` AS "dewiki",
`Attempted saves - enwiki` / `Loads - enwiki` AS "enwiki",
`Attempted saves - eswiki` / `Loads - eswiki` AS "eswiki",
`Attempted saves - frwiki` / `Loads - frwiki` AS "frwiki",
`Attempted saves - ptwiki` / `Loads - plwiki` AS "plwiki",
`Attempted saves - plwiki` / `Loads - ptwiki` AS "ptwiki" FROM (
    SELECT DATE( timestamp ) AS day,
    SUM( IF( event_action = 'ready', 1, 0 ) ) AS "Loads",
    SUM( IF( event_action = 'ready' AND wiki = 'dewiki', 1, 0 ) ) AS
"Loads - dewiki",
    SUM( IF( event_action = 'ready' AND wiki = 'enwiki', 1, 0 ) ) AS
"Loads - enwiki",
    SUM( IF( event_action = 'ready' AND wiki = 'eswiki', 1, 0 ) ) AS
"Loads - eswiki",
    SUM( IF( event_action = 'ready' AND wiki = 'frwiki', 1, 0 ) ) AS
"Loads - frwiki",
    SUM( IF( event_action = 'ready' AND wiki = 'plwiki', 1, 0 ) ) AS
"Loads - plwiki",
    SUM( IF( event_action = 'ready' AND wiki = 'ptwiki', 1, 0 ) ) AS
"Loads - ptwiki",
    SUM( IF( event_action = 'saveAttempt', 1, 0 ) ) AS "Attempted saves",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'dewiki', 1, 0 )
) AS "Attempted saves - dewiki",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'enwiki', 1, 0 )
) AS "Attempted saves - enwiki",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'eswiki', 1, 0 )
) AS "Attempted saves - eswiki",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'frwiki', 1, 0 )
) AS "Attempted saves - frwiki",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'plwiki', 1, 0 )
) AS "Attempted saves - plwiki",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'ptwiki', 1, 0 )
) AS "Attempted saves - ptwiki" FROM (
        SELECT event_action, wiki, timestamp
        FROM Edit_10604157
        UNION ALL
        SELECT event_action, wiki, timestamp
        FROM Edit_10676603
    ) AS valid_edit_schema_revisions
    GROUP BY day
    ORDER BY day
) as internalQuery;

Version with WHERE clauses pushed down to subqueries:

SELECT day,
`Attempted saves` / `Loads` AS "All wikis",
`Attempted saves - dewiki` / `Loads - dewiki` AS "dewiki",
`Attempted saves - enwiki` / `Loads - enwiki` AS "enwiki",
`Attempted saves - eswiki` / `Loads - eswiki` AS "eswiki",
`Attempted saves - frwiki` / `Loads - frwiki` AS "frwiki",
`Attempted saves - ptwiki` / `Loads - plwiki` AS "plwiki",
`Attempted saves - plwiki` / `Loads - ptwiki` AS "ptwiki" FROM (
    SELECT DATE( timestamp ) AS day,
    SUM( IF( event_action = 'ready', 1, 0 ) ) AS "Loads",
    SUM( IF( event_action = 'ready' AND wiki = 'dewiki', 1, 0 ) ) AS
"Loads - dewiki",
    SUM( IF( event_action = 'ready' AND wiki = 'enwiki', 1, 0 ) ) AS
"Loads - enwiki",
    SUM( IF( event_action = 'ready' AND wiki = 'eswiki', 1, 0 ) ) AS
"Loads - eswiki",
    SUM( IF( event_action = 'ready' AND wiki = 'frwiki', 1, 0 ) ) AS
"Loads - frwiki",
    SUM( IF( event_action = 'ready' AND wiki = 'plwiki', 1, 0 ) ) AS
"Loads - plwiki",
    SUM( IF( event_action = 'ready' AND wiki = 'ptwiki', 1, 0 ) ) AS
"Loads - ptwiki",
    SUM( IF( event_action = 'saveAttempt', 1, 0 ) ) AS "Attempted saves",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'dewiki', 1, 0 )
) AS "Attempted saves - dewiki",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'enwiki', 1, 0 )
) AS "Attempted saves - enwiki",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'eswiki', 1, 0 )
) AS "Attempted saves - eswiki",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'frwiki', 1, 0 )
) AS "Attempted saves - frwiki",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'plwiki', 1, 0 )
) AS "Attempted saves - plwiki",
    SUM( IF( event_action = 'saveAttempt' AND wiki = 'ptwiki', 1, 0 )
) AS "Attempted saves - ptwiki" FROM (
        SELECT event_action, wiki, timestamp
        FROM Edit_10604157
        WHERE event_action in ('saveAttempt', 'ready')
        UNION ALL
        SELECT event_action, wiki, timestamp
        FROM Edit_10676603
        WHERE event_action in ('saveAttempt', 'ready')
    ) AS valid_edit_schema_revisions
    GROUP BY day
    ORDER BY day
) as internalQuery;

--
DBA @ WMF

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics