优化 UNION MySQL 查询,其中两个查询使用大量相同的表

发布于 2024-10-14 19:29:01 字数 5566 浏览 2 评论 0原文

我在 MySQL 中使用 UNION 来将两个单独查询的结果组合在一起。 这两个查询使用很多相同的表。有没有办法利用这一点来优化查询?

    SELECT  2 AS RELEVANCE_SCORE_TYPE,
        VIEWER_ID, 
        QUESTION_ID, 
        sum(ANSWER_SCORE) AS RELEVANCE_SCORE
FROM(SELECT  cr.COMMUNICATIONS_ID AS ANSWER_ID, 
        cr.CONSUMER_ID as VIEWER_ID,
        nc.PARENT_COMMUNICATIONS_ID AS QUESTION_ID,
        case when  cr.CONSUMER_ID= nc.SENDER_CONSUMER_ID then 3*((24/(((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(cal.LAST_MOD_TIME)+3600)/3600))*(ces.EXPERT_SCORE * cirm.CONSUMER_RATING) + (12.5 * scs.SIMILARITY)* (1 - EXP(-0.5 * (cal.TIPS_AMOUNT / ATV.AVG_TIPS)) + .15)))
            else ((24/(((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(cal.LAST_MOD_TIME)+3600)/3600))*(ces.EXPERT_SCORE * cirm.CONSUMER_RATING) + (12.5 * scs.SIMILARITY)* (1 - EXP(-0.5 * (cal.TIPS_AMOUNT / ATV.AVG_TIPS)) + .15)))
        end as ANSWER_SCORE
FROM (SELECT 238 AS CONSUMER_ID, 
             ACTION_LOG_ID, 
            COMMUNICATIONS_ID 
     FROM consumer_action_log 
     WHERE COMM_TYPE_ID=4) AS cr
JOIN network_communications AS nc 
    ON cr.COMMUNICATIONS_ID=nc.COMMUNICATIONS_ID
JOIN consumer_action_log AS cal 
    ON cr.ACTION_LOG_ID=cal.ACTION_LOG_ID
JOIN communication_interest_mapping AS cim 
    ON nc.PARENT_COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_interest_rating_mapping AS cirm 
    ON cr.CONSUMER_ID=cirm.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=cirm.CONSUMER_INTEREST_ID
JOIN consumer_expert_score AS ces 
    ON nc.SENDER_CONSUMER_ID=ces.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs 
    ON cr.CONSUMER_ID=scs.CONSUMER_ID_2 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
    OR cr.CONSUMER_ID=scs.CONSUMER_ID_1 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
CROSS JOIN
    (
        SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
        FROM CONSUMER_ACTION_LOG AS cal
        JOIN (SELECT 234 AS CONSUMER_ID, 
                     ACTION_LOG_ID, 
                     COMMUNICATIONS_ID 
              FROM consumer_action_log 
              WHERE COMM_TYPE_ID=4) AS cr 
        ON cal.SENDER_CONSUMER_ID=cr.consumer_id
    ) ATV) AS ASM
GROUP BY ANSWER_ID
UNION
SELECT 1 AS RELEVANCE_SCORE_TYPE,
       qcr.CONSUMER_ID AS Viewer_ID, 
       qcr.COMMUNICATIONS_ID, 
       case when reply.replies IS NOT NULL AND qcr.CONSUMER_ID <> qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(ces.EXPERT_SCORE+2.5*scs.SIMILARITY)*(EXP(-reply.replies))
       when reply.replies IS NULL AND qcr.CONSUMER_ID <> qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(ces.EXPERT_SCORE+2.5*scs.SIMILARITY)*(EXP(0))
       when reply.replies IS NULL AND qcr.CONSUMER_ID = qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(7.5)*(1-EXP(0))
       when reply.replies IS NOT NULL AND qcr.CONSUMER_ID = qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(7.5)*(1-EXP(-reply.replies))
            else null
       end as QUESTION_SCORE
FROM (SELECT 238 AS CONSUMER_ID,
            SENDER_CONSUMER_ID, 
            COMMUNICATIONS_ID,
            LAST_MOD_TIME
     FROM network_communications
     WHERE NETWORK_COMM_TYPE_ID=1) AS qcr
JOIN communication_interest_mapping AS cim 
    ON qcr.COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_expert_score AS ces
    ON ces.CONSUMER_ID=qcr.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs 
    ON qcr.CONSUMER_ID=scs.CONSUMER_ID_2 
    AND qcr.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
    OR qcr.CONSUMER_ID=scs.CONSUMER_ID_1 
    AND qcr.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
LEFT JOIN (SELECT COUNT(*) AS replies, 
           PARENT_COMMUNICATIONS_ID
           FROM network_communications AS nc1
           WHERE NETWORK_COMM_TYPE_ID=2
GROUP BY PARENT_COMMUNICATIONS_ID) AS reply
    ON qcr.COMMUNICATIONS_ID=reply.PARENT_COMMUNICATIONS_ID
ORDER BY RELEVANCE_SCORE DESC;
FROM (SELECT 234 AS CONSUMER_ID, 
             ACTION_LOG_ID, 
            COMMUNICATIONS_ID 
     FROM consumer_action_log 
     WHERE COMM_TYPE_ID=4) AS cr
JOIN network_communications AS nc 
    ON cr.COMMUNICATIONS_ID=nc.COMMUNICATIONS_ID
JOIN consumer_action_log AS cal 
    ON cr.ACTION_LOG_ID=cal.ACTION_LOG_ID
JOIN communication_interest_mapping AS cim 
    ON nc.PARENT_COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_interest_rating_mapping AS cirm 
    ON cr.CONSUMER_ID=cirm.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=cirm.CONSUMER_INTEREST_ID
JOIN consumer_expert_score AS ces 
    ON nc.SENDER_CONSUMER_ID=ces.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs 
    ON cr.CONSUMER_ID=scs.CONSUMER_ID_2 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
    OR cr.CONSUMER_ID=scs.CONSUMER_ID_1 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
CROSS JOIN
    (
        SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
        FROM CONSUMER_ACTION_LOG AS cal
        JOIN (SELECT 234 AS CONSUMER_ID, 
                     ACTION_LOG_ID, 
                     COMMUNICATIONS_ID 
              FROM consumer_action_log 
              WHERE COMM_TYPE_ID=4) AS cr 
        ON cal.SENDER_CONSUMER_ID=cr.consumer_id
    ) ATV) AS ASM
GROUP BY ANSWER_ID
ORDER BY ANSWER_SCORE_SUMMED DESC;

它很长,所以不必觉得有义务阅读整篇文章。要点很简单,联合两侧的查询使用许多相同的表。

I am using the UNION in MySQL in order to bring together the results of two separate queries.
These two queries use a lot of the same tables. Is there anyway to take advantage of this in order to optimize the query?

    SELECT  2 AS RELEVANCE_SCORE_TYPE,
        VIEWER_ID, 
        QUESTION_ID, 
        sum(ANSWER_SCORE) AS RELEVANCE_SCORE
FROM(SELECT  cr.COMMUNICATIONS_ID AS ANSWER_ID, 
        cr.CONSUMER_ID as VIEWER_ID,
        nc.PARENT_COMMUNICATIONS_ID AS QUESTION_ID,
        case when  cr.CONSUMER_ID= nc.SENDER_CONSUMER_ID then 3*((24/(((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(cal.LAST_MOD_TIME)+3600)/3600))*(ces.EXPERT_SCORE * cirm.CONSUMER_RATING) + (12.5 * scs.SIMILARITY)* (1 - EXP(-0.5 * (cal.TIPS_AMOUNT / ATV.AVG_TIPS)) + .15)))
            else ((24/(((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(cal.LAST_MOD_TIME)+3600)/3600))*(ces.EXPERT_SCORE * cirm.CONSUMER_RATING) + (12.5 * scs.SIMILARITY)* (1 - EXP(-0.5 * (cal.TIPS_AMOUNT / ATV.AVG_TIPS)) + .15)))
        end as ANSWER_SCORE
FROM (SELECT 238 AS CONSUMER_ID, 
             ACTION_LOG_ID, 
            COMMUNICATIONS_ID 
     FROM consumer_action_log 
     WHERE COMM_TYPE_ID=4) AS cr
JOIN network_communications AS nc 
    ON cr.COMMUNICATIONS_ID=nc.COMMUNICATIONS_ID
JOIN consumer_action_log AS cal 
    ON cr.ACTION_LOG_ID=cal.ACTION_LOG_ID
JOIN communication_interest_mapping AS cim 
    ON nc.PARENT_COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_interest_rating_mapping AS cirm 
    ON cr.CONSUMER_ID=cirm.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=cirm.CONSUMER_INTEREST_ID
JOIN consumer_expert_score AS ces 
    ON nc.SENDER_CONSUMER_ID=ces.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs 
    ON cr.CONSUMER_ID=scs.CONSUMER_ID_2 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
    OR cr.CONSUMER_ID=scs.CONSUMER_ID_1 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
CROSS JOIN
    (
        SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
        FROM CONSUMER_ACTION_LOG AS cal
        JOIN (SELECT 234 AS CONSUMER_ID, 
                     ACTION_LOG_ID, 
                     COMMUNICATIONS_ID 
              FROM consumer_action_log 
              WHERE COMM_TYPE_ID=4) AS cr 
        ON cal.SENDER_CONSUMER_ID=cr.consumer_id
    ) ATV) AS ASM
GROUP BY ANSWER_ID
UNION
SELECT 1 AS RELEVANCE_SCORE_TYPE,
       qcr.CONSUMER_ID AS Viewer_ID, 
       qcr.COMMUNICATIONS_ID, 
       case when reply.replies IS NOT NULL AND qcr.CONSUMER_ID <> qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(ces.EXPERT_SCORE+2.5*scs.SIMILARITY)*(EXP(-reply.replies))
       when reply.replies IS NULL AND qcr.CONSUMER_ID <> qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(ces.EXPERT_SCORE+2.5*scs.SIMILARITY)*(EXP(0))
       when reply.replies IS NULL AND qcr.CONSUMER_ID = qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(7.5)*(1-EXP(0))
       when reply.replies IS NOT NULL AND qcr.CONSUMER_ID = qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(7.5)*(1-EXP(-reply.replies))
            else null
       end as QUESTION_SCORE
FROM (SELECT 238 AS CONSUMER_ID,
            SENDER_CONSUMER_ID, 
            COMMUNICATIONS_ID,
            LAST_MOD_TIME
     FROM network_communications
     WHERE NETWORK_COMM_TYPE_ID=1) AS qcr
JOIN communication_interest_mapping AS cim 
    ON qcr.COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_expert_score AS ces
    ON ces.CONSUMER_ID=qcr.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs 
    ON qcr.CONSUMER_ID=scs.CONSUMER_ID_2 
    AND qcr.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
    OR qcr.CONSUMER_ID=scs.CONSUMER_ID_1 
    AND qcr.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
LEFT JOIN (SELECT COUNT(*) AS replies, 
           PARENT_COMMUNICATIONS_ID
           FROM network_communications AS nc1
           WHERE NETWORK_COMM_TYPE_ID=2
GROUP BY PARENT_COMMUNICATIONS_ID) AS reply
    ON qcr.COMMUNICATIONS_ID=reply.PARENT_COMMUNICATIONS_ID
ORDER BY RELEVANCE_SCORE DESC;
FROM (SELECT 234 AS CONSUMER_ID, 
             ACTION_LOG_ID, 
            COMMUNICATIONS_ID 
     FROM consumer_action_log 
     WHERE COMM_TYPE_ID=4) AS cr
JOIN network_communications AS nc 
    ON cr.COMMUNICATIONS_ID=nc.COMMUNICATIONS_ID
JOIN consumer_action_log AS cal 
    ON cr.ACTION_LOG_ID=cal.ACTION_LOG_ID
JOIN communication_interest_mapping AS cim 
    ON nc.PARENT_COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_interest_rating_mapping AS cirm 
    ON cr.CONSUMER_ID=cirm.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=cirm.CONSUMER_INTEREST_ID
JOIN consumer_expert_score AS ces 
    ON nc.SENDER_CONSUMER_ID=ces.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs 
    ON cr.CONSUMER_ID=scs.CONSUMER_ID_2 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
    OR cr.CONSUMER_ID=scs.CONSUMER_ID_1 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
CROSS JOIN
    (
        SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
        FROM CONSUMER_ACTION_LOG AS cal
        JOIN (SELECT 234 AS CONSUMER_ID, 
                     ACTION_LOG_ID, 
                     COMMUNICATIONS_ID 
              FROM consumer_action_log 
              WHERE COMM_TYPE_ID=4) AS cr 
        ON cal.SENDER_CONSUMER_ID=cr.consumer_id
    ) ATV) AS ASM
GROUP BY ANSWER_ID
ORDER BY ANSWER_SCORE_SUMMED DESC;

It's long, so don't feel obligated to read the whole thing. The gist is simply that the queries on both sides of the union utilizes a lot of the same tables.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

奢望 2024-10-21 19:29:01

我的第一个想法是,不值得尝试超越 MySQL 优化器。特别是因为您正在进行 18 次连接和 2 次交叉连接。

UNION 两侧的查询使用大量相同的表并不罕见。我希望优化器能够处理这种情况。

为了获得更好的性能,您需要查看执行配置文件,并且可能需要重写查询以消除交叉联接。要查看执行配置文件,请运行这两个命令之一。

EXPLAIN <your query>
EXPLAIN EXTENDED <your query>

My first thought was that it's not worth trying to out-think the MySQL optimizer. Especially since you're doing 18 joins and 2 cross joins.

It's not unusual for queries on both sides of a UNION to use a lot of the same tables. It's the kind of situation I'd expect the optimizer to be able to deal with.

To get better performance, you'll need to look at the execution profile and, probably, rewrite the query to eliminate the cross joins. To look at the execution profile, run one of these two.

EXPLAIN <your query>
EXPLAIN EXTENDED <your query>
夏有森光若流苏 2024-10-21 19:29:01

看起来顶部和底部之间有一些细微的差别。因此,不要认为没有 UNION 就可以轻松重写。但是,如果您确定来自顶部和底部查询的数据不会重叠,请使用 UNION ALL,其中 UNION 尝试使结果不同,从而使引擎执行额外的工作,如果有很多记录,这可能会相当大。

根据 @Catcall 的建议,使用 EXPLAIN 将验证 UNION ALL 方法。

Looks there are some subtle differences between the top and bottom. So don't think you can easily rewrite without the UNION. However, if you're certain the data from the top and bottom queries will not overlap use UNION ALL, where UNION attempts to make the results distinct making the engine do additional work, which can be considerable if there are a lot of records.

As per @Catcall's suggestion, using EXPLAIN will validate the UNION ALL approach.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文