优化 UNION MySQL 查询,其中两个查询使用大量相同的表
我在 MySQL 中使用 UNION 来将两个单独查询的结果组合在一起。 这两个查询使用很多相同的表。有没有办法利用这一点来优化查询?
SELECT 2 AS RELEVANCE_SCORE_TYPE,
VIEWER_ID,
QUESTION_ID,
sum(ANSWER_SCORE) AS RELEVANCE_SCORE
FROM(SELECT cr.COMMUNICATIONS_ID AS ANSWER_ID,
cr.CONSUMER_ID as VIEWER_ID,
nc.PARENT_COMMUNICATIONS_ID AS QUESTION_ID,
case when cr.CONSUMER_ID= nc.SENDER_CONSUMER_ID then 3*((24/(((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(cal.LAST_MOD_TIME)+3600)/3600))*(ces.EXPERT_SCORE * cirm.CONSUMER_RATING) + (12.5 * scs.SIMILARITY)* (1 - EXP(-0.5 * (cal.TIPS_AMOUNT / ATV.AVG_TIPS)) + .15)))
else ((24/(((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(cal.LAST_MOD_TIME)+3600)/3600))*(ces.EXPERT_SCORE * cirm.CONSUMER_RATING) + (12.5 * scs.SIMILARITY)* (1 - EXP(-0.5 * (cal.TIPS_AMOUNT / ATV.AVG_TIPS)) + .15)))
end as ANSWER_SCORE
FROM (SELECT 238 AS CONSUMER_ID,
ACTION_LOG_ID,
COMMUNICATIONS_ID
FROM consumer_action_log
WHERE COMM_TYPE_ID=4) AS cr
JOIN network_communications AS nc
ON cr.COMMUNICATIONS_ID=nc.COMMUNICATIONS_ID
JOIN consumer_action_log AS cal
ON cr.ACTION_LOG_ID=cal.ACTION_LOG_ID
JOIN communication_interest_mapping AS cim
ON nc.PARENT_COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_interest_rating_mapping AS cirm
ON cr.CONSUMER_ID=cirm.CONSUMER_ID
AND cim.CONSUMER_INTEREST_EXPERT_ID=cirm.CONSUMER_INTEREST_ID
JOIN consumer_expert_score AS ces
ON nc.SENDER_CONSUMER_ID=ces.CONSUMER_ID
AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs
ON cr.CONSUMER_ID=scs.CONSUMER_ID_2
AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1
OR cr.CONSUMER_ID=scs.CONSUMER_ID_1
AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
CROSS JOIN
(
SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
FROM CONSUMER_ACTION_LOG AS cal
JOIN (SELECT 234 AS CONSUMER_ID,
ACTION_LOG_ID,
COMMUNICATIONS_ID
FROM consumer_action_log
WHERE COMM_TYPE_ID=4) AS cr
ON cal.SENDER_CONSUMER_ID=cr.consumer_id
) ATV) AS ASM
GROUP BY ANSWER_ID
UNION
SELECT 1 AS RELEVANCE_SCORE_TYPE,
qcr.CONSUMER_ID AS Viewer_ID,
qcr.COMMUNICATIONS_ID,
case when reply.replies IS NOT NULL AND qcr.CONSUMER_ID <> qcr.SENDER_CONSUMER_ID then
24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(ces.EXPERT_SCORE+2.5*scs.SIMILARITY)*(EXP(-reply.replies))
when reply.replies IS NULL AND qcr.CONSUMER_ID <> qcr.SENDER_CONSUMER_ID then
24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(ces.EXPERT_SCORE+2.5*scs.SIMILARITY)*(EXP(0))
when reply.replies IS NULL AND qcr.CONSUMER_ID = qcr.SENDER_CONSUMER_ID then
24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(7.5)*(1-EXP(0))
when reply.replies IS NOT NULL AND qcr.CONSUMER_ID = qcr.SENDER_CONSUMER_ID then
24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(7.5)*(1-EXP(-reply.replies))
else null
end as QUESTION_SCORE
FROM (SELECT 238 AS CONSUMER_ID,
SENDER_CONSUMER_ID,
COMMUNICATIONS_ID,
LAST_MOD_TIME
FROM network_communications
WHERE NETWORK_COMM_TYPE_ID=1) AS qcr
JOIN communication_interest_mapping AS cim
ON qcr.COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_expert_score AS ces
ON ces.CONSUMER_ID=qcr.CONSUMER_ID
AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs
ON qcr.CONSUMER_ID=scs.CONSUMER_ID_2
AND qcr.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1
OR qcr.CONSUMER_ID=scs.CONSUMER_ID_1
AND qcr.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
LEFT JOIN (SELECT COUNT(*) AS replies,
PARENT_COMMUNICATIONS_ID
FROM network_communications AS nc1
WHERE NETWORK_COMM_TYPE_ID=2
GROUP BY PARENT_COMMUNICATIONS_ID) AS reply
ON qcr.COMMUNICATIONS_ID=reply.PARENT_COMMUNICATIONS_ID
ORDER BY RELEVANCE_SCORE DESC;
FROM (SELECT 234 AS CONSUMER_ID,
ACTION_LOG_ID,
COMMUNICATIONS_ID
FROM consumer_action_log
WHERE COMM_TYPE_ID=4) AS cr
JOIN network_communications AS nc
ON cr.COMMUNICATIONS_ID=nc.COMMUNICATIONS_ID
JOIN consumer_action_log AS cal
ON cr.ACTION_LOG_ID=cal.ACTION_LOG_ID
JOIN communication_interest_mapping AS cim
ON nc.PARENT_COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_interest_rating_mapping AS cirm
ON cr.CONSUMER_ID=cirm.CONSUMER_ID
AND cim.CONSUMER_INTEREST_EXPERT_ID=cirm.CONSUMER_INTEREST_ID
JOIN consumer_expert_score AS ces
ON nc.SENDER_CONSUMER_ID=ces.CONSUMER_ID
AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs
ON cr.CONSUMER_ID=scs.CONSUMER_ID_2
AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1
OR cr.CONSUMER_ID=scs.CONSUMER_ID_1
AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
CROSS JOIN
(
SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
FROM CONSUMER_ACTION_LOG AS cal
JOIN (SELECT 234 AS CONSUMER_ID,
ACTION_LOG_ID,
COMMUNICATIONS_ID
FROM consumer_action_log
WHERE COMM_TYPE_ID=4) AS cr
ON cal.SENDER_CONSUMER_ID=cr.consumer_id
) ATV) AS ASM
GROUP BY ANSWER_ID
ORDER BY ANSWER_SCORE_SUMMED DESC;
它很长,所以不必觉得有义务阅读整篇文章。要点很简单,联合两侧的查询使用许多相同的表。
I am using the UNION in MySQL in order to bring together the results of two separate queries.
These two queries use a lot of the same tables. Is there anyway to take advantage of this in order to optimize the query?
SELECT 2 AS RELEVANCE_SCORE_TYPE,
VIEWER_ID,
QUESTION_ID,
sum(ANSWER_SCORE) AS RELEVANCE_SCORE
FROM(SELECT cr.COMMUNICATIONS_ID AS ANSWER_ID,
cr.CONSUMER_ID as VIEWER_ID,
nc.PARENT_COMMUNICATIONS_ID AS QUESTION_ID,
case when cr.CONSUMER_ID= nc.SENDER_CONSUMER_ID then 3*((24/(((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(cal.LAST_MOD_TIME)+3600)/3600))*(ces.EXPERT_SCORE * cirm.CONSUMER_RATING) + (12.5 * scs.SIMILARITY)* (1 - EXP(-0.5 * (cal.TIPS_AMOUNT / ATV.AVG_TIPS)) + .15)))
else ((24/(((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(cal.LAST_MOD_TIME)+3600)/3600))*(ces.EXPERT_SCORE * cirm.CONSUMER_RATING) + (12.5 * scs.SIMILARITY)* (1 - EXP(-0.5 * (cal.TIPS_AMOUNT / ATV.AVG_TIPS)) + .15)))
end as ANSWER_SCORE
FROM (SELECT 238 AS CONSUMER_ID,
ACTION_LOG_ID,
COMMUNICATIONS_ID
FROM consumer_action_log
WHERE COMM_TYPE_ID=4) AS cr
JOIN network_communications AS nc
ON cr.COMMUNICATIONS_ID=nc.COMMUNICATIONS_ID
JOIN consumer_action_log AS cal
ON cr.ACTION_LOG_ID=cal.ACTION_LOG_ID
JOIN communication_interest_mapping AS cim
ON nc.PARENT_COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_interest_rating_mapping AS cirm
ON cr.CONSUMER_ID=cirm.CONSUMER_ID
AND cim.CONSUMER_INTEREST_EXPERT_ID=cirm.CONSUMER_INTEREST_ID
JOIN consumer_expert_score AS ces
ON nc.SENDER_CONSUMER_ID=ces.CONSUMER_ID
AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs
ON cr.CONSUMER_ID=scs.CONSUMER_ID_2
AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1
OR cr.CONSUMER_ID=scs.CONSUMER_ID_1
AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
CROSS JOIN
(
SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
FROM CONSUMER_ACTION_LOG AS cal
JOIN (SELECT 234 AS CONSUMER_ID,
ACTION_LOG_ID,
COMMUNICATIONS_ID
FROM consumer_action_log
WHERE COMM_TYPE_ID=4) AS cr
ON cal.SENDER_CONSUMER_ID=cr.consumer_id
) ATV) AS ASM
GROUP BY ANSWER_ID
UNION
SELECT 1 AS RELEVANCE_SCORE_TYPE,
qcr.CONSUMER_ID AS Viewer_ID,
qcr.COMMUNICATIONS_ID,
case when reply.replies IS NOT NULL AND qcr.CONSUMER_ID <> qcr.SENDER_CONSUMER_ID then
24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(ces.EXPERT_SCORE+2.5*scs.SIMILARITY)*(EXP(-reply.replies))
when reply.replies IS NULL AND qcr.CONSUMER_ID <> qcr.SENDER_CONSUMER_ID then
24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(ces.EXPERT_SCORE+2.5*scs.SIMILARITY)*(EXP(0))
when reply.replies IS NULL AND qcr.CONSUMER_ID = qcr.SENDER_CONSUMER_ID then
24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(7.5)*(1-EXP(0))
when reply.replies IS NOT NULL AND qcr.CONSUMER_ID = qcr.SENDER_CONSUMER_ID then
24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(7.5)*(1-EXP(-reply.replies))
else null
end as QUESTION_SCORE
FROM (SELECT 238 AS CONSUMER_ID,
SENDER_CONSUMER_ID,
COMMUNICATIONS_ID,
LAST_MOD_TIME
FROM network_communications
WHERE NETWORK_COMM_TYPE_ID=1) AS qcr
JOIN communication_interest_mapping AS cim
ON qcr.COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_expert_score AS ces
ON ces.CONSUMER_ID=qcr.CONSUMER_ID
AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs
ON qcr.CONSUMER_ID=scs.CONSUMER_ID_2
AND qcr.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1
OR qcr.CONSUMER_ID=scs.CONSUMER_ID_1
AND qcr.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
LEFT JOIN (SELECT COUNT(*) AS replies,
PARENT_COMMUNICATIONS_ID
FROM network_communications AS nc1
WHERE NETWORK_COMM_TYPE_ID=2
GROUP BY PARENT_COMMUNICATIONS_ID) AS reply
ON qcr.COMMUNICATIONS_ID=reply.PARENT_COMMUNICATIONS_ID
ORDER BY RELEVANCE_SCORE DESC;
FROM (SELECT 234 AS CONSUMER_ID,
ACTION_LOG_ID,
COMMUNICATIONS_ID
FROM consumer_action_log
WHERE COMM_TYPE_ID=4) AS cr
JOIN network_communications AS nc
ON cr.COMMUNICATIONS_ID=nc.COMMUNICATIONS_ID
JOIN consumer_action_log AS cal
ON cr.ACTION_LOG_ID=cal.ACTION_LOG_ID
JOIN communication_interest_mapping AS cim
ON nc.PARENT_COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_interest_rating_mapping AS cirm
ON cr.CONSUMER_ID=cirm.CONSUMER_ID
AND cim.CONSUMER_INTEREST_EXPERT_ID=cirm.CONSUMER_INTEREST_ID
JOIN consumer_expert_score AS ces
ON nc.SENDER_CONSUMER_ID=ces.CONSUMER_ID
AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs
ON cr.CONSUMER_ID=scs.CONSUMER_ID_2
AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1
OR cr.CONSUMER_ID=scs.CONSUMER_ID_1
AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
CROSS JOIN
(
SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
FROM CONSUMER_ACTION_LOG AS cal
JOIN (SELECT 234 AS CONSUMER_ID,
ACTION_LOG_ID,
COMMUNICATIONS_ID
FROM consumer_action_log
WHERE COMM_TYPE_ID=4) AS cr
ON cal.SENDER_CONSUMER_ID=cr.consumer_id
) ATV) AS ASM
GROUP BY ANSWER_ID
ORDER BY ANSWER_SCORE_SUMMED DESC;
It's long, so don't feel obligated to read the whole thing. The gist is simply that the queries on both sides of the union utilizes a lot of the same tables.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我的第一个想法是,不值得尝试超越 MySQL 优化器。特别是因为您正在进行 18 次连接和 2 次交叉连接。
UNION 两侧的查询使用大量相同的表并不罕见。我希望优化器能够处理这种情况。
为了获得更好的性能,您需要查看执行配置文件,并且可能需要重写查询以消除交叉联接。要查看执行配置文件,请运行这两个命令之一。
My first thought was that it's not worth trying to out-think the MySQL optimizer. Especially since you're doing 18 joins and 2 cross joins.
It's not unusual for queries on both sides of a UNION to use a lot of the same tables. It's the kind of situation I'd expect the optimizer to be able to deal with.
To get better performance, you'll need to look at the execution profile and, probably, rewrite the query to eliminate the cross joins. To look at the execution profile, run one of these two.
看起来顶部和底部之间有一些细微的差别。因此,不要认为没有
UNION
就可以轻松重写。但是,如果您确定来自顶部和底部查询的数据不会重叠,请使用UNION ALL
,其中UNION
尝试使结果不同,从而使引擎执行额外的工作,如果有很多记录,这可能会相当大。根据 @Catcall 的建议,使用
EXPLAIN
将验证UNION ALL
方法。Looks there are some subtle differences between the top and bottom. So don't think you can easily rewrite without the
UNION
. However, if you're certain the data from the top and bottom queries will not overlap useUNION ALL
, whereUNION
attempts to make the results distinct making the engine do additional work, which can be considerable if there are a lot of records.As per @Catcall's suggestion, using
EXPLAIN
will validate theUNION ALL
approach.