我的子查询增加了 20 秒的执行时间。我怎样才能加快速度?
我有一个已发送 SMS 文本消息表,必须将其连接到送达收据表才能获取消息的最新状态。
已发送短信 997,148 条。
我正在运行此查询:
SELECT
m.id,
m.user_id,
m.api_key,
m.to,
m.message,
m.sender_id,
m.route,
m.submission_reference,
m.unique_submission_reference,
m.reason_code,
m.timestamp,
d.id AS dlrid,
d.dlr_status
FROM
messages_sent m
LEFT JOIN
delivery_receipts d
ON
d.message_id = m.id
AND
d.id = (SELECT MAX(id) FROM delivery_receipts WHERE message_id = m.id)
它返回 997,148 个结果,包括每条消息的最新状态。
这需要 22.8688 秒来执行。
以下是 messages_sent
的 SQL:
CREATE TABLE IF NOT EXISTS `messages_sent` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(10) unsigned NOT NULL,
`api_key` varchar(40) NOT NULL,
`to` varchar(15) NOT NULL,
`message` text NOT NULL,
`type` enum('sms','mms') NOT NULL DEFAULT 'sms',
`sender_id` varchar(15) NOT NULL,
`route` tinyint(1) unsigned NOT NULL,
`supplier` tinyint(1) unsigned NOT NULL,
`submission_reference` varchar(40) NOT NULL,
`unique_submission_reference` varchar(40) NOT NULL,
`reason_code` tinyint(1) unsigned NOT NULL,
`reason` text NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `api_key` (`api_key`),
KEY `sender_id` (`sender_id`),
KEY `route` (`route`),
KEY `submission_reference` (`submission_reference`),
KEY `reason_code` (`reason_code`),
KEY `timestamp` (`timestamp`),
KEY `to` (`to`),
KEY `unique_submission_reference` (`unique_submission_reference`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1000342 ;
对于 delivery_receipts
:
CREATE TABLE IF NOT EXISTS `delivery_receipts` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`message_id` int(10) unsigned NOT NULL,
`dlr_id` bigint(20) unsigned NOT NULL,
`dlr_status` tinyint(2) unsigned NOT NULL,
`dlr_substatus` tinyint(2) unsigned NOT NULL,
`dlr_final` tinyint(1) unsigned NOT NULL,
`dlr_refid` varchar(40) NOT NULL,
`dlr_phone` varchar(12) NOT NULL,
`dlr_charge` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `message_id` (`message_id`),
KEY `dlr_status` (`dlr_status`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1468592 ;
以下是 SQL 的 EXPLAIN
:
I have a table of sent SMS text messages which must join to a delivery receipt table to get the latest status of a message.
There are 997,148 sent text messages.
I am running this query:
SELECT
m.id,
m.user_id,
m.api_key,
m.to,
m.message,
m.sender_id,
m.route,
m.submission_reference,
m.unique_submission_reference,
m.reason_code,
m.timestamp,
d.id AS dlrid,
d.dlr_status
FROM
messages_sent m
LEFT JOIN
delivery_receipts d
ON
d.message_id = m.id
AND
d.id = (SELECT MAX(id) FROM delivery_receipts WHERE message_id = m.id)
Which returns 997,148 results including the latest status of each message.
This takes 22.8688 seconds to execute.
Here is the SQL for messages_sent
:
CREATE TABLE IF NOT EXISTS `messages_sent` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(10) unsigned NOT NULL,
`api_key` varchar(40) NOT NULL,
`to` varchar(15) NOT NULL,
`message` text NOT NULL,
`type` enum('sms','mms') NOT NULL DEFAULT 'sms',
`sender_id` varchar(15) NOT NULL,
`route` tinyint(1) unsigned NOT NULL,
`supplier` tinyint(1) unsigned NOT NULL,
`submission_reference` varchar(40) NOT NULL,
`unique_submission_reference` varchar(40) NOT NULL,
`reason_code` tinyint(1) unsigned NOT NULL,
`reason` text NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `api_key` (`api_key`),
KEY `sender_id` (`sender_id`),
KEY `route` (`route`),
KEY `submission_reference` (`submission_reference`),
KEY `reason_code` (`reason_code`),
KEY `timestamp` (`timestamp`),
KEY `to` (`to`),
KEY `unique_submission_reference` (`unique_submission_reference`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1000342 ;
And for delivery_receipts
:
CREATE TABLE IF NOT EXISTS `delivery_receipts` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`message_id` int(10) unsigned NOT NULL,
`dlr_id` bigint(20) unsigned NOT NULL,
`dlr_status` tinyint(2) unsigned NOT NULL,
`dlr_substatus` tinyint(2) unsigned NOT NULL,
`dlr_final` tinyint(1) unsigned NOT NULL,
`dlr_refid` varchar(40) NOT NULL,
`dlr_phone` varchar(12) NOT NULL,
`dlr_charge` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `message_id` (`message_id`),
KEY `dlr_status` (`dlr_status`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1468592 ;
Here is an EXPLAIN
of the SQL:
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
有一个技巧。
相反,通过子查询选择 MAX 元素,您可以像这样与有趣的表连接两次:
连接第二个时间表,它有一个附加条件,即您要选择 MAX 的字段应该高于第一个表中的字段。并过滤掉除那些没有更高行的行之外的所有行。
这样只保留最大行数。
我将您的 LEFT JOIN 更改为 JOIN。我不确定你是否需要 LEFT JOIN 那里。即使你它应该仍然有效。
令人惊讶的是,这比子查询快得多。
您可能想尝试相同想法的其他变体:
确保表delivery_receipts中的message_id和id字段有多列索引可能是这样的:
There is a trick.
Instead with picking MAX element with subquery you join with interesting table twice like this:
The second time table is joined it has additional condition that field that you want to pick MAX of should be higher than in the first table. And filter out all rows except the ones that do not have other row that's higher.
This way only max rows remain.
I changed your LEFT JOIN to JOIN. I'm not sure if you need LEFT JOIN there. Even if you it should still work.
Amazingly this is much faster than subquery.
You might want to try out other variant of the same idea:
Make sure you have multicolumn index for fields message_id and id in table delivery_receipts maybe such:
速度下降看起来很大,但如果你需要坚持这个查询,恐怕没有太大的改进空间。
其中一个问题是
d.dlr_status
的报告。尝试将其从报告列的列表中删除,并查看查询时间是否有所改善。如果所有内容都存储在
messages_sent
中,您将获得最佳性能。这不再是 NF,但如果您需要性能,它是一个选择。为此,请在messages_sent
中创建id
和dlr_status
列,并添加适当的INSERT
、UPDATE 和
DELETE
触发delivery_receipts
。触发器将更新messages_sent中的相应列——这是查询时间和更新时间之间的权衡。The slowdown seems large, but I'm afraid there is not much room for improvement if you need to stick with this query.
One problem is the reporting of
d.dlr_status
. Try to remove this from the list of reported columns and see if the query time improves.You would get the best possible performance if everything was stored in
messages_sent
. This won't be NF anymore, but it's an option if you need performance. To achieve this, createid
anddlr_status
columns inmessages_sent
and add appropriateINSERT
,UPDATE
andDELETE
triggers todelivery_receipts
. The triggers would update the corresponding columns inmessages_sent
-- it's a trade-off between query time and update time.您可以在delivery_receipts表中“缓存”部分计算,只需将is_last_status布尔值添加到delivery_receipts表中即可。使用简单的触发器,您可以更改每次插入新收据的值。
比 select 查询变得简单得多:
如果 mysql 支持部分索引,查询速度可能会更快。
You can "cache" part of the computation in the delivery_receipts table, just add is_last_status boolean to the delivery_receipts table. Using simple triggers you can change the value every insert of new receipt.
Than the select query becomes much simpler:
If mysql would support partial indexes the query could be speed up even more.