How to optimise this MySQL query?
SELECT results.idpatient AS nhs_number, TIMESTAMPDIFF(YEAR, STR_TO_DATE(results.dob,'%Y-%m-%d'), CURDATE()) AS age, most_overdue.days_overdue,
most_overdue.current_status, most_overdue.action, indications.associated_indications
FROM mytable AS results
INNER JOIN (
SELECT
idpatient,
MAX(days_overdue) as days_overdue,
SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), current_status)), 12) AS current_status,
SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), action)), 12) AS action
FROM
mytable
GROUP BY
idpatient
) AS most_overdue
ON results.idpatient = most_overdue.idpatient
INNER JOIN (
SELECT idpatient, action, GROUP_CONCAT(indication SEPARATOR ',') as associated_indications
FROM mytable
GROUP BY idpatient, action
) AS indications
ON results.idpatient = indications.idpatient AND most_overdue.action = indications.action
GROUP BY results.idpatient
LIMIT 0,100;
The above query takes ~4s to run on my big MySQL DB, and the issue seems to be the GROUP BYs.
See a simplified Fiddle here.
I read MySQL's page on optimizing GROUP BY, but this doesn't seem to cover its use in conjunction with GROUP_CONCAT.
The schema is as follows:
CREATE TABLE `mytable` (
`idpatient` varchar(32) NOT NULL,
`indication` varchar(255) NOT NULL,
`action` varchar(255) NOT NULL,
`current_status` varchar(255) NOT NULL,
`query_type` varchar(255) NOT NULL,
`last_date` varchar(255) NOT NULL,
`days_overdue` bigint(20) DEFAULT NULL,
`dob` varchar(255) NOT NULL,
PRIMARY KEY (`idpatient`,`indication`,`action`),
KEY `action_idx` (`action`),
KEY `indication_idx` (`indication`),
KEY `idpatient_action_idx` (`idpatient`,`action`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
EXPLAIN tells me the indication_idx and idpatient_action_idx are in use when running the slow query.
Any help towards getting this under 1s would be much appreciated, thanks!
Comments (2)
I guess, without any evidence to support my guess, that you want your result set to show the hundred lowest idpatient values.
Your query is made of two subqueries and a main query. Let's start with the first subquery.
Notice that I included your LIMIT clause in this subquery.
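The subquery this sentence refers to is not shown here. A plausible reconstruction, assuming it is simply the question's most_overdue subquery with ORDER BY and LIMIT added so that only the hundred lowest idpatient values are aggregated:

```sql
-- Sketch only: the question's first subquery, limited to the first 100 patients.
-- The ORDER BY idpatient is an assumption that matches the "hundred lowest
-- idpatient values" guess above.
SELECT idpatient,
       MAX(days_overdue) AS days_overdue,
       -- Prefix each value with the zero-padded days_overdue so MAX() picks
       -- the value from the most overdue row, then strip the 11-char prefix.
       SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), current_status)), 12) AS current_status,
       SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), action)), 12) AS action
FROM mytable
GROUP BY idpatient
ORDER BY idpatient
LIMIT 100;
```

Limiting inside the subquery means the outer joins only have to handle about 100 patients instead of every patient in the table.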
This subquery will benefit from the following covering index.
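The index definition itself is missing here. Judging from the description that follows (two leading columns supporting the GROUP BY, then current_status and action), it was probably something along these lines; the index name idpatient_overdue_idx is made up for illustration:

```sql
-- Sketch only: a covering index for the first subquery.
-- idpatient and days_overdue support GROUP BY idpatient with MAX(days_overdue);
-- current_status and action let the subquery be answered from the index alone.
CREATE INDEX idpatient_overdue_idx
    ON mytable (idpatient, days_overdue, current_status, action);

-- Narrower alternative for smaller tables:
-- CREATE INDEX idpatient_overdue_idx ON mytable (idpatient, days_overdue);
```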
The first two columns of the index support your GROUP BY operation. If your table is smaller than something like 100k rows you can omit the last two columns from that index; they only serve to make the lookup of current_status and action a bit faster.
With the index in place, examine the performance of that subquery and look at its EXPLAIN output. Convince yourself it's fast enough for your purposes.
Now for the second subquery. Your idpatient_action_idx index helps with this query. You should add the ORDER BY / LIMIT clauses to it. You could also add indication to that index to make it into a covering index.
Make the index change if you want to. Then examine the performance of that subquery as well. If both the subqueries have acceptable performance your main query probably will too.
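For the second subquery, one thing to watch is that a plain LIMIT 100 on the grouped rows would count (idpatient, action) groups rather than patients, so it could cut off some of the first hundred patients. A rough sketch that limits by patient instead, which is an interpretation of the advice above rather than the answer's own query (the aliases m and first_patients are made up):

```sql
-- Sketch only: aggregate indications for the 100 lowest idpatient values.
SELECT m.idpatient,
       m.action,
       GROUP_CONCAT(m.indication SEPARATOR ',') AS associated_indications
FROM mytable AS m
INNER JOIN (
    -- Same patient window as the first subquery.
    SELECT DISTINCT idpatient
    FROM mytable
    ORDER BY idpatient
    LIMIT 100
) AS first_patients
    ON first_patients.idpatient = m.idpatient
GROUP BY m.idpatient, m.action
ORDER BY m.idpatient, m.action;
```

On InnoDB, secondary indexes also store the primary key columns, so it is worth checking whether EXPLAIN already reports "Using index" for this query with the existing idpatient_action_idx before adding indication to it.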
Does this query give the correct result (I assume it is still slow)?
This may be a cleaner way to derive the current_status/action associated with most days_overdue:
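The query that followed this sentence is not preserved here. As a rough illustration of one such approach, and not necessarily the one the answerer had in mind, a window function can pick the most overdue row per patient directly. This assumes MySQL 8.0 or later; the aliases ranked and rn are made up:

```sql
-- Sketch only: take current_status and action from the row with the
-- highest days_overdue for each patient (requires MySQL 8.0+).
SELECT idpatient, days_overdue, current_status, action
FROM (
    SELECT idpatient, days_overdue, current_status, action,
           ROW_NUMBER() OVER (
               PARTITION BY idpatient
               -- DESC sorts NULL days_overdue last; action breaks ties.
               ORDER BY days_overdue DESC, action DESC
           ) AS rn
    FROM mytable
) AS ranked
WHERE rn = 1;
```

Unlike the MAX(CONCAT(LPAD(...))) trick, this returns current_status and action from the same row, so results can differ when several rows tie on days_overdue or when days_overdue is NULL for every row of a patient.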
I can't help thinking that it should be possible to remove at least one access to mytable, but right now I don't see how.