Grouping - How can I optimize this MySQL query?

Posted on 2025-02-10 18:34:09

SELECT results.idpatient AS nhs_number, TIMESTAMPDIFF(YEAR, STR_TO_DATE(results.dob,'%Y-%m-%d'), CURDATE()) AS age, most_overdue.days_overdue,
most_overdue.current_status, most_overdue.action, indications.associated_indications
FROM mytable AS results
INNER JOIN (
    SELECT
        idpatient,
        MAX(days_overdue) as days_overdue,
        SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), current_status)), 12) AS current_status,
        SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), action)), 12) AS action
    FROM
        mytable
    GROUP BY
        idpatient
) AS most_overdue
ON results.idpatient = most_overdue.idpatient
INNER JOIN (
    SELECT idpatient, action, GROUP_CONCAT(indication SEPARATOR ',') as associated_indications
    FROM mytable
    GROUP BY idpatient, action
) AS indications
ON results.idpatient = indications.idpatient AND most_overdue.action = indications.action
GROUP BY results.idpatient
LIMIT 0,100;

The above query takes ~4s to run on my big MySQL DB, and the issue seems to be the GROUP BYs.

See a simplified Fiddle here.

I read MySQL's page on optimizing GROUP BY, but this doesn't seem to cover its use in conjunction with GROUP_CONCAT.

The schema is as follows:

CREATE TABLE `mytable` (
  `idpatient` varchar(32) NOT NULL,
  `indication` varchar(255) NOT NULL,
  `action` varchar(255) NOT NULL,
  `current_status` varchar(255) NOT NULL,
  `query_type` varchar(255) NOT NULL,
  `last_date` varchar(255) NOT NULL,
  `days_overdue` bigint(20) DEFAULT NULL,
  `dob` varchar(255) NOT NULL,
  PRIMARY KEY (`idpatient`,`indication`,`action`),
  KEY `action_idx` (`action`),
  KEY `indication_idx` (`indication`),
  KEY `idpatient_action_idx` (`idpatient`,`action`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

EXPLAIN tells me the indication_idx and idpatient_action_idx are in use when running the slow query.

Any help towards getting this under 1s would be much appreciated, thanks!

Comments (2)

飞烟轻若梦 2025-02-17 18:34:09

I guess, without any evidence to support my guess, that you want your result set to show the hundred lowest idpatient values.

Your query is made of two subqueries and a main query. Let's start with the first subquery.

SELECT 
  idpatient,
  MAX(days_overdue) as days_overdue,
  SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), current_status)), 12) AS current_status,
  SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), action)), 12) AS action
FROM    mytable
GROUP BY idpatient
ORDER BY idpatient
LIMIT    0, 100;

Notice that I included your LIMIT clause in this subquery.

This subquery will benefit from the following covering index.

ALTER TABLE mytable 
  ADD INDEX patient_days_status_activity
            (idpatient, days_overdue, current_status, action);

The first two columns of the index support your GROUP BY operation. If your table is smaller than roughly 100k rows, you can omit the last two columns from that index; they only serve to make the lookup of current_status and action a bit faster.

With the index in place, examine the performance of that subquery and look at the EXPLAIN output. Convince yourself it's fast enough for your purposes.
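
As a rough sketch of that check (assuming the patient_days_status_activity index above has been created), you could prefix the subquery with EXPLAIN. I would expect the key column to name the new index and the Extra column to show "Using index", which means the query is answered from the index alone; "Using temporary" or "Using filesort" in Extra would suggest the index is not helping the GROUP BY.

```sql
-- Check the plan of the first subquery after adding the covering index.
EXPLAIN
SELECT
  idpatient,
  MAX(days_overdue) AS days_overdue,
  SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), current_status)), 12) AS current_status,
  SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), action)), 12) AS action
FROM    mytable
GROUP BY idpatient
ORDER BY idpatient
LIMIT    0, 100;
```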

Now for the second subquery. Your idpatient_action_idx index helps with this query. You should add the ORDER BY / LIMIT clauses to it. You could also add indication to that index to make it into a covering index.

ALTER TABLE mytable
  DROP INDEX idpatient_action_idx,
  ADD INDEX idpatient_action_idx (idpatient, action, indication);

Make the index change if you want to. Then examine the performance of that subquery as well. If both subqueries have acceptable performance, your main query probably will too.
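
For what it's worth, here is one way the pieces might be put back together. This is a sketch, not a tested drop-in replacement, and it varies slightly from what I described: instead of keeping the second subquery as a derived table, it joins mytable directly to the 100 limited rows and runs GROUP_CONCAT in the outer query, so only the rows of those patients are read. The aliases mo and t are my own, and the age expression assumes dob is identical on every row of a given patient.

```sql
-- Sketch: LIMIT is applied inside the first derived table, so the join
-- and GROUP_CONCAT only touch rows belonging to those 100 patients.
-- Assumption: dob is the same on every row of a patient, so MAX(t.dob) is safe.
SELECT
    mo.idpatient AS nhs_number,
    TIMESTAMPDIFF(YEAR, STR_TO_DATE(MAX(t.dob), '%Y-%m-%d'), CURDATE()) AS age,
    mo.days_overdue,
    mo.current_status,
    mo.action,
    GROUP_CONCAT(t.indication SEPARATOR ',') AS associated_indications
FROM (
    SELECT
        idpatient,
        MAX(days_overdue) AS days_overdue,
        SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), current_status)), 12) AS current_status,
        SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), action)), 12) AS action
    FROM mytable
    GROUP BY idpatient
    ORDER BY idpatient
    LIMIT 0, 100
) AS mo
INNER JOIN mytable AS t
    ON t.idpatient = mo.idpatient
   AND t.action = mo.action
GROUP BY mo.idpatient, mo.days_overdue, mo.current_status, mo.action
ORDER BY mo.idpatient;
```

The join on (idpatient, action) should be able to use idpatient_action_idx, and with indication appended to that index it would not need to read the table rows for the GROUP_CONCAT part.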

白色秋天 2025-02-17 18:34:09

Does this query give the correct result (I assume it is still slow)?

SELECT results.idpatient AS nhs_number
     , most_overdue.days_overdue
     , most_overdue.current_status
     , most_overdue.action
     , indications.associated_indications
FROM (
    SELECT distinct idpatient
    FROM mytable
) AS results
INNER JOIN (
    SELECT
        idpatient,
        MAX(days_overdue) as days_overdue,
        SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), current_status)), 12) AS current_status,
        SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), action)), 12) AS action
    FROM
        mytable
    GROUP BY
        idpatient
) AS most_overdue
ON results.idpatient = most_overdue.idpatient
INNER JOIN (
    SELECT idpatient, action, GROUP_CONCAT(indication SEPARATOR ',') as associated_indications
    FROM mytable
    GROUP BY idpatient, action
) AS indications
ON results.idpatient = indications.idpatient AND most_overdue.action = indications.action;

This may be a cleaner way to derive the current_status/action associated with the highest days_overdue:

SELECT results.idpatient AS nhs_number
     , (most_overdue.days_overdue)
     , (results.current_status)
     , results.action
     , indications.associated_indications
FROM mytable AS results
JOIN (
    SELECT 
        idpatient,
        MAX(days_overdue) as days_overdue
    FROM mytable
    GROUP BY idpatient
) AS most_overdue
    ON results.idpatient = most_overdue.idpatient
    AND results.days_overdue = most_overdue.days_overdue
JOIN (
    SELECT idpatient, action, GROUP_CONCAT(indication SEPARATOR ',') as associated_indications
    FROM mytable
    GROUP BY idpatient, action
) AS indications
     ON results.idpatient = indications.idpatient 
     AND results.action = indications.action
-- GROUP BY results.idpatient, results.action     
ORDER BY results.idpatient;

I can't help thinking that it should be possible to remove at least one access against mytable, but right now I don't see how.
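
One possible direction, offered only as an untested sketch: if the server is MySQL 8.0 or later (an assumption, the question does not state the version), a window function could pick the most overdue action per patient from a single grouped pass over mytable. The aliases g, ranked and rn are mine. Results may differ from the original query when several rows tie on days_overdue or when days_overdue is NULL for every row of a patient, so compare the output before relying on it.

```sql
-- Sketch for MySQL 8.0+: one scan of mytable, grouped by (idpatient, action),
-- then ROW_NUMBER() keeps the action with the highest days_overdue per patient.
SELECT
    ranked.idpatient AS nhs_number,
    ranked.days_overdue,
    ranked.current_status,
    ranked.action,
    ranked.associated_indications
FROM (
    SELECT
        g.*,
        ROW_NUMBER() OVER (
            PARTITION BY g.idpatient
            ORDER BY g.days_overdue DESC, g.action DESC
        ) AS rn
    FROM (
        SELECT
            idpatient,
            action,
            MAX(days_overdue) AS days_overdue,
            -- status of the most overdue row within this (idpatient, action) group
            SUBSTRING(MAX(CONCAT(LPAD(days_overdue, 11, '0'), current_status)), 12) AS current_status,
            GROUP_CONCAT(indication SEPARATOR ',') AS associated_indications
        FROM mytable
        GROUP BY idpatient, action
    ) AS g
) AS ranked
WHERE ranked.rn = 1
ORDER BY ranked.idpatient;
```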
