MySQL query is extremely slow

Posted on 2024-12-04 17:51:15


I'm happily taking any advice on this - be it rewriting the query, or setting up the tables differently.

What I have, basically, is three tables - a product table, a location table, and a condition table. The location table stores all the information about location throughout time, as does the condition table. The trick of this massive query is to pluck out the products with only their latest conditions and locations.

I took the general idea from this question: MySQL MIN/MAX returning proper value, but not the related record info

Is the answer just to store the current location and condition in the main product table, and keep these history tables but not use them for searching? I like the idea of keeping them separate, but of course this query takes 50 seconds to run, which is not practical at all.

SELECT 
'$table' AS tablename, 
$table.id, 
product_name, 
$table.status,
CL.event AS last_event,
CONCAT_WS(' ', CL.location, CL.floor, CL.bin, CL.bay) AS current_loc,
CC.status AS current_cond
FROM $table

LEFT OUTER JOIN
    (SELECT DISTINCT
        C.work_type,
        C.work_id,
        C.status,
        C.inspected_timestamp
        FROM
        (SELECT 
            CONCAT(work_type, work_id) AS condition_id, 
            status,
            MAX(inspected_timestamp) as current
            FROM conditions
            GROUP BY condition_id
        ) XC
    JOIN conditions C
      on CONCAT(C.work_type, C.work_id) = XC.condition_id
      and C.inspected_timestamp = XC.current
    ) CC ON 
    $table.id = CC.work_id AND 
    CC.work_type = '$table'                         

LEFT OUTER JOIN
    (SELECT DISTINCT
        L.work_type,
        L.work_id,
        L.event,
        L.location,
        L.floor,
        L.bin,
        L.bay,
        L.timestamp
        FROM
        (SELECT
            CONCAT(work_type, work_id) AS location_id, 
            location,
            MAX(timestamp) as current
            FROM locations
            GROUP BY location_id
        ) XL
    JOIN locations L
      on CONCAT(L.work_type, L.work_id) = XL.location_id
      and L.timestamp = XL.current
    ) CL ON 
    $table.id = CL.work_id AND 
    CL.work_type = '$table'

HAVING last_event = 'Received'

I'm adding the results of EXPLAIN EXTENDED here.

[0] => Array ( 
    [id] => 1 
    [select_type] => PRIMARY 
    [table] => paintings 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 1159 
    [filtered] => 100.00 
    [Extra] => )

[1] => Array ( 
    [id] => 1 
    [select_type] => PRIMARY 
    [table] => 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 3211 
    [filtered] => 100.00 
    [Extra] => ) 

[2] => Array ( 
    [id] => 1 
    [select_type] => PRIMARY 
    [table] => 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 1870 
    [filtered] => 100.00 
    [Extra] => ) 

[3] => Array ( 
    [id] => 4 
    [select_type] => DERIVED 
    [table] => 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 1868 
    [filtered] => 100.00 
    [Extra] => Using temporary )

[4] => Array ( 
    [id] => 4 
    [select_type] => DERIVED 
    [table] => L 
    [type] => ref 
    [possible_keys] => timestamp 
    [key] => timestamp 
    [key_len] => 8 
    [ref] => XL.current 
    [rows] => 5 
    [filtered] => 100.00 
    [Extra] => Using where ) 

[5] => Array ( 
    [id] => 5 
    [select_type] => DERIVED 
    [table] => locations 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 3913 
    [filtered] => 100.00 
    [Extra] => Using temporary; Using filesort ) 

[6] => Array ( 
    [id] => 2 
    [select_type] => DERIVED 
    [table] => 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 3191 
    [filtered] => 100.00 
    [Extra] => Using temporary ) 

[7] => Array ( 
    [id] => 2 
    [select_type] => DERIVED 
    [table] => C 
    [type] => ref 
    [possible_keys] => inspected_timestamp 
    [key] => inspected_timestamp 
    [key_len] => 8 
    [ref] => XC.current 
    [rows] => 45 
    [filtered] => 100.00 
    [Extra] => Using where ) 

[8] => Array (
     [id] => 3 
    [select_type] => DERIVED 
    [table] => conditions 
    [type] => index 
    [possible_keys] => 
    [key] => work_type_2 
    [key_len] => 316 
    [ref] => 
    [rows] => 3986 
    [filtered] => 100.00 
    [Extra] => Using index; Using temporary; Using filesort )


Comments (3)

人│生佛魔见 2024-12-11 17:51:15


There are a few things you can do:

  1. EXPLAIN PLAN on the query. See if there's a TABLE SCAN in there somewhere. That's the killer.
  2. See if rearranging the query makes a difference in the EXPLAIN PLAN results. Filtering more records out early will decrease the time needed.
  3. Check to make sure that columns in every WHERE clause have an index.
  4. The more records involved, the longer the query. How much history are you retaining? How many rows are you talking about? You should have a policy that removes records older than your retention cutoff and puts them in a history or reporting schema.
  5. Can you take advantage of triggers or views to pre-calculate any of these values?
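For point 3, a sketch of what the indexes might look like, assuming the column names from the query in the question (the right choice depends on the real schema). Note that they only pay off once the `CONCAT(work_type, work_id)` joins are rewritten as plain column comparisons, since wrapping a column in a function defeats any index on it:

```sql
-- Sketch, assuming the column names from the question's query.
-- Each index covers the per-item grouping columns plus the timestamp
-- used to pick the latest row, so the GROUP BY ... MAX() and the join
-- back can be resolved from the index instead of a full table scan.
CREATE INDEX idx_conditions_latest
    ON conditions (work_type, work_id, inspected_timestamp);

CREATE INDEX idx_locations_latest
    ON locations (work_type, work_id, `timestamp`);
```
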
短叹 2024-12-11 17:51:15


I'm putting this in an answer purely due to the limit on comment length.

I looked at your query for quite a while, and I think it is largely the nature of it, and the way it's been written, that is causing it to take so much time; I don't see anything that seems obviously wrong either.

In the places where you do a GROUP BY to get a summary row and then self-join those queries back - while I don't fully understand the design of your tables or the data - that is going to be costly, as the EXPLAIN shows: it is table scanning. You are also right that building temp tables and sorting them is even more costly.

So having those values pre-summarized and accessible in a summary table would help quite a bit, if this is something where the time taken is simply unacceptable. When you look at the EXPLAIN, note the row counts, as they should give you a good idea of whether what the query is doing is reasonable.

Also, the HAVING clause at the end is, by definition, not going to be optimized. If there is a way to move it into a WHERE clause or into the criteria of one of the joins, you have a chance to improve the query plan significantly, though considering the cost of the summaries it will still take some time.

The only thing I can advise at this point is to break it down into small pieces and see if you can optimize the individual components and then reassemble.
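As a sketch of the HAVING-to-WHERE move described above: since `HAVING last_event = 'Received'` already discards the NULL rows the LEFT JOIN produces, the filter can become an ordinary WHERE condition on the joined column (`products` here stands in for the question's `$table`; the derived table is elided):

```sql
-- Sketch: replace the trailing HAVING with a WHERE on the joined column.
-- Because the filter rejects NULLs anyway, the LEFT JOIN effectively
-- behaves as an INNER JOIN here, which gives the optimizer more freedom.
SELECT p.id, p.product_name, CL.event AS last_event
FROM products AS p
LEFT JOIN ( /* latest-location derived table, as in the question */ ) AS CL
       ON CL.work_id = p.id AND CL.work_type = 'products'
WHERE CL.event = 'Received';   -- was: HAVING last_event = 'Received'
```
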

我还不会笑 2024-12-11 17:51:15


As @gview explained, there are numerous things contributing to this query being brutally slow. Besides everything mentioned in his answer, there is also the use of the CONCAT() function on both tables, whose results are later used to JOIN the two derived tables - which defeats any index on those columns.

If you just want to show the rows of table product with only the latest related row in location and the latest related row in condition, you can use something like the following (this has only the logic for the latest condition; you'll need another similar LEFT JOIN for the latest location):

SELECT 
  t.id, 
  t.product_name, 
  t.status,
  cc.status AS current_cond
FROM 
      $table AS t
  LEFT OUTER JOIN
      ( SELECT c.*
        FROM 
              conditions AS c
          JOIN
              ( SELECT 
                  work_id, 
                  MAX(inspected_timestamp) as current_ts
                FROM conditions mc
                WHERE work_type = '$table'
                GROUP BY work_id
              ) AS mc
            ON  mc.work_id = c.work_id
            AND mc.current_ts = c.inspected_timestamp 
        WHERE c.work_type = '$table'
      ) AS cc  
    ON cc.work_id = t.id                     
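
The matching LEFT JOIN for the latest location, mentioned above, would follow the same shape (a sketch; column names are taken from the original query):

```sql
  LEFT OUTER JOIN
      ( SELECT l.*
        FROM
              locations AS l
          JOIN
              ( SELECT
                  work_id,
                  MAX(`timestamp`) AS current_ts
                FROM locations
                WHERE work_type = '$table'
                GROUP BY work_id
              ) AS ml
            ON  ml.work_id = l.work_id
            AND ml.current_ts = l.`timestamp`
        WHERE l.work_type = '$table'
      ) AS cl
    ON cl.work_id = t.id
```

The outer SELECT can then pick up `cl.event`, `cl.location`, `cl.floor`, `cl.bin`, and `cl.bay` the same way it picks up `cc.status`.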