MySQL 连接优化：改进派生表和 GROUP BY 的连接类型

发布于 2024-11-16 13:34:59 字数 5138 浏览 8 评论 0原文

我正在尝试改进执行以下操作的查询：

对于每项工作，将所有成本相加，将发票金额相加，并计算利润/损失。成本来自几个不同的表，例如采购订单、用户事件（工程师分配的时间/他在现场花费的时间）、使用的库存等。
该查询还需要输出一些其他列，例如工作站点的名称，以便可以对该列进行排序（在所有这些之后附加 ORDER BY）。

SELECT
    jobs.job_id,
    jobs.start_date,
    jobs.end_date,
    events.time,
    sites.name site,
    IFNULL(stock_cost,0) stock_cost,
    labour,
    materials,
    labour+materials+plant+expenses revenue,
    (labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) profit,
    ((labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)))/(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) ratio

FROM
    jobs

    LEFT JOIN (
        SELECT
            job_id,
            SUM(labour_charge) labour,
            SUM(materials_charge) materials,
            SUM(plant_hire_charge) plant,
            SUM(expenses) expenses
        FROM invoices
        GROUP BY job_id
        ORDER BY NULL
    ) invoices USING(job_id)

    LEFT JOIN (
        SELECT
            job_id,
            SUM(IF(start_onsite && end_onsite,end_onsite-start_onsite,end-start)) time,
            SUM(travel+parking+materials) user_expenses
        FROM users_events
        WHERE type='job'
        GROUP BY job_id
        ORDER BY NULL
    ) events USING(job_id)

    LEFT JOIN (
        SELECT
            job_id,
            SUM(IFNULL(total,0))*0.01 orders_cost
        FROM purchaseorders
        GROUP BY job_id
        ORDER BY NULL
    ) purchaseorders USING(job_id)

    LEFT JOIN (
        SELECT
            location job_id,
            SUM(amount*cost))*0.01 stock_cost
        FROM stock_location
        LEFT JOIN stock_items ON stock_items.id=stock_location.stock_id
        WHERE location>=3000 AND amount>0 AND cost>0
        GROUP BY location
        ORDER BY NULL
    ) stock USING(job_id)

    LEFT JOIN contacts_sites sites ON sites.id=jobs.site_id;

我读到了这个： http://dev.mysql.com /doc/refman/5.0/en/group-by-optimization.html 但不知道如何/是否可以应用其中的任何内容。出于测试目的，我尝试在左侧、右侧和中心字段上添加各种索引，但对 EXPLAIN 输出没有任何改进：

+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
| id | select_type | table          | type   | possible_keys          | key     | key_len | ref                                | rows  | Extra                         |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
|  1 | PRIMARY     | jobs           | ALL    | NULL                   | NULL    | NULL    | NULL                               |  7088 |                               |
|  1 | PRIMARY     | <derived2>     | ALL    | NULL                   | NULL    | NULL    | NULL                               |  5038 |                               |
|  1 | PRIMARY     | <derived3>     | ALL    | NULL                   | NULL    | NULL    | NULL                               |  6476 |                               |
|  1 | PRIMARY     | <derived4>     | ALL    | NULL                   | NULL    | NULL    | NULL                               |   904 |                               |
|  1 | PRIMARY     | <derived5>     | ALL    | NULL                   | NULL    | NULL    | NULL                               |   531 |                               |
|  1 | PRIMARY     | sites          | eq_ref | PRIMARY                | PRIMARY | 4       | bestbee_db.jobs.site_id            |     1 |                               |
|  5 | DERIVED     | stock_location | ALL    | stock,location,amount,…| NULL    | NULL    | NULL                               |  5426 | Using where; Using temporary; |
|  5 | DERIVED     | stock_items    | eq_ref | PRIMARY                | PRIMARY | 4       | bestbee_db.stock_location.stock_id |     1 | Using where                   |
|  4 | DERIVED     | purchaseorders | ALL    | NULL                   | NULL    | NULL    | NULL                               |  1445 | Using temporary;              |
|  3 | DERIVED     | users_events   | ALL    | type,type_job          | NULL    | NULL    | NULL                               | 11295 | Using where; Using temporary; |
|  2 | DERIVED     | invoices       | ALL    | NULL                   | NULL    | NULL    | NULL                               |  5320 | Using temporary;              |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+

生成的行数为 5 x 10^21（低于我开始优化此查询之前的 3 x 10^42）！）
目前执行需要 7 秒（从 26 秒减少），但我希望能低于 1 秒。

顺便说一句：GROUP BY x ORDER BY NULL 是从子查询中消除不必要的文件排序的好方法！（来自 http://www.mysqlperformanceblog.com/ 2006/09/04/group_concat-useful-group-by-extension/)

原文

I am trying to improve a query which does the following:

For every job, add up all the costs, add up the invoiced amount, and calculate a profit/loss. The costs come from several different tables, e.g. purchaseorders, users_events (engineer allocated time/time he spent on site), stock used etc.
The query also needs to output some other columns like the name of the site for the work, so that that column can be sorted by (an ORDER BY is appended after all of this).

SELECT
    jobs.job_id,
    jobs.start_date,
    jobs.end_date,
    events.time,
    sites.name site,
    IFNULL(stock_cost,0) stock_cost,
    labour,
    materials,
    labour+materials+plant+expenses revenue,
    (labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) profit,
    ((labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)))/(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) ratio

FROM
    jobs

    LEFT JOIN (
        SELECT
            job_id,
            SUM(labour_charge) labour,
            SUM(materials_charge) materials,
            SUM(plant_hire_charge) plant,
            SUM(expenses) expenses
        FROM invoices
        GROUP BY job_id
        ORDER BY NULL
    ) invoices USING(job_id)

    LEFT JOIN (
        SELECT
            job_id,
            SUM(IF(start_onsite && end_onsite,end_onsite-start_onsite,end-start)) time,
            SUM(travel+parking+materials) user_expenses
        FROM users_events
        WHERE type='job'
        GROUP BY job_id
        ORDER BY NULL
    ) events USING(job_id)

    LEFT JOIN (
        SELECT
            job_id,
            SUM(IFNULL(total,0))*0.01 orders_cost
        FROM purchaseorders
        GROUP BY job_id
        ORDER BY NULL
    ) purchaseorders USING(job_id)

    LEFT JOIN (
        SELECT
            location job_id,
            SUM(amount*cost))*0.01 stock_cost
        FROM stock_location
        LEFT JOIN stock_items ON stock_items.id=stock_location.stock_id
        WHERE location>=3000 AND amount>0 AND cost>0
        GROUP BY location
        ORDER BY NULL
    ) stock USING(job_id)

    LEFT JOIN contacts_sites sites ON sites.id=jobs.site_id;

I read this: http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html but don't see how/if I can apply anything therein.
For testing purposes, I have tried adding all sorts of indices on fields left, right and centre with no improvement to the EXPLAIN output:

+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
| id | select_type | table          | type   | possible_keys          | key     | key_len | ref                                | rows  | Extra                         |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
|  1 | PRIMARY     | jobs           | ALL    | NULL                   | NULL    | NULL    | NULL                               |  7088 |                               |
|  1 | PRIMARY     | <derived2>     | ALL    | NULL                   | NULL    | NULL    | NULL                               |  5038 |                               |
|  1 | PRIMARY     | <derived3>     | ALL    | NULL                   | NULL    | NULL    | NULL                               |  6476 |                               |
|  1 | PRIMARY     | <derived4>     | ALL    | NULL                   | NULL    | NULL    | NULL                               |   904 |                               |
|  1 | PRIMARY     | <derived5>     | ALL    | NULL                   | NULL    | NULL    | NULL                               |   531 |                               |
|  1 | PRIMARY     | sites          | eq_ref | PRIMARY                | PRIMARY | 4       | bestbee_db.jobs.site_id            |     1 |                               |
|  5 | DERIVED     | stock_location | ALL    | stock,location,amount,…| NULL    | NULL    | NULL                               |  5426 | Using where; Using temporary; |
|  5 | DERIVED     | stock_items    | eq_ref | PRIMARY                | PRIMARY | 4       | bestbee_db.stock_location.stock_id |     1 | Using where                   |
|  4 | DERIVED     | purchaseorders | ALL    | NULL                   | NULL    | NULL    | NULL                               |  1445 | Using temporary;              |
|  3 | DERIVED     | users_events   | ALL    | type,type_job          | NULL    | NULL    | NULL                               | 11295 | Using where; Using temporary; |
|  2 | DERIVED     | invoices       | ALL    | NULL                   | NULL    | NULL    | NULL                               |  5320 | Using temporary;              |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+

The rows produced is 5 x 10^21 (down from 3 x 10^42 before I started optimising this query!)
It currently takes seven seconds to execute (down from 26) but I would like that to be under one second.

By the way: GROUP BY x ORDER BY NULL is a great way to eliminate unnecessary filesorts from subqueries! (from http://www.mysqlperformanceblog.com/2006/09/04/group_concat-useful-group-by-extension/)

分享到QQ

分享到微博