MySQL performance, inner join: how to avoid Using temporary and Using filesort

Posted on 2024-09-11 14:05:10

I have two tables, table 1 and table 2.

Table 1
PARTNUM - ID_BRAND
partnum is the primary key
id_brand is "indexed"

Table 2
ID_BRAND - BRAND_NAME
id_brand is the primary key
brand_name is "indexed"

Table 1 contains 1 million records and table 2 contains 1,000 records.
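
For reference, the description above could correspond to a schema along these lines. This is only a sketch: the table names come from the query below, but the column types, nullability, and index names (idx_brand_name, idx_id_brand) are assumptions that the question does not state.

```sql
-- Column types, nullability, and index names are assumptions,
-- not taken from the question.
CREATE TABLE products_brands (
  id_brand   INT          NOT NULL,
  brand_name VARCHAR(100) NOT NULL,
  PRIMARY KEY (id_brand),
  KEY idx_brand_name (brand_name)   -- "brand_name is indexed"
);

CREATE TABLE products_main (
  partnum  VARCHAR(50) NOT NULL,
  id_brand INT         NULL,
  PRIMARY KEY (partnum),
  KEY idx_id_brand (id_brand)       -- "id_brand is indexed"
);
```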

I'm trying to optimize some queries using EXPLAIN, and after many attempts I have reached a dead end.

EXPLAIN
SELECT pm.partnum, pb.brand_name
FROM products_main AS pm
LEFT JOIN products_brands AS pb ON pm.id_brand = pb.id_brand
ORDER BY pb.brand_name ASC
LIMIT 0, 10;

The query returns this execution plan:

ID, SELECT_TYPE, TABLE, TYPE, POSSIBLE_KEYS, KEY, KEY_LEN, REF, ROWS, EXTRA
1, SIMPLE, pm, range, PRIMARY, PRIMARY, 1, , 1000000, Using where; Using temporary; Using filesort
1, SIMPLE, pb, ref, PRIMARY, PRIMARY, 4, demo.pm.id_pbrand, 1,

The MySQL query optimizer shows "Using temporary" and "Using filesort" in the execution plan.
How can I avoid this?

The "EVIL" is in the ORDER BY pb.brand_name ASC. Ordering by that column from the joined table seems to be the bottleneck.

Comments (4)

只是偏爱你 2024-09-18 14:05:10

First of all, I question the use of an outer join, seeing as the ORDER BY operates on the right-hand side and the NULLs injected by the left join are likely to play havoc with it.

Regardless, the simplest approach to speeding up this query would be a covering index on pb.id_brand and pb.brand_name. This allows the ORDER BY to be evaluated 'using index' along with the join condition. The alternative is to find some way to reduce the size of the intermediate result passed to the ORDER BY.

Still, the combination of outer join, ORDER BY, and LIMIT leaves me wondering what exactly you are querying for, and whether there might be a better way of expressing the query itself.
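
As a rough sketch of the covering-index suggestion above: the index name idx_brand_cover is hypothetical, the sort column is assumed to be brand_name as in the table description, and the inner join assumes that parts without a brand are not needed. Whether "Using temporary; Using filesort" actually disappears depends on the MySQL version and storage engine, so the EXPLAIN output should be checked rather than assumed.

```sql
-- idx_brand_cover is a hypothetical name; the column order follows the answer above.
ALTER TABLE products_brands ADD INDEX idx_brand_cover (id_brand, brand_name);

-- Inner join instead of LEFT JOIN, so no NULL brand rows enter the sort.
-- This changes the result if some parts have no matching brand.
EXPLAIN
SELECT pm.partnum, pb.brand_name
FROM products_main AS pm
INNER JOIN products_brands AS pb ON pm.id_brand = pb.id_brand
ORDER BY pb.brand_name ASC
LIMIT 0, 10;
```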

路弥 2024-09-18 14:05:10

Try replacing the join with a subquery. MySQL's optimizer is not great; a subquery can sometimes perform better than a join, so compare the two with EXPLAIN.
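
One possible reading of this suggestion is a derived table that picks the first ten brands before touching the large table. This is a sketch under assumptions, not a guaranteed win: it assumes the sort column is brand_name, that inner-join semantics are acceptable (parts without a brand are left out), and that products_main.id_brand is indexed as the question states. The EXISTS filter keeps only brands that have at least one part, so ten brands are always enough to produce the first ten rows.

```sql
SELECT pm.partnum, pb.brand_name
FROM (
    -- First ten brands, in name order, that have at least one part.
    SELECT b.id_brand, b.brand_name
    FROM products_brands AS b
    WHERE EXISTS (
        SELECT 1 FROM products_main AS m WHERE m.id_brand = b.id_brand
    )
    ORDER BY b.brand_name ASC
    LIMIT 10
) AS pb
INNER JOIN products_main AS pm ON pm.id_brand = pb.id_brand
ORDER BY pb.brand_name ASC
LIMIT 0, 10;
```

The outer sort then only sees the parts of at most ten brands instead of the full join result.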

晨曦慕雪 2024-09-18 14:05:10

First, try changing your index on the products_brands table. Delete the existing one on brand_name, and create a new one:

ALTER TABLE products_brands ADD INDEX newIdx (brand_name, id_brand)
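
The statement above only adds the new index. A minimal sketch of the "delete the existing one" step, assuming the old index is called idx_brand_name (a hypothetical name; the real one is whatever SHOW INDEX reports in the Key_name column):

```sql
-- List the indexes on the table to find the real name of the brand_name index.
SHOW INDEX FROM products_brands;

-- idx_brand_name is a hypothetical name; substitute the Key_name reported above.
ALTER TABLE products_brands DROP INDEX idx_brand_name;
```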

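Then, the table will already have an "orderedByBrandName" index with the ids you need for the join, and you can try:

EXPLAIN
SELECT pb.brand_name, pm.partnum
FROM products_brands AS pb
  LEFT JOIN products_main AS pm ON pb.id_brand = pm.id_brand
ORDER BY pb.brand_name ASC
LIMIT 0, 10;

Note that I also changed the order of the tables in the query, so you start with the small one. This also changes which rows the LEFT JOIN preserves: brands without parts now appear with a NULL partnum, while parts without a brand are dropped.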

流星番茄 2024-09-18 14:05:10

This question is somewhat outdated, but I did find it, and so will other people.

MySQL uses a temporary table if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join order.

So you just need to reverse the join order by using STRAIGHT_JOIN, which bypasses the order chosen by the optimizer:

SELECT STRAIGHT_JOIN pm.partnum, pb.brand_name
FROM products_brands AS pb
RIGHT JOIN products_main AS pm ON pm.id_brand = pb.id_brand
ORDER BY pb.brand_name ASC
LIMIT 0, 10;
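
A variant of the same idea with an inner join, offered as a sketch rather than a verified fix. It assumes the sort column is brand_name and that parts without a brand are not needed. The reasoning behind the variant: an outer join already constrains the read order, because the table whose rows are preserved has to be read first, so STRAIGHT_JOIN has the most room to change the order when the join is an inner join.

```sql
-- STRAIGHT_JOIN makes MySQL read the tables in FROM-clause order:
-- products_brands first, then products_main.
EXPLAIN
SELECT STRAIGHT_JOIN pm.partnum, pb.brand_name
FROM products_brands AS pb
INNER JOIN products_main AS pm ON pm.id_brand = pb.id_brand
ORDER BY pb.brand_name ASC
LIMIT 0, 10;
```

With products_brands read first, the ORDER BY column belongs to the first table in the join order, which is the condition described above.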

Also make sure that the max_heap_table_size and tmp_table_size variables are set to a number big enough to store the results:

SET GLOBAL tmp_table_size=100000000;
SET GLOBAL max_heap_table_size=100000000;

That is 100 megabytes in this example. These can also be set in the my.cnf config file.
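
To check the current values before changing anything, or to try the change on one connection only, something like the following should work. The 100000000 figure simply mirrors the value used above. Note that a global change applies to connections opened afterwards, not to sessions that are already connected.

```sql
-- Inspect the current limits (values are in bytes).
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- Raise both for the current connection only.
SET SESSION tmp_table_size = 100000000;
SET SESSION max_heap_table_size = 100000000;
```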
