Why does removing the BINARY function call from my SQL query change the query plan so drastically?
I have a SQL query that looks up a specific value in a table and then inner-joins three tables to produce the result set. The three tables are fabric_barcode_oc, fabric_barcode_items and fabric_barcode_rolls.
Initial query
Running EXPLAIN ANALYZE against the initial version of the query, shown below,
EXPLAIN ANALYZE
SELECT `oc`.`oc_number` AS `ocNumber` , `roll`.`po_number` AS `poNumber` ,
`item`.`item_code` AS `itemCode` , `roll`.`roll_length` AS `rollLength` ,
`roll`.`roll_utilized` AS `rollUtilized`
FROM `fabric_barcode_rolls` AS `roll`
INNER JOIN `fabric_barcode_oc` AS `oc` ON `oc`.`oc_unique_id` = `roll`.`oc_unique_id`
INNER JOIN `fabric_barcode_items` AS `item` ON `item`.`item_unique_id` = `roll`.`item_unique_id_fk`
WHERE BINARY `roll`.`roll_number` = 'dZkzHJ_je8'
I get the following:
"-> Nested loop inner join (cost=468160.85 rows=582047) (actual time=0.063..254.186 rows=1 loops=1)
-> Nested loop inner join (cost=264444.40 rows=582047) (actual time=0.057..254.179 rows=1 loops=1)
-> Filter: (cast(roll.roll_number as char charset binary) = 'dZkzHJ_je8') (cost=60727.95 rows=582047) (actual time=0.047..254.169 rows=1 loops=1)
-> Table scan on roll (cost=60727.95 rows=582047) (actual time=0.042..198.634 rows=599578 loops=1)
-> Single-row index lookup on oc using PRIMARY (oc_unique_id=roll.oc_unique_id) (cost=0.25 rows=1) (actual time=0.009..0.009 rows=1 loops=1)
-> Single-row index lookup on item using PRIMARY (item_unique_id=roll.item_unique_id_fk) (cost=0.25 rows=1) (actual time=0.006..0.006 rows=1 loops=1)
"
Updated query
I changed the query to:
EXPLAIN ANALYZE
SELECT `oc`.`oc_number` AS `ocNumber` , `roll`.`po_number` AS `poNumber` ,
`item`.`item_code` AS `itemCode` , `roll`.`roll_length` AS `rollLength` ,
`roll`.`roll_utilized` AS `rollUtilized`
FROM `fabric_barcode_rolls` AS `roll`
INNER JOIN `fabric_barcode_oc` AS `oc` ON `oc`.`oc_unique_id` = `roll`.`oc_unique_id`
INNER JOIN `fabric_barcode_items` AS `item` ON `item`.`item_unique_id` = `roll`.`item_unique_id_fk`
WHERE `roll`.`roll_number` = 'dZkzHJ_je8'
which produces the following execution plan:
"-> Rows fetched before execution (cost=0.00 rows=1) (actual time=0.000..0.000 rows=1 loops=1)
The only difference between the two queries is that I removed the BINARY function call. I'm puzzled as to why the plans are so different.
Execution times
Query 1 executes in roughly 375 ms, while the second query executes in roughly 160 ms. What causes this difference?
As requested, here is the definition of fabric_barcode_rolls:

CREATE TABLE `fabric_barcode_rolls` (
`roll_unique_id` int NOT NULL AUTO_INCREMENT,
`oc_unique_id` int NOT NULL,
`item_unique_id_fk` int NOT NULL,
`roll_number` char(30) NOT NULL,
`roll_length` decimal(10,2) DEFAULT '0.00',
`po_number` char(22) DEFAULT NULL,
`roll_utilized` decimal(10,2) DEFAULT '0.00',
`user` char(30) NOT NULL,
`mir_number` char(22) DEFAULT NULL,
`mir_location` char(10) DEFAULT NULL,
`mir_stamp` datetime DEFAULT NULL,
`creation_stamp` datetime DEFAULT CURRENT_TIMESTAMP,
`update_stamp` datetime DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`roll_unique_id`),
UNIQUE KEY `roll_number` (`roll_number`),
KEY `fabric_barcode_item_fk` (`item_unique_id_fk`),
CONSTRAINT `fabric_barcode_item_fk` FOREIGN KEY (`item_unique_id_fk`) REFERENCES `fabric_barcode_items` (`item_unique_id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=610684 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 Answer
Your performance difference is due to this fact: in MySQL, collations on VARCHAR() and CHAR() columns are baked into the indexes.
Edit: updated to match the table definition.
Your fabric_barcode_rolls table has the column defined like this, with a unique index over it:

`roll_number` char(30) NOT NULL,
UNIQUE KEY `roll_number` (`roll_number`),

So, your

WHERE ... BINARY roll.roll_number = 'dZkzHJ_je8'

filter clause is not sargable: it can't use the index on that column. But

WHERE ... roll.roll_number = 'dZkzHJ_je8'

is sargable: it does use the index. So it's fast. But the column's default collation is case-insensitive. So, it's fast and wrong. That can be fixed.
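If altering the table isn't immediately possible, one query-side workaround (my sketch, not part of the original answer) keeps the sargable equality for the index lookup and re-applies the BINARY comparison only to the rows that lookup returns:

```sql
-- The first predicate is sargable and drives a lookup on the unique
-- index over roll_number; the BINARY predicate then discards a
-- wrong-case match from that tiny row set.
SELECT `oc`.`oc_number` AS `ocNumber`, `roll`.`po_number` AS `poNumber`,
       `item`.`item_code` AS `itemCode`, `roll`.`roll_length` AS `rollLength`,
       `roll`.`roll_utilized` AS `rollUtilized`
FROM `fabric_barcode_rolls` AS `roll`
INNER JOIN `fabric_barcode_oc` AS `oc` ON `oc`.`oc_unique_id` = `roll`.`oc_unique_id`
INNER JOIN `fabric_barcode_items` AS `item` ON `item`.`item_unique_id` = `roll`.`item_unique_id_fk`
WHERE `roll`.`roll_number` = 'dZkzHJ_je8'
  AND BINARY `roll`.`roll_number` = 'dZkzHJ_je8';
```

Because the unique key is enforced under the case-insensitive collation, the index lookup returns at most one row, and the byte-exact filter only verifies its case.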
Notice there's no collation declaration on the column. That means it uses the table's default, utf8mb4_0900_ai_ci, a case-insensitive collation. What you want for an ordinary barcode column is a one-byte-per-character charset and a case-sensitive collation, which means altering the column's charset and collation.
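The table change the answer alludes to might look like this (a sketch, assuming the barcodes are ASCII-only; ascii_bin is the binary collation of the one-byte ascii charset):

```sql
-- Rebuild roll_number with a one-byte charset and a case-sensitive
-- (binary) collation; the unique index over it becomes shorter and
-- compares byte-for-byte, so plain equality is both sargable and exact.
ALTER TABLE `fabric_barcode_rolls`
  MODIFY `roll_number` CHAR(30)
    CHARACTER SET ascii COLLATE ascii_bin NOT NULL;
```

After this change the original query, without BINARY, does the case-sensitive single-row index lookup directly.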
This is a multilevel win. Using the correct character set for your barcodes saves space. It makes the index shorter and more efficient to use. It gives you case-sensitive (binary-match) lookups, which themselves make the index shorter and much more efficient. And it removes the collision risk between barcodes that differ only in letter case.
Before you conclude that the collision risk is so low you don't have to worry about it, please read about the birthday paradox.
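To put rough numbers on that (my own back-of-the-envelope arithmetic, not from the answer): the birthday approximation p ≈ 1 − e^(−n²/2N) says that case-folding a 10-character alphanumeric code shrinks the keyspace N from 62^10 to 36^10, so for roughly the 600,000 rows this table holds:

```sql
-- Birthday-bound probability of at least one collision among
-- ~600,000 uniformly random 10-character codes:
-- 62 symbols when case is significant, 36 after case folding.
SELECT
  1 - EXP(-(600000.0 * 599999.0) / (2 * POW(62, 10))) AS p_case_sensitive,
  1 - EXP(-(600000.0 * 599999.0) / (2 * POW(36, 10))) AS p_case_insensitive;
```

Both probabilities are small at this row count, but the case-insensitive keyspace is (62/36)^10 ≈ 230 times riskier, and both grow roughly quadratically as the table grows.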