SQL JOIN 查找没有特定值的匹配记录的记录
我正在尝试加快几年前为雇主的购买授权应用程序编写的一些代码的速度。基本上我有一个慢子查询,我想用 JOIN 替换它(如果它更快)。
当主管登录应用程序时,他会看到他尚未授权或拒绝的购买请求列表。该列表是通过以下查询生成的:
SELECT * FROM SA_ORDER WHERE ORDER_ID NOT IN
(SELECT ORDER_ID FROM SA_SIGNATURES WHERE TYPE = 'administrative director');
sa_order 中只有大约 900 条记录,sa_signature 中只有大约 1800 条记录,并且该查询仍然需要大约 5 秒的时间来执行。我尝试使用 LEFT JOIN 来检索我需要的记录,但我只能获取 sa_order 记录,而 sa_signature 中没有匹配的记录,并且我需要 sa_order 记录,其中“没有具有‘行政主管’类型的匹配记录” ”。非常感谢您的帮助!
两个表的架构如下:
涉及的表具有以下布局:
CREATE TABLE sa_order
(
`order_id` BIGINT PRIMARY KEY AUTO_INCREMENT,
`order_number` BIGINT NOT NULL,
`submit_date` DATE NOT NULL,
`vendor_id` BIGINT NOT NULL,
`DENIED` BOOLEAN NOT NULL DEFAULT FALSE,
`MEMO` MEDIUMTEXT,
`year_id` BIGINT NOT NULL,
`advisor` VARCHAR(255) NOT NULL,
`deleted` BOOLEAN NOT NULL DEFAULT FALSE
);
CREATE TABLE sa_signature
(
`signature_id` BIGINT PRIMARY KEY AUTO_INCREMENT,
`order_id` BIGINT NOT NULL,
`signature` VARCHAR(255) NOT NULL,
`proxy` BOOLEAN NOT NULL DEFAULT FALSE,
`timestamp` TIMESTAMP NOT NULL DEFAULT NOW(),
`username` VARCHAR(255) NOT NULL,
`type` VARCHAR(255) NOT NULL
);
I'm trying to speed up some code that I wrote years ago for my employer's purchase authorization app. Basically I have a SLOW subquery that I'd like to replace with a JOIN (if it's faster).
When the director logs into the application he sees a list of purchase requests he has yet to authorize or deny. That list is generated with the following query:
SELECT * FROM SA_ORDER WHERE ORDER_ID NOT IN
(SELECT ORDER_ID FROM SA_SIGNATURES WHERE TYPE = 'administrative director');
There are only about 900 records in sa_order and 1800 records in sa_signature and this query still takes about 5 seconds to execute. I've tried using a LEFT JOIN to retrieve records I need, but I've only been able to get sa_order records with NO matching records in sa_signature, and I need sa_order records with "no matching records with a type of 'administrative director'". Your help is greatly appreciated!
The schema for the two tables is as follows:
The tables involved have the following layout:
CREATE TABLE sa_order
(
`order_id` BIGINT PRIMARY KEY AUTO_INCREMENT,
`order_number` BIGINT NOT NULL,
`submit_date` DATE NOT NULL,
`vendor_id` BIGINT NOT NULL,
`DENIED` BOOLEAN NOT NULL DEFAULT FALSE,
`MEMO` MEDIUMTEXT,
`year_id` BIGINT NOT NULL,
`advisor` VARCHAR(255) NOT NULL,
`deleted` BOOLEAN NOT NULL DEFAULT FALSE
);
CREATE TABLE sa_signature
(
`signature_id` BIGINT PRIMARY KEY AUTO_INCREMENT,
`order_id` BIGINT NOT NULL,
`signature` VARCHAR(255) NOT NULL,
`proxy` BOOLEAN NOT NULL DEFAULT FALSE,
`timestamp` TIMESTAMP NOT NULL DEFAULT NOW(),
`username` VARCHAR(255) NOT NULL,
`type` VARCHAR(255) NOT NULL
);
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
在
sa_signatures (type, order_id)
上创建索引。除非
sa_signatures
允许order_id
中存在空值,否则不需要将查询转换为LEFT JOIN
。有了索引,NOT IN
也将执行。不过,以防万一您感到好奇:您应该从
sa_signatures
中选择一个NOT NULL
列,以便WHERE
子句能够良好地执行。Create an index on
sa_signatures (type, order_id)
.This is not necessary to convert the query into a
LEFT JOIN
unlesssa_signatures
allows nulls inorder_id
. With the index, theNOT IN
will perform as well. However, just in case you're curious:You should pick a
NOT NULL
column fromsa_signatures
for theWHERE
clause to perform well.您可以将 [NOT] IN 运算符替换为 EXISTS 以获得更快的性能。
所以你会得到:
原因:“当使用“NOT IN”时,查询执行嵌套全表扫描,而对于“NOT EXISTS”,查询可以在子查询中使用索引。”
来源:http://decipherinfosys.wordpress.com/2007/01/21/32/
You could replace the [NOT] IN operator with EXISTS for faster performance.
So you'll have:
Reason : "When using “NOT IN”, the query performs nested full table scans, whereas for “NOT EXISTS”, query can use an index within the sub-query."
Source : http://decipherinfosys.wordpress.com/2007/01/21/32/
以下查询应该有效,但是我怀疑您真正的问题是您没有适当的索引。您应该在 SA_SGINATURES 表的 ORDER_ID 列上有一个索引。
This following query should work, however I suspect your real issue is you don't have the proper indices in place. You should have an index on the SA_SGINATURES table on the ORDER_ID column.
select * from sa_order as o inner join sa_signature as s on o.orderid = sa.orderid and sa.type = 'administrativedirector'
另外,您可以在 sa_signature 表中的类型上创建非聚集索引
,效果更好 - 有一个主表类型包含 typeid 和 typename,然后不要将类型保存为 sa_signature 表中的文本,而只需将类型保存为整数。那是因为整数计算比文本计算快得多
select * from sa_order as o inner join sa_signature as s on o.orderid = sa.orderid and sa.type = 'administrative director'
also, you can create a non clustered index on type in sa_signature table
even better - have a master table for types with typeid and typename, and then instead of saving type as text in your sa_signature table, simply save type as integer. thats because computing on integers is way faster than computing on text