查询优化不适合左外连接
考虑下表;
测试
SQL> desc test;
Name Null? Type
----------------------------------------- -------- ----------------------------
NUM NOT NULL NUMBER
NUM2 NUMBER(10)
NUM3 NUMBER
NUM NUM2 NUM3
---------- ---------- ----------
1 1 1
2 2 2
test2
SQL> desc test2;
Name Null? Type
----------------------------------------- -------- ----------------------------
NUM NOT NULL NUMBER
NUM2 NOT NULL NUMBER
NUM3 NUMBER
NUM NUM2 NUM3
---------- ---------- ----------
1 1 1
3 1 1
根据书本,如果要将 NOT IN 条件替换为 LEFT 外连接;查询性能将会提高。
解释计划 1
SQL> select *
from test
where num NOT IN (select num
from test2);
NUM NUM2 NUM3
--- --------- ----------
2 2 2
执行计划
----------------------------------------------------------
Plan hash value: 856752680
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 5 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| TEST | 2 | 4 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| TEST2 | 2 | 26 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST2" "TEST2" WHERE
LNNVL("NUM"<>:B1)))
3 - filter(LNNVL("NUM"<>:B1))
因此,我在两个表的 num 列上创建了索引,并将查询重写为:
解释计划 2
SQL> select *
from test
left join test2 on (test.num = test2.num)
where test2.num is null;
NUM NUM2 NUM3 NUM NUM2 NUM3
---------- ---------- ---------- ---------- ---------- ----------
2 2 2
Execution Plan
----------------------------------------------------------
Plan hash value: 1525288557
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Tim
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 82 | 4 (0)| 00:
|* 1 | FILTER | | | | |
| 2 | NESTED LOOPS OUTER | | 2 | 82 | 4 (0)| 00:
| 3 | TABLE ACCESS FULL | TEST | 2 | 4 | 3 (0)| 00:
| 4 | TABLE ACCESS BY INDEX ROWID| TEST2 | 1 | 39 | 1 (0)| 00:
|* 5 | INDEX RANGE SCAN | ID2 | 1 | | 0 (0)| 00:
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TEST2"."NUM" IS NULL)
5 - access("TEST"."NUM"="TEST2"."NUM"(+))
我显然错过了一些东西,因为建议的方法被证明更昂贵。这是由于我的数据分布造成的吗?
版本:Oracle 10g
Consider the following tables;
test
SQL> desc test;
Name Null? Type
----------------------------------------- -------- ----------------------------
NUM NOT NULL NUMBER
NUM2 NUMBER(10)
NUM3 NUMBER
NUM NUM2 NUM3
---------- ---------- ----------
1 1 1
2 2 2
test2
SQL> desc test2;
Name Null? Type
----------------------------------------- -------- ----------------------------
NUM NOT NULL NUMBER
NUM2 NOT NULL NUMBER
NUM3 NUMBER
NUM NUM2 NUM3
---------- ---------- ----------
1 1 1
3 1 1
As per the book , if one were to replace a NOT IN condition to a LEFT Outer join ; the query performance would increase.
Explain plan 1
SQL> select *
from test
where num NOT IN (select num
from test2);
NUM NUM2 NUM3
--- --------- ----------
2 2 2
Execution Plan
----------------------------------------------------------
Plan hash value: 856752680
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 5 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| TEST | 2 | 4 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| TEST2 | 2 | 26 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST2" "TEST2" WHERE
LNNVL("NUM"<>:B1)))
3 - filter(LNNVL("NUM"<>:B1))
So I created indexes on num columns of both tables and re-wrote the query as :
Explain plan 2
SQL> select *
from test
left join test2 on (test.num = test2.num)
where test2.num is null;
NUM NUM2 NUM3 NUM NUM2 NUM3
---------- ---------- ---------- ---------- ---------- ----------
2 2 2
Execution Plan
----------------------------------------------------------
Plan hash value: 1525288557
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Tim
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 82 | 4 (0)| 00:
|* 1 | FILTER | | | | |
| 2 | NESTED LOOPS OUTER | | 2 | 82 | 4 (0)| 00:
| 3 | TABLE ACCESS FULL | TEST | 2 | 4 | 3 (0)| 00:
| 4 | TABLE ACCESS BY INDEX ROWID| TEST2 | 1 | 39 | 1 (0)| 00:
|* 5 | INDEX RANGE SCAN | ID2 | 1 | | 0 (0)| 00:
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TEST2"."NUM" IS NULL)
5 - access("TEST"."NUM"="TEST2"."NUM"(+))
I am clearly missing something as the suggested method is proving to be more expensive. Is this due to my data distribution?
Ver: Oracle 10g
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
您的两个查询返回不同的结果集。第二个版本还返回第二个表中的数据。我预计从两个表返回数据的查询比仅从一个表返回数据的查询更昂贵。
尝试使用
select test.*
而不是select *
进行第二个查询,看看是否有区别。Your two queries return different result sets. The second version also returns data from the second table. I would expect a query which returns data from two tables to be more expensive than a query which returns data from just the one.
Try the second query with
select test.*
instead ofselect *
and see whether it makes a difference.