与 WHERE 子句一起使用时优化 Oracle CONNECT BY
Oracle START WITH ... CONNECT BY
子句在同一查询中在应用WHERE
条件之前应用。 因此,WHERE 约束无助于优化 CONNECT BY
。
例如,以下查询可能会执行全表扫描(忽略 dept_id
上的选择性):
SELECT * FROM employees
WHERE dept_id = 'SALE'
START WITH manager_id is null
CONNECT BY PRIOR employee_id = manager_id
我尝试通过两种方式提高性能:
查询 A:
SELECT * FROM employees
START WITH manager_id is null AND dept_id = 'SALE'
CONNECT BY PRIOR employee_id = manager_id
查询 B:
SELECT * FROM (
SELECT * FROM employees
WHERE dept_id = 'SALE'
)
START WITH manager_id is null
CONNECT BY PRIOR employee_id = manager_id
虽然这两个查询的效果都比原始查询好得多,在 Oracle 10g 第 2 版上,查询 B 的性能确实比 A 好得多。
对于 CONNECT BY
和 WHERE
子句,您是否有类似的性能优化需要处理? 您如何解释查询 B 比查询 A 做得更好?
Oracle START WITH ... CONNECT BY
clause is applied before applying WHERE
condition in the same query. Thus, WHERE constraints won't help optimize CONNECT BY
.
For example, the following query will likely perform full table scan (ignoring selectivity on dept_id
):
SELECT * FROM employees
WHERE dept_id = 'SALE'
START WITH manager_id is null
CONNECT BY PRIOR employee_id = manager_id
I tried to improve performance in 2 ways:
query A:
SELECT * FROM employees
START WITH manager_id is null AND dept_id = 'SALE'
CONNECT BY PRIOR employee_id = manager_id
query B:
SELECT * FROM (
SELECT * FROM employees
WHERE dept_id = 'SALE'
)
START WITH manager_id is null
CONNECT BY PRIOR employee_id = manager_id
While both queries did much better than original, on Oracle 10g Release 2, query B did performed much better than A.
Did you have similar performance optimization to deal with with respect to CONNECT BY
and WHERE
clauses? How would you explain query B doing much better than query A?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
查询 A 表示从销售部门的经理开始,然后获取他们的所有员工。 Oracle 不“知道”查询返回的所有员工都在销售部门,因此在执行 CONNECT 之前它无法使用该信息来减少要使用的数据集经过。
查询 B 显式将要处理的数据集减少到仅限销售部门的员工,Oracle 可以在执行 CONNECT BY 之前执行此操作。
Query A says start with managers in the Sales department and then get all their employees. Oracle doesn't "know" that all the employees returned be the query will be in the Sales department, so it can't use that information to reduce the set of data to work with before performing the CONNECT BY.
Query B explicitly reduces the set of data to be worked on to just those employees in Sales, which Oracle can then do before performing the CONNECT BY.
这应该会提供最终的性能:
请注意,您确实需要索引和
AND
条件才能使优化发挥作用。This should give the ultimate performance:
Note that you do need both index and both
AND
conditions for the optimization to work.这是一个类似的查询,长话短说,使用嵌套 sql 比使用先前选项的双连接运行得更快。
所以我没有太多专业知识的建议是使用嵌套sql来过滤。
This is a similar query, long story short it worked faster using the nested sql than the double connect by prior option.
So my recommendation without much expertise is to use the nested sql to filter.
员工的指标有哪些? 你最好有一个关于employeeid 的索引。 由于将employeeid 声明为主键,您可能确实拥有了一个。
使用 managerid 上的索引也可能会获得更好的性能。 尝试一下。 在插入新员工或重组管理关系时,必须与性能下降相平衡。
What are the indexes on employees? You better have an index on employeeid. And you likely do have one as a consequence of declaring employeeid as the primary key.
You might get better performance with an index on managerid as well. Try it. This has to be balanced against slower performance when inserting new employees or reorganizing managing relationships.