查询执行计划中出现意外的估计行 (Sql Server 2000)
如果我运行这个查询
select user from largetable where largetable.user = 1155
(注意我查询用户只是为了将其减少到最简单的情况)
并查看执行计划,计划进行索引查找[largetable在用户上有一个索引],并且估计的行数是正确的29但是
如果我这样做
select user from largetable where largetable.user = (select user from users where externalid = 100)
[子查询的结果是单个值1155,就像上面我硬编码时一样]
查询优化器估计结果中有117,000行。 Largetable 大约有 6,000,000 行,users 大约有 1700 行。当我运行查询时,尽管估计行数很大,但我当然得到了正确的 29 行。
我已经在相关索引的两个表上使用全扫描更新了统计信息,当我查看统计信息时,它们似乎是正确的。
值得注意的是,对于任何给定用户,largetable 中的行数不超过 3,000 行。
那么,为什么估计的执行计划会显示如此大量的估计行呢?难道优化器不应该根据统计数据知道它正在寻找具有 29 个相应行或最多 3,000 行的结果,即使它不知道子查询将选择的用户也是如此?为什么会有这么大的估计?问题是,这个大的估计会影响更大查询中的另一个连接来进行扫描而不是查找。如果我使用子查询运行较大的查询,则需要 1 分 40 秒。如果使用 1155 硬编码运行它需要 2 秒。这对我来说很不寻常......
谢谢,
克里斯
if I run this query
select user from largetable where largetable.user = 1155
(note I'm querying user just to reduce this to its simplest case)
And look at the execution plan, an index seek is planned [largetable has an index on user], and the estimated rows is the correct 29.
But if I do
select user from largetable where largetable.user = (select user from users where externalid = 100)
[with the result of the sub query being the single value 1155 just like above when i hard code it]
The query optimizer estimates 117,000 rows in the result. There are about 6,000,000 rows in largetable, 1700 rows in users. When I run the query of course I get back the correct 29 rows despite the huge estimated rows.
I have updated stats with fullscan on both tables on the relevent indexes, and when I look at the stats, they appear to be correct.
Of note, for any given user, there are no more than 3,000 rows in largetable.
So, why would the estimated execution plan show such a large number of estimated rows? Shouldn't the optimizer know, based on the stats, that it's looking for a result that has 29 corresponding rows, or a MAXIMUM of 3,000 rows even if it doesn't know the user which will be selected by the subquery? Why this huge estimate? The problem is, that this large estimate is then influencing another join in a larger query to do a scan instead of a seek. If I run the larger query with the subquery, it takes 1min 40 secs. If run it with the 1155 hard coded it takes 2 seconds. This is very unusual to me...
Thanks,
Chris
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
优化器会尽力而为,但统计信息和行数估计只能到此为止(正如您所看到的)。
我假设您的更复杂的查询无法轻松地重写为没有子查询的联接。如果可以的话,您应该先尝试一下。
如果做不到这一点,您就需要利用有关数据性质的额外知识来帮助优化器使用 提示。具体查看
index
提示中的forceseek
选项。请注意,如果您的数据稍后发生更改,这可能会很糟糕,因此请注意。The optimizer does the best it can, but statistics and row count estimations only go so far (as you're seeing).
I'm assuming that your more complex query can't easily be rewritten as a join without a subquery. If it can be, you should attempt that first.
Failing that, it's time for you to use your additional knowledge about the nature of your data to help out the optimizer with hints. Specifically look at the
forceseek
option in theindex
hints. Note that this can be bad if your data changes later, so be aware.你尝试过这个吗?
请参阅:subqueries-vs-joins
Did you try this?
Please see this: subqueries-vs-joins