SQL query performance analysis
Which will perform better?
Query 1:
select (select a from innertable i where i.val=o.val)
, val1, val2
from outertable o
Query 2:
select i.a
,o.val1
,o.val2
from outertable o
join innertable i on i.val=o.val
Why? Please advise.
Comments (1)
As Ollie suggests, the only definitive way to determine which of two queries is more efficient is to benchmark the two approaches using your data, since the performance of the two alternatives is likely to depend on data volumes, data structures, what indexes are present, etc.
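A benchmark along those lines might look like the following minimal sketch. It assumes an in-memory SQLite database driven from Python, a made-up one-to-one data set, and a hypothetical index named innertable_val; none of these come from the question, and the helper names build_db and time_query are illustrative only.

```python
import sqlite3
import time

# The two queries from the question, unchanged apart from whitespace.
QUERY_1 = """
select (select a from innertable i where i.val = o.val), val1, val2
from outertable o
"""

QUERY_2 = """
select i.a, o.val1, o.val2
from outertable o
join innertable i on i.val = o.val
"""


def build_db(n_rows):
    """Create both tables with exactly one innertable row per outertable row (assumed data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table outertable (val integer, val1 integer, val2 integer)")
    conn.execute("create table innertable (val integer, a integer)")
    conn.executemany(
        "insert into outertable values (?, ?, ?)",
        [(k, k * 2, k * 3) for k in range(n_rows)],
    )
    conn.executemany(
        "insert into innertable values (?, ?)",
        [(k, k * 10) for k in range(n_rows)],
    )
    # Hypothetical index on the join column; your real schema may differ.
    conn.execute("create index innertable_val on innertable (val)")
    conn.commit()
    return conn


def time_query(conn, sql, repeats=5):
    """Run the query several times and return (best wall-clock time, rows)."""
    best = float("inf")
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best, rows


conn = build_db(2000)
t1, rows_1 = time_query(conn, QUERY_1)
t2, rows_2 = time_query(conn, QUERY_2)
print(f"query 1: {t1:.6f} s, {len(rows_1)} rows")
print(f"query 2: {t2:.6f} s, {len(rows_2)} rows")
```

Timings from a toy SQLite data set say nothing about your own database. The point is only the shape of the test: same data, same indexes, both queries run repeatedly, and row counts checked so that you are comparing queries that return the same result.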
In general, the two queries that you posted will return different results. Unless you are guaranteed that every row in outertable has exactly one corresponding row in innertable, the two queries will return a different number of rows. The first query will return a row for every row in outertable, with a NULL as the first column if there is no matching row in innertable. The second query will not return anything for an outertable row that has no matching row in innertable. Similarly, if there are multiple matching rows in innertable for any particular row in outertable, the first query will return an error while the second query will return multiple rows for that row in outertable.
If you are confident that the two queries return identical result sets in your particular case because you can guarantee that there is exactly one row in innertable for every row in outertable (in which case it is at least somewhat odd that your data model separates the tables), the second option would be the much more natural way to write the query and thus the one for which the optimizer is most likely to find the more efficient plan.
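The difference in results is easy to see on a few rows. The sketch below is only an illustration under stated assumptions: it uses an in-memory SQLite database from Python, invented sample rows, a hypothetical helper run_both, and an added order by so the output is deterministic. One caveat: SQLite is lenient about a scalar subquery that matches several rows and silently uses the first one, whereas engines such as Oracle, SQL Server, PostgreSQL and MySQL raise an error, so in the second case only the row counts are compared.

```python
import sqlite3

# The two queries from the question, with an "order by" added for stable output.
QUERY_1 = """
select (select a from innertable i where i.val = o.val), val1, val2
from outertable o
order by o.val
"""

QUERY_2 = """
select i.a, o.val1, o.val2
from outertable o
join innertable i on i.val = o.val
order by o.val, i.a
"""


def run_both(outer_rows, inner_rows):
    """Load the sample rows into a fresh database and run both queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table outertable (val integer, val1 text, val2 text)")
    conn.execute("create table innertable (val integer, a text)")
    conn.executemany("insert into outertable values (?, ?, ?)", outer_rows)
    conn.executemany("insert into innertable values (?, ?)", inner_rows)
    result = (conn.execute(QUERY_1).fetchall(), conn.execute(QUERY_2).fetchall())
    conn.close()
    return result


# Case A: the outertable row with val=2 has no match in innertable.
missing_1, missing_2 = run_both(
    outer_rows=[(1, "x1", "y1"), (2, "x2", "y2")],
    inner_rows=[(1, "a1")],
)
print(missing_1)  # [('a1', 'x1', 'y1'), (None, 'x2', 'y2')]  unmatched row kept, NULL first column
print(missing_2)  # [('a1', 'x1', 'y1')]                      unmatched row dropped by the join

# Case B: the outertable row with val=1 has two matches in innertable.
dup_1, dup_2 = run_both(
    outer_rows=[(1, "x1", "y1")],
    inner_rows=[(1, "a1"), (1, "a1b")],
)
# SQLite returns one row for query 1 here; most other engines would raise an error instead.
print(len(dup_1), len(dup_2))  # 1 2
```

If both cases can occur in your data, the two queries are not interchangeable, and the performance comparison only makes sense once you have decided which result you actually want.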