哪个 select 语句性能更好？

发布于 2024-09-16 05:34:21 字数 1847 浏览 4 评论 0原文

假设master_table包含许多记录，并且master_table的“id”字段、tableA、tableB、tableC和tableD在业务意义上是相同的。

对于下面所示的 2 个 select 语句，

它们都会返回相同的结果集吗？

哪一款会有更好的性能呢？

我认为如果tableA_tmp，tableB_tmp，tableC_tmp和tableD_tmp都返回较小的结果集，SQL1将比SQL2更快，因为oracle不需要为每个master_table记录查询一次tableA_tmp，tableB_tmp，tableC_tmp和tableD_tmp。

但是如果tableA_tmp、tableB_tmp、tableC_tmp和tableD_tmp都返回大结果集，SQL 2会快得多，因为连接许多大结果集的成本比为每个master_table记录查询一次tableA_tmp、tableB_tmp、tableC_tmp和tableD_tmp要高得多。

如果我有任何误解，请纠正我。或者有其他建议的方法吗？

SQL1：

select 
        master_table.* ,
        tableA_tmp.cnt as tableA_cnt , 
        tableB_tmp.cnt as tableB_cnt ,  
        tableC_tmp.cnt as tableC_cnt ,  
        tableD_tmp.cnt as tableD_cnt 
    from
        master_table,
        (select  tableA.id as id, count(1) as cnt  from tableA group by tableA.id) tableA_tmp,
        (select  tableB.id as id, count(1) as cnt from tableB group by tableB.id) tableB_tmp,
        (select  tableC.id as id, count(1) as cnt from tableC group by tableC.id) tableC_tmp,
        (select  tableD.id as id, count(1) as cnt from tableD group by tableD.id) tableD_tmp
    where 
        master_table.id = tableA_tmp.id(+) and
        master_table.id = tableB_tmp.id(+) and
        master_table.id = tableC_tmp.id(+) and
        master_table.id = tableD_tmp.id(+) ;

SQL 2：

   select 
        master_table.* ,
        (select  count(*)  from tableA where tableA.id = master_table.id) as tableA_cnt,
        (select  count(*)  from tableB where tableB.id = master_table.id) as tableB_cnt,
        (select  count(*)  from tableC where tableC.id = master_table.id) as tableC_cnt,
        (select  count(*)  from tableD where tableD.id = master_table.id) as tableD_cnt
    from
        master_table;

原文

Suppose master_table contains many records and both the "id" field of the master_table, tableA,tableB,tableC and tableD are the same in the business sense.

For the 2 select statements shown belows ,

Will they both return the same result set?

Which one will have better performance ?

I think if both tableA_tmp ,tableB_tmp,tableC_tmp and tableD_tmp return a smaller result set , SQL1 will be faster than SQL2 because oracle does not need to query tableA_tmp,,tableB_tmp,tableC_tmp and tableD_tmp once for every master_table record.

But if both the tableA_tmp ,tableB_tmp,tableC_tmp and tableD_tmp return the large result set , SQL 2 will be much faster because the cost of joining many large result set is much higher than query tableA_tmp,,tableB_tmp,tableC_tmp and tableD_tmp once for every master_table record.

Please correct me if I have any misunderstanding. Or any others method suggested?

SQL1:

select 
        master_table.* ,
        tableA_tmp.cnt as tableA_cnt , 
        tableB_tmp.cnt as tableB_cnt ,  
        tableC_tmp.cnt as tableC_cnt ,  
        tableD_tmp.cnt as tableD_cnt 
    from
        master_table,
        (select  tableA.id as id, count(1) as cnt  from tableA group by tableA.id) tableA_tmp,
        (select  tableB.id as id, count(1) as cnt from tableB group by tableB.id) tableB_tmp,
        (select  tableC.id as id, count(1) as cnt from tableC group by tableC.id) tableC_tmp,
        (select  tableD.id as id, count(1) as cnt from tableD group by tableD.id) tableD_tmp
    where 
        master_table.id = tableA_tmp.id(+) and
        master_table.id = tableB_tmp.id(+) and
        master_table.id = tableC_tmp.id(+) and
        master_table.id = tableD_tmp.id(+) ;

SQL 2:

   select 
        master_table.* ,
        (select  count(*)  from tableA where tableA.id = master_table.id) as tableA_cnt,
        (select  count(*)  from tableB where tableB.id = master_table.id) as tableB_cnt,
        (select  count(*)  from tableC where tableC.id = master_table.id) as tableC_cnt,
        (select  count(*)  from tableD where tableD.id = master_table.id) as tableD_cnt
    from
        master_table;

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

追风人 2024-09-23 05:34:21

连接通常比内联查询更好 - 对主查询返回的每一行执行内联查询。

这意味着（1）比（2）更好。至少在 99% 的情况下是这样。

在少数情况下，数据的分布和索引的定义方式可以在使查询执行时间向 2 倾斜（效率更高）方面发挥作用，但这种情况在普通数据库中很少发生。

回复收藏 0 原文

姜生凉生 2024-09-23 05:34:21

查看执行计划

回复收藏 0 原文

拔了角的鹿 2024-09-23 05:34:21

在较短的时间内完成工作的人。

回复收藏 0 原文

最后的乘客 2024-09-23 05:34:21

不同的结果集。
考虑 TableA 是否为空。

在语句 1 中，tableA_tmp 也将为空。使用外连接，将返回一行，但 tableA_tmp.cnt 的值为 null。

在语句 2 中，将执行计数并返回零值。

两者的性能可能更好或更差，具体取决于卷、索引、标量子查询缓存、表聚类因子、内存....

回复收藏 0 原文

独留℉清风醉 2024-09-23 05:34:21

如果Master表和其他表足够大，SQL 1会更快。原因是表扫描。表扫描的成本很高，因为扫描涉及 I/O。第一个查询仅需要对表 tableA、tableB、tablec 和 tableD 中的每个表进行一次扫描。

在 SQL2 中，对于主表中的每个键，将分别扫描表 tableA、tableB、tablec 和 tableD。如果主表有10行，则表tableA将被扫描10次，tableB将被扫描10次，tableC将被扫描10次，依此类推。成本会很高。

例子将说明这一点。假设每个表有 1000 条记录，每条记录都有不同的 id。在第一个查询中，表 tableA、tableB、tablec 和 tableD 将分别扫描一次，并将结果连接起来。生成的行数将会很大，但会在合理的时间内进行过滤。在SQL2中，每个表tableA、tableB、tablec和tableD都会被扫描1000次。那是非常昂贵的。

回复收藏 0 原文

~没有更多了~