SQL引擎按什么顺序执行查询和子查询?
你好,我做了一个 SQL 测试,对一个问题感到怀疑/好奇:
SQL 引擎执行查询和子查询的顺序是什么?
答案是
- 主查询 ->子查询-> sub 子查询 等等
- sub 子查询 ->子查询-> prime query
- 整个查询一次性解释
- 没有固定的解释顺序,查询解析器即时做出决定
我选择了最后一个答案(只是假设它相对于其他答案是最可靠的)。 现在好奇的是:
我在哪里可以读到这方面的内容,并简要介绍一下这一切背后的机制是什么?
谢谢。
Hello I made a SQL test and dubious/curious about one question:
In which sequence are queries and sub-queries executed by the SQL engine?
the answers was
- primary query -> sub query -> sub sub query and so on
- sub sub query -> sub query -> prime query
- the whole query is interpreted at one time
- There is no fixed sequence of interpretation, the query parser takes a decision on fly
I choosed the last answer (just supposing that it is most reliable w.r.t. others).
Now the curiosity:
where can i read about this and briefly what is the mechanism under all of that?
Thank you.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
我认为答案4是正确的。有几个注意事项:
子查询的类型 - 是否相关。考虑:
这里,子查询与外部查询不相关。如果 t2.id 中的值数量与 t1.id 相比较小,则首先执行子查询并将结果保留在内存中,然后扫描 t1 或 t1.id 上的索引,进行匹配可能是最有效的缓存的值。
但如果查询是:
这里子查询是相关的 - 除非 t1.type 已知,否则无法计算子查询。由于 t1.type 的值对于外部查询的每一行可能有所不同,因此可以针对外部查询的每一行执行该子查询一次。
话又说回来,RDBMS 可能真的很聪明,并意识到 t2.type 只有几个可能的值。在这种情况下,如果它可以猜测执行一次子查询的成本比为每行执行一次子查询更便宜,那么它仍然可以使用用于不相关子查询的方法。
I think answer 4 is correct. There are a few considerations:
type of subquery - is it corrrelated, or not. Consider:
Here, the subquery is not correlated to the outer query. If the number of values in t2.id is small in comparison to t1.id, it is probably most efficient to first execute the subquery, and keep the result in memory, and then scan t1 or an index on t1.id, matching against the cached values.
But if the query is:
here the subquery is correlated - there is no way to compute the subquery unless t1.type is known. Since the value for t1.type may vary for each row of the outer query, this subquery could be executed once for each row of the outer query.
Then again, the RDBMS may be really smart and realize there are only a few possible values for t2.type. In that case, it may still use the approach used for the uncorrelated subquery if it can guess that the cost of executing the subquery once will be cheaper that doing it for each row.
选项 4 很接近。
SQL 是声明式:你告诉查询优化器你想要什么,它会得到最好的结果(取决于时间/“成本”等)的方式。对于外观相同的查询和表,这可能会有所不同,具体取决于统计数据、数据分布、行计数、并行性以及上帝知道的其他因素。
这意味着没有固定的顺序。但它并不是完全“即时”
即使使用相同的服务器、模式、查询和数据,我也发现执行计划有所不同
Option 4 is close.
SQL is declarative: you tell the query optimiser what you want and it works out the best (subject to time/"cost" etc) way of doing it. This may vary for outwardly identical queries and tables depending on statistics, data distribution, row counts, parallelism and god knows what else.
This means there is no fixed order. But it's not quite "on the fly"
Even with identical servers, schema, queries, and data I've seen execution plans differ
SQL 引擎尝试优化(子)查询的执行顺序。决定这一点的部分称为查询优化器。查询优化器知道每个表中有多少行、哪些表有索引以及哪些字段有索引。它使用该信息来决定首先执行哪一部分。
The SQL engine tries to optimise the order in which (sub)queries are executed. The part deciding about that is called a query optimizer. The query optimizer knows how many rows are in each table, which tables have indexes and on what fields. It uses that information to decide what part to execute first.
如果您想阅读有关这些主题的内容,请获取《Inside SQL Server 2008:T-SQL 查询》的副本。它有两个专门的章节介绍 SQL Server 中如何逻辑和物理处理查询。
If you want something to read up on these topics, get a copy of Inside SQL Server 2008: T-SQL Querying. It has two dedicated chapters on how queries are processed logically and physically in SQL Server.
这通常取决于您的 DBMS,但是......我认为第二个答案更合理。
如果没有子查询结果,主查询通常无法计算。
It's usually depends from your DBMS, but ... I think second answer is more plausible.
Prime query usually can't be calculated without sub query results.