连接顺序如何影响查询性能
我在查询中遇到了时间性能的巨大差异,并且查询中连接(内部和左外部)发生的顺序似乎造成了所有的差异。 是否有一些“基本规则”,应按什么顺序加入?
它们都是更大查询的一部分。 它们之间的区别在于左连接在更快的查询中放在最后。
慢速查询:(>10分钟)
SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
(CASE
WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0)
ELSE CONVERT(NVarChar(250),[t3].[Key])
END) AS [value],
(CASE
WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1)
ELSE CONVERT(NVarChar(250),[t4].[Key])
END) AS [value2]
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
FROM [dbo].[tblC] AS [t2]
) AS [t3] ON [t0].[RefC] = ([t3].[Ref])
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])
较快查询:(~30秒)
SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
(CASE
WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0)
ELSE CONVERT(NVarChar(250),[t3].[Key])
END) AS [value],
(CASE
WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1)
ELSE CONVERT(NVarChar(250),[t4].[Key])
END) AS [value2]
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])
LEFT OUTER JOIN (
SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
FROM [dbo].[tblC] AS [t2]
) AS [t3] ON [t0].[RefC] = ([t3].[Ref])
I'm experiencing big differences in timeperformance in my query, and it seems the order of which the joins (inner and left outer) occur in the query makes all the difference.
Are there some "ground rules" in what order joins should be in?
Both of them are part of a bigger query.
The difference between them is that the left join is placed last in the faster query.
Slow query: (> 10 minutes)
SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
(CASE
WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0)
ELSE CONVERT(NVarChar(250),[t3].[Key])
END) AS [value],
(CASE
WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1)
ELSE CONVERT(NVarChar(250),[t4].[Key])
END) AS [value2]
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
FROM [dbo].[tblC] AS [t2]
) AS [t3] ON [t0].[RefC] = ([t3].[Ref])
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])
Faster query: (~ 30 seconds)
SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
(CASE
WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0)
ELSE CONVERT(NVarChar(250),[t3].[Key])
END) AS [value],
(CASE
WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1)
ELSE CONVERT(NVarChar(250),[t4].[Key])
END) AS [value2]
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])
LEFT OUTER JOIN (
SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
FROM [dbo].[tblC] AS [t2]
) AS [t3] ON [t0].[RefC] = ([t3].[Ref])
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
一般来说,INNER JOIN 顺序并不重要,因为内连接是可交换的和关联的。在这两种情况下,您仍然有 t0 内部联接 t4,因此应该没有区别。
换句话来说,SQL 是声明性的:你说“你想要什么”,而不是“如何”。优化器会“如何”工作,并根据需要重新排序 JOIN,在实践中也会查找 WHERE 等。
在复杂的查询中,基于成本的查询优化器不会耗尽所有排列,因此它有时可能很重要。
因此,我会检查这些:
请参阅其他一些问题:
Generally INNER JOIN order won't matter because inner joins are commutative and associative. In both cases, you still have
t0 inner join t4
so should make no difference.Re-phrasing that, SQL is declarative: you say "what you want", not "how". The optimiser works the "how" and will re-order JOINs as needed, looking as WHEREs etc too in practice.
In complex queries, a cost based query optimiser won't exhaust all permutation so it could matter occasionally.
So, I'd check for these:
See some other SO questions:
如果你有超过 2 个表,那么排序表连接就很重要。它可以产生很大的差异。第一个表应该得到一个主要提示。第一个表是具有最多选择性行的对象。例如:如果您有一个包含 1.000.000 人的成员表,并且您只想选择女性,并且它是第一个表,那么您只需将 500.000 条记录连接到下一个表。如果该表位于连接顺序的末尾(可能是表 4,5 或 6),则将连接每条记录(最坏情况为 1.000.000)。这包括内部连接和外部连接。
规则:从最具选择性的表开始,然后加入下一个逻辑上最具选择性的表。
功能转换和美化应该放在最后。有时最好
将完整的 SQL 捆绑在括号中,并在外部选择语句中使用表达式和函数。
If u have more than 2 tables it is important to order table joins. It can make big differences. First table should get a leading hint. First table is that object with most selective rows. For example: If u have a member table with 1.000.000 people and you only want to select female gender and it is first table, so you only join 500.000 records to next table. If this table is at the end of join order (maybe table 4,5 or 6) then each record (worst case 1.000.000) will be joined. This includes inner and outer joins.
The Rule: Start with most selective table, then join next logical most selective table.
Converting functions and beautifying should do last. Sometimes it is better to
bundle the shole SQL in brackets and use expressions and functions in outer select statements.
在左连接的情况下,它会对性能产生很大影响。我在选择查询中遇到了这样的问题:
查询需要 13 到 15 秒才能执行,当我更改顺序时,它需要 1 到 2 秒。
就我而言,我更改顺序,以防当我加载表时,我使用在后面的联接中使用该表的所有联接,而不是将其加载到另一个块中。就像在我的 p0_ 表中一样,我在前 4 行中进行了所有左连接,这与第一个代码中不同。
PS:为了测试我在 postgre 中的性能,我使用这个网站: http://tatiyants.com/pev /#/计划/新
In the case of left join it impact a lot the performance. i was having a problem in a select query that was like that :
the query takes 13 to 15 seconde to execute, when i change the order its takes 1 to 2 seconde.
in my case i change the order in case when i load a table i use all the join that use this table in the join that follow and not to load it in another block. like in my p0_ table i made all the left join in the first 4 lines not like in the first code.
PS: to test my perf in postgre i use this website: http://tatiyants.com/pev/#/plans/new
至少在 SQLite 中,我发现它有很大的不同。实际上,不需要非常复杂的查询就能显示出差异。然而,我的 JOIN 语句位于嵌入子句内。
基本上,正如克里斯蒂安所指出的,您应该首先从最具体的限制开始。
At least in SQLite, I found out that it makes a huge difference. Actually it didn't need to be a very complex query for the difference to show itself. My JOIN statements were inside an embedded clause however.
Basically, you should start with the most specific limitations first, as Christian has pointed out.