SQL Server 查询计划差异

发布于 2024-08-10 10:44:47 字数 2264 浏览 5 评论 0原文

当从参数化查询更改为非参数化查询时，我无法理解 SQL Server 中语句的估计查询计划的行为。

我有以下查询：

DECLARE @p0 UniqueIdentifier = '1fc66e37-6eaf-4032-b374-e7b60fbd25ea'
SELECT [t5].[value2] AS [Date], [t5].[value] AS [New]
FROM (
    SELECT COUNT(*) AS [value], [t4].[value] AS [value2]
    FROM (
        SELECT CONVERT(DATE, [t3].[ServerTime]) AS [value]
        FROM (
            SELECT [t0].[CookieID]
            FROM [dbo].[Usage] AS [t0]
            WHERE ([t0].[CookieID] IS NOT NULL) AND ([t0].[ProductID] = @p0)
            GROUP BY [t0].[CookieID]
            ) AS [t1]
        OUTER APPLY (
            SELECT TOP (1) [t2].[ServerTime]
            FROM [dbo].[Usage] AS [t2]
            WHERE ((([t1].[CookieID] IS NULL) AND ([t2].[CookieID] IS NULL)) 
            OR (([t1].[CookieID] IS NOT NULL) AND ([t2].[CookieID] IS NOT NULL) 
            AND ([t1].[CookieID] = [t2].[CookieID]))) 
            AND ([t2].[CookieID] IS NOT NULL)          
            AND ([t2].[ProductID] = @p0)
            ORDER BY [t2].[ServerTime]
            ) AS [t3]
        ) AS [t4]
    GROUP BY [t4].[value]
    ) AS [t5]
ORDER BY [t5].[value2]

该查询由 Linq2SQL 表达式生成并从 LINQPad 中提取。这会产生一个很好的查询计划（据我所知）并在数据库上执行大约 10 秒。但是，如果我将参数的两次使用替换为精确值，即将两个 '= @p0' 部分替换为 '= '1fc66e37-6eaf-4032-b374-e7b60fbd25ea' ' 我会得到不同的估计查询计划，并且查询现在运行时间要长得多（超过60秒，还没看透）。

为什么执行看似无害的替换会产生效率低得多的查询计划和执行？我已使用“DBCC FreeProcCache”清除了过程缓存，以确保我没有缓存错误的计划，但该行为仍然存在。

我真正的问题是我可以忍受 10 秒的执行时间（至少在很长一段时间内），但我无法忍受 60 秒以上的执行时间。我的查询将（如上所述）由 Linq2SQL 生成，因此它在数据库上执行，这

exec sp_executesql N'
        ...
        WHERE ([t0].[CookieID] IS NOT NULL) AND ([t0].[ProductID] = @p0)
        ...
        AND ([t2].[ProductID] = @p0)
        ...
       ',N'@p0 uniqueidentifier',@p0='1FC66E37-6EAF-4032-B374-E7B60FBD25EA'

会产生同样糟糕的执行时间（我认为这很奇怪，因为这似乎使用参数化查询。

我不是在寻求建议关于要创建哪些索引等，我只是想了解为什么三个看似相似的查询的查询计划和执行如此不同

编辑：我已经上传了非参数化的执行计划。以及参数化查询以及参数化查询的执行计划（如建议） Heinz），此处

希望它能帮助你帮助我:)

原文

I'm having trouble understanding the behavior of the estimated query plans for my statement in SQL Server when a change from a parameterized query to a non-parameterized query.

I have the following query:

DECLARE @p0 UniqueIdentifier = '1fc66e37-6eaf-4032-b374-e7b60fbd25ea'
SELECT [t5].[value2] AS [Date], [t5].[value] AS [New]
FROM (
    SELECT COUNT(*) AS [value], [t4].[value] AS [value2]
    FROM (
        SELECT CONVERT(DATE, [t3].[ServerTime]) AS [value]
        FROM (
            SELECT [t0].[CookieID]
            FROM [dbo].[Usage] AS [t0]
            WHERE ([t0].[CookieID] IS NOT NULL) AND ([t0].[ProductID] = @p0)
            GROUP BY [t0].[CookieID]
            ) AS [t1]
        OUTER APPLY (
            SELECT TOP (1) [t2].[ServerTime]
            FROM [dbo].[Usage] AS [t2]
            WHERE ((([t1].[CookieID] IS NULL) AND ([t2].[CookieID] IS NULL)) 
            OR (([t1].[CookieID] IS NOT NULL) AND ([t2].[CookieID] IS NOT NULL) 
            AND ([t1].[CookieID] = [t2].[CookieID]))) 
            AND ([t2].[CookieID] IS NOT NULL)          
            AND ([t2].[ProductID] = @p0)
            ORDER BY [t2].[ServerTime]
            ) AS [t3]
        ) AS [t4]
    GROUP BY [t4].[value]
    ) AS [t5]
ORDER BY [t5].[value2]

This query is generated by a Linq2SQL expression and extracted from LINQPad. This produces a nice query plan (as far as I can tell) and executes in about 10 seconds on the database. However, if I replace the two uses of parameters with the exact value, that is replace the two '= @p0' parts with '= '1fc66e37-6eaf-4032-b374-e7b60fbd25ea' ' I get a different estimated query plan and the query now runs much longer (more than 60 seconds, haven't seen it through).

Why is it that performing the seemingly innocent replacement produces a much less efficient query plan and execution? I have cleared the procedure cache with 'DBCC FreeProcCache' to ensure that I was not caching a bad plan, but the behavior remains.

My real problem is that I can live with the 10 seconds execution time (at least for a good while) but I can't live with the 60+ sec execution time. My query will (as hinted above) by produced by Linq2SQL so it is executed on the database as

exec sp_executesql N'
        ...
        WHERE ([t0].[CookieID] IS NOT NULL) AND ([t0].[ProductID] = @p0)
        ...
        AND ([t2].[ProductID] = @p0)
        ...
       ',N'@p0 uniqueidentifier',@p0='1FC66E37-6EAF-4032-B374-E7B60FBD25EA'

which produces the same poor execution time (which I think is doubly strange since this seems to be using parameterized queries.

I'm not looking for advise on which indexes to create or the like, I'm just trying to understand why the query plan and execution are so dissimilar on three seemingly similar queries.

EDIT: I have uploaded execution plans for the non-parameterized and the parameterized query as well as an execution plan for a parameterized query (as suggested by Heinz) with a different GUID here

Hope it helps you help me :)

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

￡冰雨忧蓝° 2024-08-17 10:44:47

如果您提供显式值，SQL Server 可以使用该字段的统计信息来做出“更好”的查询计划决策。不幸的是（正如我最近亲身经历的那样），如果统计信息中包含的信息具有误导性，有时 SQL Server 就会做出错误的选择。

如果您想更深入地研究这个问题，我建议您检查一下使用其他 GUID 时会发生什么：如果它对不同的具体 GUID 使用不同的查询计划，则表明使用了统计数据。在这种情况下，您可能需要查看 sp_updatestats 和相关命令。

编辑：查看DBCC SHOW_STATISTICS：“慢”和“快”GUID 可能位于直方图中的不同桶中。我遇到了类似的问题< /a>，我通过添加 INDEX 解决了这个问题SQL 的表提示，它“指导”SQL Server 找到“正确”的查询计划。基本上，我研究了“快速”查询期间使用的索引，并将它们硬编码到 SQL 中。这远非最佳或优雅的解决方案，但我还没有找到更好的解决方案......

回复收藏 0 原文

木落 2024-08-17 10:44:47

我并不是在寻求有关创建哪些索引等的建议，我只是想了解为什么三个看似相似的查询的查询计划和执行如此不同。

您似乎有两个索引：

IX_NonCluster_Config (ProductID, ServerTime)
IX_NonCluster_ProductID_CookieID_With_ServerTime (ProductID, CookieID) INCLUDE (ServerTime)

第一个索引不覆盖 CookieID 但按 ServerTime 排序，因此对于 less 选择性 < code>ProductID 的（即您有很多的）

第二个索引确实涵盖了所有列，但未排序，因此对于更多选择性 ProductID（那些你很少的）。

平均而言，ProductID 基数使得 SQL Server 期望第二种方法高效，这就是当您使用参数化查询或显式提供选择性 时使用的方法GUID 的。

但是，您的原始 GUID 被认为选择性较低，这就是使用第一种方法的原因。

不幸的是，第一种方法需要对 CookieID 进行额外的过滤，这就是它实际上效率较低的原因。

I'm not looking for advise on which indexes to create or the like, I'm just trying to understand why the query plan and execution are so dissimilar on three seemingly similar queries.

You seem to have two indexes:

IX_NonCluster_Config (ProductID, ServerTime)
IX_NonCluster_ProductID_CookieID_With_ServerTime (ProductID, CookieID) INCLUDE (ServerTime)

The first index does not cover CookieID but is ordered on ServerTime and hence is more efficient for the less selective ProductID's (i. e. those that you have many)

The second index does cover all columns but is not ordered, and hence is more efficient for more selective ProductID's (those that you have few).

In average, you ProductID cardinality is so that SQL Server expects the second method to be efficient, which is what it uses when you use parametrized queries or explicitly provide selective GUID's.

However, your original GUID is considered less selective, that's why the first method is used.

Unfortunately, the first method requires additional filtering on CookieID which is why it's less efficient in fact.

回复收藏 0 原文

夜光 2024-08-17 10:44:47

我的猜测是，当您采用非参数化路线时，您的 guid 必须从 varchar 转换为 UniqueIdentifier，这可能会导致索引不被使用，而它将在采用参数化路线时使用。

我已经看到这种情况发生在使用在 where 子句中针对使用日期时间的列具有小日期时间的查询时。

回复收藏 0 原文

同尘 2024-08-17 10:44:47

如果不查看执行计划就很难判断，但是如果我要猜测一个原因，我会说它是参数嗅探和糟糕的统计数据的结合 - 如果你很难 -将 GUID 编码到查询中，查询优化器尝试针对该参数值优化查询。我相信参数化/准备好的查询也会发生同样的事情（这称为参数嗅探 - 执行计划针对第一次执行准备好的语句时使用的参数进行了优化），但是当您声明时这绝对不会发生参数并在查询中使用它。

正如我所说，SQL Server 尝试针对该值优化执行计划，因此通常您应该会看到更好的结果。在这里，它所依据的决策信息似乎是不正确/具有误导性的，并且当它优化通用参数值的查询时，您会更好（出于某种原因）。

然而，这主要是猜测 - 如果不执行就不可能真正判断 - 如果您可以将执行计划上传到某个地方，那么我相信有人能够帮助您了解真正的原因。

回复收藏 0 原文

~没有更多了~