如何解决 SQL Server - 内联表值函数执行计划基于参数的变化?
情况如下:
我有一个带有日期时间参数的表值函数,免得说 tdf(p_date) , 过滤大约 200 万行,选择列日期小于 p_date 的行,并计算其他列上的一些聚合值。
它工作得很好,但如果 p_date 是自定义标量值函数(在我的例子中返回一天的结束时间),则执行计划会发生变化,查询执行时间从 1 秒变为 1 分钟。
概念证明表 - 1K 个产品,2M 行:
CREATE TABLE [dbo].[POC](
[Date] [datetime] NOT NULL,
[idProduct] [int] NOT NULL,
[Quantity] [int] NOT NULL
) ON [PRIMARY]
内联表值函数:
CREATE FUNCTION tdf (@p_date datetime)
RETURNS TABLE
AS
RETURN
(
SELECT idProduct, SUM(Quantity) AS TotalQuantity,
max(Date) as LastDate
FROM POC
WHERE (Date < @p_date)
GROUP BY idProduct
)
标量值函数:
CREATE FUNCTION [dbo].[EndOfDay] (@date datetime)
RETURNS datetime
AS
BEGIN
DECLARE @res datetime
SET @res=dateadd(second, -1,
dateadd(day, 1,
dateadd(ms, -datepart(ms, @date),
dateadd(ss, -datepart(ss, @date),
dateadd(mi,- datepart(mi,@date),
dateadd(hh, -datepart(hh, @date), @date))))))
RETURN @res
END
查询 1 - 运行良好
SELECT * FROM [dbo].[tdf] (getdate())
执行计划结束: 流聚合成本 13% <--- 聚集索引扫描成本 86%
查询 2 - 不太好
SELECT * FROM [dbo].[tdf] (dbo.EndOfDay(getdate()))
执行计划结束: 流聚合成本 4% <--- 过滤器成本 12% <--- 聚集索引扫描成本 86%
Here is the situation:
I have a table value function with a datetime parameter ,lest's say tdf(p_date) ,
that filters about two million rows selecting those with column date smaller than p_date and computes some aggregate values on other columns.
It works great but if p_date is a custom scalar value function (returning the end of day in my case) the execution plan is altered an the query goes from 1 sec to 1 minute execution time.
A proof of concept table - 1K products, 2M rows:
CREATE TABLE [dbo].[POC](
[Date] [datetime] NOT NULL,
[idProduct] [int] NOT NULL,
[Quantity] [int] NOT NULL
) ON [PRIMARY]
The inline table value function:
CREATE FUNCTION tdf (@p_date datetime)
RETURNS TABLE
AS
RETURN
(
SELECT idProduct, SUM(Quantity) AS TotalQuantity,
max(Date) as LastDate
FROM POC
WHERE (Date < @p_date)
GROUP BY idProduct
)
The scalar value function:
CREATE FUNCTION [dbo].[EndOfDay] (@date datetime)
RETURNS datetime
AS
BEGIN
DECLARE @res datetime
SET @res=dateadd(second, -1,
dateadd(day, 1,
dateadd(ms, -datepart(ms, @date),
dateadd(ss, -datepart(ss, @date),
dateadd(mi,- datepart(mi,@date),
dateadd(hh, -datepart(hh, @date), @date))))))
RETURN @res
END
Query 1 - Working great
SELECT * FROM [dbo].[tdf] (getdate())
The end of execution plan:
Stream Aggregate Cost 13% <--- Clustered Index Scan Cost 86%
Query 2 - Not so great
SELECT * FROM [dbo].[tdf] (dbo.EndOfDay(getdate()))
The end of execution plan:
Stream Aggregate Cost 4% <--- Filter Cost 12% <--- Clustered Index Scan Cost 86%
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
开销是你的标量函数。
这里的 TVF 像内联宏一样展开,因此
变为
当您使用日终标量函数时,SQL 无法将 EOD(GETDATE()) 计算为常量。
抱歉,我无法快速找到有关 SQL 如何评估这些内容的文章。
我猜想它是针对每一行进行评估的,而不是按照您想要的那样预先进行评估。
我将单独计算 EOD 语句:
我还将其用于 EOD 函数:
编辑:我回答的其他问题
The overhead is your scalar function.
The TVF here is expanded like an inline macro so
becomes
When you use end of day scalar function, SQL can not evaluate the EOD(GETDATE()) as a constant.
I can't find my article quickly on how SQL evaluates this stuff, sorry.
I guess that it's being evaluated for each row, not upfront as you want.
I'd calulate the EOD statement separately:
I'd also use this for the EOD function:
EDIT: Other question I answered
您也可以将 EndOfDay 重写为内联 UDF,并使用嵌套内联 UDF。 示例:
许多嵌套内联 UDF 速度非常快
使用内联 UDF 计算该月的第三个星期三
You can rewrite EndOfDay as an inline UDF too, and use nested inline UDFs. Examples:
Many nested inline UDFs are very fast
Calculating third Wednesday of the month with inline UDFs