Tsql - performing a join on a delimited column - performance and optimisation issue

Posted 2024-11-07 04:04:09, 1452 characters, 0 views, 0 comments

I have the following query (slightly simplified in the columns returned).

select Products.Product, Products.ID, Products.Customers
from Products
where Products.orderCompleteDate is null

As an example, this would return:

productA  1  Bob
productA  1  Jane
productB  2  John,Dave

Note that Customers can be a comma-delimited list. What I want to add is a 'Customer Locations' column, so that the above becomes:

productA  1  Bob        Ireland
productA  1  Jane       Wales
productB  2  John,Dave  Scotland,England

I created the function below, where fn_Split returns one row per delimited item.

create FUNCTION [dbo].[GetLocations] (@CustomerNames varchar(256))
RETURNS @TempLocations table (CustomerLocations varchar(256)) AS
begin
    declare @NameStr varchar(256)
    declare @temp table (singleLoc varchar(256))

    -- One location row per customer name found in the delimited list
    insert into @temp
    select CustomerLocation.Location
    from CustomerLocation
    INNER JOIN Customers ON Customers.ID = CustomerLocation.ID
    INNER JOIN dbo.fn_Split(@CustomerNames, ',') split ON split.Item = Customers.Name

    -- Concatenate the locations back into a single comma-separated string
    SELECT @NameStr = COALESCE(@NameStr + ',', '') + singleLoc
    FROM @temp

    insert into @TempLocations values (@NameStr)
    return
end

I then applied it to the original query as follows:

select Products.product, Products.ID, Products.Customers, Locations.CustomerLocations
from Products
OUTER APPLY dbo.GetLocations(Products.Customers) AS Locations
where Products.orderCompleteDate is null

However, this is extremely slow: the query takes about 10 seconds on a table with a mere 2000 rows, whereas the initial query runs almost instantly. This suggests that the query could not be optimised and that the function is being evaluated row by row. I stayed away from scalar-valued functions for this reason and tried to stick to table-valued functions. Is there any glaring fault in my logic/code?

Comments (2)

ま昔日黯然 2024-11-14 04:04:09

I'd normally suggest creating a view over the unnormalized table that does the normalization, and then using that view as the basis for any future queries. Unfortunately, I can't identify a PK for your current Products table, but ideally you'd create this view with schemabinding and then be able to turn it into an indexed view (indexing on PK + customer name).

Querying this view (using Enterprise Edition, or the NOEXPAND option) should then give you performance comparable to what you would get if the normalized table existed.
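
A minimal sketch of the normalizing half of this idea, under stated assumptions: ProductRowID is a hypothetical unique key on Products (the question does not show one), vProductCustomers is an illustrative name, and fn_Split is the asker's splitter with its (string, delimiter) signature and Item column as used in the question. As far as I know, SQL Server's indexed-view rules do not allow APPLY or table-valued functions in the view definition, so this form may only work as a plain, non-indexed view.

```sql
-- Assumption: dbo.Products has a unique key; ProductRowID is a hypothetical name.
-- Assumption: dbo.fn_Split(string, delimiter) returns one row per item in a column named Item.
GO
CREATE VIEW dbo.vProductCustomers
AS
SELECT p.ProductRowID,
       p.Product,
       p.ID,
       p.orderCompleteDate,
       LTRIM(RTRIM(s.Item)) AS CustomerName   -- one row per customer in the list
FROM dbo.Products AS p
CROSS APPLY dbo.fn_Split(p.Customers, ',') AS s;
GO

-- Usage: the location lookup becomes an ordinary set-based join.
SELECT v.Product, v.ID, v.CustomerName, cl.Location
FROM dbo.vProductCustomers AS v
INNER JOIN dbo.Customers AS c ON c.Name = v.CustomerName
INNER JOIN dbo.CustomerLocation AS cl ON cl.ID = c.ID
WHERE v.orderCompleteDate IS NULL;
```

This returns one row per product and customer; the locations would still need to be concatenated again if the comma-separated output from the question is required.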

埋情葬爱 2024-11-14 04:04:09

One option would be to create a second table that normalises the Products table, and to keep it in sync with triggers that call the split function whenever rows are inserted.

Pros: you get standard performance and simple SQL queries.

Cons: the tables can go out of sync should anything go wrong (though you can always schedule a job to rebuild the new table from scratch periodically).

Obviously the best answer would be to redesign the Products table, but I assume that's not possible for you, given that you're messing with split functions etc.
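
The trigger approach could look roughly like the sketch below. It rests on assumptions that are not in the question: ProductRowID is a hypothetical unique key on Products, and ProductCustomers and trg_Products_SyncCustomers are illustrative names. fn_Split is the asker's function, with the (string, delimiter) signature and Item column shown in the question. The trigger also covers updates and deletes, which the description above does not mention.

```sql
-- Assumption: dbo.Products has a unique key; ProductRowID is a hypothetical name.
CREATE TABLE dbo.ProductCustomers (
    ProductRowID int          NOT NULL,
    CustomerName varchar(256) NOT NULL,
    CONSTRAINT PK_ProductCustomers PRIMARY KEY (ProductRowID, CustomerName)
);
GO

CREATE TRIGGER dbo.trg_Products_SyncCustomers
ON dbo.Products
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Drop the normalised rows of every product row that was updated or deleted
    DELETE pc
    FROM dbo.ProductCustomers AS pc
    INNER JOIN deleted AS d ON d.ProductRowID = pc.ProductRowID;

    -- Re-split the customer list of every inserted or updated product row
    INSERT INTO dbo.ProductCustomers (ProductRowID, CustomerName)
    SELECT DISTINCT i.ProductRowID, LTRIM(RTRIM(s.Item))
    FROM inserted AS i
    CROSS APPLY dbo.fn_Split(i.Customers, ',') AS s
    WHERE i.Customers IS NOT NULL;
END;
GO

-- Usage: rebuild the comma-separated locations with set-based joins.
-- FOR XML PATH is used for concatenation so the query does not depend on STRING_AGG.
SELECT p.Product, p.ID, p.Customers,
       STUFF((SELECT ',' + cl.Location
              FROM dbo.ProductCustomers AS pc
              INNER JOIN dbo.Customers AS c ON c.Name = pc.CustomerName
              INNER JOIN dbo.CustomerLocation AS cl ON cl.ID = c.ID
              WHERE pc.ProductRowID = p.ProductRowID
              FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 1, '') AS CustomerLocations
FROM dbo.Products AS p
WHERE p.orderCompleteDate IS NULL;
```

The split function now runs only when a row changes rather than once per row on every read. The concatenated locations are not guaranteed to come back in the same order as the names in Customers.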
