CTE 到底如何发挥作用?
我遇到了 这个 CTE 解决方案用于连接行元素,我认为这太棒了,我意识到 CTE 有多么强大。
然而,为了有效地使用这样的工具,我需要知道它如何在内部工作以构建心理图像,这对于像我这样的初学者在不同场景中使用它至关重要。
因此,我尝试放慢上述代码片段的过程,这是代码
USE [NORTHWIND]
GO
/****** Object: Table [dbo].[Products2] Script Date: 10/18/2011 08:55:07 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF OBJECT_ID('Products2','U') IS NOT NULL DROP TABLE [Products2]
CREATE TABLE [dbo].[Products2](
[ProductID] [int] IDENTITY(1,1) NOT NULL,
[ProductName] [nvarchar](40) NOT NULL,
[SupplierID] [int] NULL,
[CategoryID] [int] NULL,
[QuantityPerUnit] [nvarchar](20) NULL,
[UnitPrice] [money] NULL,
[UnitsInStock] [smallint] NULL,
[UnitsOnOrder] [smallint] NULL,
[ReorderLevel] [smallint] NULL,
[Discontinued] [bit] NOT NULL
) ON [PRIMARY]
GO
SET IDENTITY_INSERT [dbo].[Products2] ON
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (1, N'vcbcbvcbvc', 1, 4, N'10 boxes x 20 bags', 18.0000, 39, 0, 10, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (2, N'Changassad', 1, 1, N'24 - 12 oz bottles', 19.0000, 17, 40, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (3, N'Aniseed Syrup', 1, 2, N'12 - 550 ml bottles', 10.0000, 13, 70, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (4, N'Chef Anton''s Cajun Seasoning', 2, 2, N'48 - 6 oz jars', 22.0000, 53, 0, 0, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (5, N'Chef Anton''s Gumbo Mix', 10, 2, N'36 boxes', 21.3500, 0, 0, 0, 1)
SET IDENTITY_INSERT [dbo].[Products2] OFF
GO
IF OBJECT_ID('DELAY_EXEC','FN') IS NOT NULL DROP FUNCTION DELAY_EXEC
GO
CREATE FUNCTION DELAY_EXEC() RETURNS DATETIME
AS
BEGIN
DECLARE @I INT=0
WHILE @I<99999
BEGIN
SELECT @I+=1
END
RETURN GETDATE()
END
GO
WITH CTE (EXEC_TIME, CategoryID, product_list, product_name, length)
AS (SELECT dbo.DELAY_EXEC(),
CategoryID,
CAST('' AS VARCHAR(8000)),
CAST('' AS VARCHAR(8000)),
0
FROM Northwind..Products2
GROUP BY CategoryID
UNION ALL
SELECT dbo.DELAY_EXEC(),
p.CategoryID,
CAST(product_list + CASE
WHEN length = 0 THEN ''
ELSE ', '
END + ProductName AS VARCHAR(8000)),
CAST(ProductName AS VARCHAR(8000)),
length + 1
FROM CTE c
INNER JOIN Northwind..Products2 p
ON c.CategoryID = p.CategoryID
WHERE p.ProductName > c.product_name)
SELECT *
FROM CTE
ORDER BY EXEC_TIME
--SELECT CategoryId, product_list
-- FROM ( SELECT CategoryId, product_list,
-- RANK() OVER ( PARTITION BY CategoryId ORDER BY length DESC )
-- FROM CTE ) D ( CategoryId, product_list, rank )
-- WHERE rank = 1 ;
注释块是串联问题所需的输出,但这不是这里的问题。
我添加了一列 EXEC_TIME 以了解首先添加哪一行。 由于两个原因,输出对我来说看起来不正确
我认为由于条件
p.ProductName > > 会出现冗余数据c.product_name
换句话说,CTE 的第一部分的空行始终小于 Product2 表中的值,因此每次运行时,它都应该再次带来一组已添加的新行。这有什么意义吗?数据的层次结构真的很奇怪,最后一项应该是最长的,看看最后一项是什么?具有
length=1
的项目?
有高手来帮忙吗?提前致谢。
结果示例
EXEC_TIME CategoryID product_list product_name length
----------------------- ----------- ------------------------------------------------------------------- --------------------------------- -----------
2011-10-18 12:46:14.930 1 0
2011-10-18 12:46:14.990 2 0
2011-10-18 12:46:15.050 4 0
2011-10-18 12:46:15.107 4 vcbcbvcbvc vcbcbvcbvc 1
2011-10-18 12:46:15.167 2 Aniseed Syrup Aniseed Syrup 1
2011-10-18 12:46:15.223 2 Chef Anton's Cajun Seasoning Chef Anton's Cajun Seasoning 1
2011-10-18 12:46:15.280 2 Chef Anton's Gumbo Mix Chef Anton's Gumbo Mix 1
2011-10-18 12:46:15.340 2 Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mix Chef Anton's Gumbo Mix 2
2011-10-18 12:46:15.400 2 Aniseed Syrup, Chef Anton's Cajun Seasoning Chef Anton's Cajun Seasoning 2
2011-10-18 12:46:15.463 2 Aniseed Syrup, Chef Anton's Gumbo Mix Chef Anton's Gumbo Mix 2
2011-10-18 12:46:15.520 2 Aniseed Syrup, Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mi Chef Anton's Gumbo Mix 3
2011-10-18 12:46:15.580 1 Changassad Changassad 1
I came across this CTE solution for concatenating row elements and I thought it's brilliant and I realized how powerful CTEs can be.
However, in order to use such a tool effectively I need to know how it works internally to build that mental image which is essential for beginners, like me, to use it in different scenarios.
So I tried to slow motion the process of the above snippet and here is the code
USE [NORTHWIND]
GO
/****** Object: Table [dbo].[Products2] Script Date: 10/18/2011 08:55:07 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF OBJECT_ID('Products2','U') IS NOT NULL DROP TABLE [Products2]
CREATE TABLE [dbo].[Products2](
[ProductID] [int] IDENTITY(1,1) NOT NULL,
[ProductName] [nvarchar](40) NOT NULL,
[SupplierID] [int] NULL,
[CategoryID] [int] NULL,
[QuantityPerUnit] [nvarchar](20) NULL,
[UnitPrice] [money] NULL,
[UnitsInStock] [smallint] NULL,
[UnitsOnOrder] [smallint] NULL,
[ReorderLevel] [smallint] NULL,
[Discontinued] [bit] NOT NULL
) ON [PRIMARY]
GO
SET IDENTITY_INSERT [dbo].[Products2] ON
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (1, N'vcbcbvcbvc', 1, 4, N'10 boxes x 20 bags', 18.0000, 39, 0, 10, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (2, N'Changassad', 1, 1, N'24 - 12 oz bottles', 19.0000, 17, 40, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (3, N'Aniseed Syrup', 1, 2, N'12 - 550 ml bottles', 10.0000, 13, 70, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (4, N'Chef Anton''s Cajun Seasoning', 2, 2, N'48 - 6 oz jars', 22.0000, 53, 0, 0, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (5, N'Chef Anton''s Gumbo Mix', 10, 2, N'36 boxes', 21.3500, 0, 0, 0, 1)
SET IDENTITY_INSERT [dbo].[Products2] OFF
GO
IF OBJECT_ID('DELAY_EXEC','FN') IS NOT NULL DROP FUNCTION DELAY_EXEC
GO
CREATE FUNCTION DELAY_EXEC() RETURNS DATETIME
AS
BEGIN
DECLARE @I INT=0
WHILE @I<99999
BEGIN
SELECT @I+=1
END
RETURN GETDATE()
END
GO
WITH CTE (EXEC_TIME, CategoryID, product_list, product_name, length)
AS (SELECT dbo.DELAY_EXEC(),
CategoryID,
CAST('' AS VARCHAR(8000)),
CAST('' AS VARCHAR(8000)),
0
FROM Northwind..Products2
GROUP BY CategoryID
UNION ALL
SELECT dbo.DELAY_EXEC(),
p.CategoryID,
CAST(product_list + CASE
WHEN length = 0 THEN ''
ELSE ', '
END + ProductName AS VARCHAR(8000)),
CAST(ProductName AS VARCHAR(8000)),
length + 1
FROM CTE c
INNER JOIN Northwind..Products2 p
ON c.CategoryID = p.CategoryID
WHERE p.ProductName > c.product_name)
SELECT *
FROM CTE
ORDER BY EXEC_TIME
--SELECT CategoryId, product_list
-- FROM ( SELECT CategoryId, product_list,
-- RANK() OVER ( PARTITION BY CategoryId ORDER BY length DESC )
-- FROM CTE ) D ( CategoryId, product_list, rank )
-- WHERE rank = 1 ;
The commented block is the desired output for the concatenation problem but it's not the question here.
I've added a column EXEC_TIME to know which row got added first.
The output doesn’t look right to me for two reasons
I thought there would be a redundant data because of the condition
p.ProductName > c.product_name
in another word the first part of the CTE the empty rows are always less then values in the Product2 table so each time it runs it should bring a new set of already added rows once again. Does this make any sense?The hierarchy of data is really weird the last item should be the longest and look what is the last item? An item with
length=1
?
Any expert to the rescue? Thanks in advance.
Sample Results
EXEC_TIME CategoryID product_list product_name length
----------------------- ----------- ------------------------------------------------------------------- --------------------------------- -----------
2011-10-18 12:46:14.930 1 0
2011-10-18 12:46:14.990 2 0
2011-10-18 12:46:15.050 4 0
2011-10-18 12:46:15.107 4 vcbcbvcbvc vcbcbvcbvc 1
2011-10-18 12:46:15.167 2 Aniseed Syrup Aniseed Syrup 1
2011-10-18 12:46:15.223 2 Chef Anton's Cajun Seasoning Chef Anton's Cajun Seasoning 1
2011-10-18 12:46:15.280 2 Chef Anton's Gumbo Mix Chef Anton's Gumbo Mix 1
2011-10-18 12:46:15.340 2 Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mix Chef Anton's Gumbo Mix 2
2011-10-18 12:46:15.400 2 Aniseed Syrup, Chef Anton's Cajun Seasoning Chef Anton's Cajun Seasoning 2
2011-10-18 12:46:15.463 2 Aniseed Syrup, Chef Anton's Gumbo Mix Chef Anton's Gumbo Mix 2
2011-10-18 12:46:15.520 2 Aniseed Syrup, Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mi Chef Anton's Gumbo Mix 3
2011-10-18 12:46:15.580 1 Changassad Changassad 1
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
这是一个有趣的问题,它也帮助我更好地理解递归 CTE。
如果您查看执行计划,您将看到使用了一个假脱机,并且它设置了
WITH STACK
属性。这意味着 行是以类似堆栈的方式读取(后进先出),因此首先运行锚点部分
,然后处理
4
,因为这是添加的最后一行。JOIN
返回添加到假脱机中的 1 行,然后处理这个新添加的行。在这种情况下,Join 不会返回任何内容,因此不会向假脱机中添加任何其他内容,并且会继续处理CategoryID = 2
行。这将返回 3 行,这些行将被添加到假脱机中
,然后以类似的 LIFO 方式依次处理每一行,首先处理添加的任何子行,然后才能继续处理同级行。希望您能够了解此递归逻辑如何解释您观察到的结果,但以防万一您无法进行
C#
模拟返回
This is an interesting question that helped me better understand recursive CTEs too.
If you look at the execution plan you will see that a spool is used and that it has the
WITH STACK
property set. Which means that rows are read in a stack-like manner (Last In First Out)So first the anchor part runs
Then
4
is processed as that is the last row added. TheJOIN
returns 1 row that is added to the spool then this newly added row is processed. In this case the Join returns nothing so there is nothing additional added to the spool and it moves on to processing theCategoryID = 2
row.This returns 3 rows which are added to the spool
then each of these rows are processed in turn in a similar LIFO fashion with any child rows added being dealt with first before processing can move on to the sibling rows. Hopefully you can see how this recursive logic explains your observed results but just in case you can't a
C#
simulationReturns
使用公用表表达式的递归查询页面描述了 CTE 的逻辑:
然而,这只是逻辑流程。与往常一样,使用 SQL,如果结果“相同”,服务器可以根据需要自由地重新排序操作,并且认为重新排序可以更有效地提供结果。
在决定是否对操作重新排序时,通常不会考虑具有副作用的函数的存在(导致延迟,然后返回
GETDATE()
)。重新排序查询的一种明显方式是,它可能决定在完全创建结果集
Ti
之前开始处理结果集Ti+1
- 它可能是这样做比首先完全构造 Ti 更有效,因为新行肯定已经在内存中并且最近已被访问。The page Recursive Queries Using Common Table Expressions describes the logic of CTEs:
However, that's only the logical flow. As always, with SQL, the server is free to reorder operations as it sees fit, if the result will be "the same", and the reordering is perceived to provide the results more efficiently.
The presence of your function with side effects (causing a delay, then returning
GETDATE()
) isn't something that would normally be considered when deciding whether to reorder operations.One obvious way in which the query may be reordered is that it may decide to start working on result set
Ti+1
before it has fully created result setTi
- it may be more efficient to do this than to fully constructTi
first, since the new rows are definitely already in memory and have been accessed recently.