SQL Server 2005 递归查询与数据循环 - 可能吗?

发布于 2024-07-28 14:21:05 字数 842 浏览 5 评论 0原文

我有一个标准的老板/下属员工表。 我需要选择一个老板(由 ID 指定)和他的所有下属(及其下属等)。 不幸的是,现实世界的数据存在一些循环(例如,两个公司所有者都将对方设置为自己的老板)。 带有 CTE 的简单递归查询对此感到窒息(超出了最大递归级别 100)。 员工还能选拔吗? 我不关心它们被选择的顺序,只关心它们每个被选择一次。


Added: You want my query? Umm... OK... I though it is pretty obvious, but - here it is:

with
UserTbl as -- Selects an employee and his subordinates.
(
    select a.[User_ID], a.[Manager_ID] from [User] a WHERE [User_ID] = @UserID
    union all
    select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID])
)
select * from UserTbl


Added 2: Oh, in case it wasn't clear - this is a production system and I have to do a little upgrade (basically add a sort of report). Thus, I'd prefer not to modify the data if it can be avoided.

I've got a standard boss/subordinate employee table. I need to select a boss (specified by ID) and all his subordinates (and their subrodinates, etc). Unfortunately the real world data has some loops in it (for example, both company owners have each other set as their boss). The simple recursive query with a CTE chokes on this (maximum recursion level of 100 exceeded). Can the employees still be selected? I care not of the order in which they are selected, just that each of them is selected once.


Added: You want my query? Umm... OK... I though it is pretty obvious, but - here it is:

with
UserTbl as -- Selects an employee and his subordinates.
(
    select a.[User_ID], a.[Manager_ID] from [User] a WHERE [User_ID] = @UserID
    union all
    select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID])
)
select * from UserTbl


Added 2: Oh, in case it wasn't clear - this is a production system and I have to do a little upgrade (basically add a sort of report). Thus, I'd prefer not to modify the data if it can be avoided.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(9

烂人 2024-08-04 14:40:01

这是我在一个项目中使用的代码,用于上下追踪层次关系树。

用户定义的函数来捕获下属:

CREATE FUNCTION fn_UserSubordinates(@User_ID INT)
RETURNS @SubordinateUsers TABLE (User_ID INT, Distance INT) AS BEGIN
    IF @User_ID IS NULL
        RETURN

    INSERT INTO @SubordinateUsers (User_ID, Distance) VALUES ( @User_ID, 0)

    DECLARE @Distance INT, @Finished BIT
    SELECT @Distance = 1, @Finished = 0

    WHILE @Finished = 0
    BEGIN
        INSERT INTO @SubordinateUsers
            SELECT S.User_ID, @Distance
                FROM Users AS S
                JOIN @SubordinateUsers AS C
                    ON C.User_ID = S.Manager_ID
                LEFT JOIN @SubordinateUsers AS C2
                    ON C2.User_ID = S.User_ID
                WHERE C2.User_ID IS NULL
        IF @@RowCount = 0
            SET @Finished = 1

        SET @Distance = @Distance + 1
    END

    RETURN
END

用户定义的函数来捕获经理:

CREATE FUNCTION fn_UserManagers(@User_ID INT)
RETURNS @User TABLE (User_ID INT, Distance INT) AS BEGIN
    IF @User_ID IS NULL
        RETURN

    DECLARE @Manager_ID INT

    SELECT @Manager_ID = Manager_ID
    FROM UserClasses WITH (NOLOCK)
    WHERE User_ID = @User_ID

    INSERT INTO @UserClasses (User_ID, Distance)
        SELECT User_ID, Distance + 1
        FROM dbo.fn_UserManagers(@Manager_ID)

    INSERT INTO @User (User_ID, Distance) VALUES (@User_ID, 0)

    RETURN
END

This is the code I used on a project to chase up and down hierarchical relationship trees.

User defined function to capture subordinates:

CREATE FUNCTION fn_UserSubordinates(@User_ID INT)
RETURNS @SubordinateUsers TABLE (User_ID INT, Distance INT) AS BEGIN
    IF @User_ID IS NULL
        RETURN

    INSERT INTO @SubordinateUsers (User_ID, Distance) VALUES ( @User_ID, 0)

    DECLARE @Distance INT, @Finished BIT
    SELECT @Distance = 1, @Finished = 0

    WHILE @Finished = 0
    BEGIN
        INSERT INTO @SubordinateUsers
            SELECT S.User_ID, @Distance
                FROM Users AS S
                JOIN @SubordinateUsers AS C
                    ON C.User_ID = S.Manager_ID
                LEFT JOIN @SubordinateUsers AS C2
                    ON C2.User_ID = S.User_ID
                WHERE C2.User_ID IS NULL
        IF @@RowCount = 0
            SET @Finished = 1

        SET @Distance = @Distance + 1
    END

    RETURN
END

User defined function to capture managers:

CREATE FUNCTION fn_UserManagers(@User_ID INT)
RETURNS @User TABLE (User_ID INT, Distance INT) AS BEGIN
    IF @User_ID IS NULL
        RETURN

    DECLARE @Manager_ID INT

    SELECT @Manager_ID = Manager_ID
    FROM UserClasses WITH (NOLOCK)
    WHERE User_ID = @User_ID

    INSERT INTO @UserClasses (User_ID, Distance)
        SELECT User_ID, Distance + 1
        FROM dbo.fn_UserManagers(@Manager_ID)

    INSERT INTO @User (User_ID, Distance) VALUES (@User_ID, 0)

    RETURN
END
就此别过 2024-08-04 14:37:31

更好的解决方案是清理数据并确保将来不会出现任何循环 - 这可以通过触发器或包含在检查约束中的 UDF 来完成。

但是,您可以使用多语句 UDF,如我在此处演示的: 避免无限循环。 第一部分

您可以在连接中添加 NOT IN() 子句来过滤掉循环。

The preferrable solution is to clean up the data and to make sure you do not have any loops in the future - that can be accomplished with a trigger or a UDF wrapped in a check constraint.

However, you can use a multi statement UDF as I demonstrated here: Avoiding infinite loops. Part One

You can add a NOT IN() clause in the join to filter out the cycles.

轮廓§ 2024-08-04 14:35:13

我可以想到两种方法。

1)生成比您想要的更多的行,但要进行检查以确保它不会递归得太深。 然后删除重复的用户记录。

2)使用字符串来保存已经访问过的用户。 就像不在子查询中的想法不起作用一样。

方法 1:

; with TooMuchHierarchy as (
    select "User_ID"
        , Manager_ID 
        , 0 as Depth
    from "User" 
    WHERE "User_ID" = @UserID
    union all
    select U."User_ID"
        , U.Manager_ID
        , M.Depth + 1 as Depth
    from TooMuchHierarchy M
    inner join "User" U 
        on U.Manager_ID = M."user_id"
    where Depth < 100) -- Warning MAGIC NUMBER!!
, AddMaxDepth as (
    select "User_ID"
        , Manager_id
        , Depth
        , max(depth) over (partition by "User_ID") as MaxDepth
    from TooMuchHierarchy)
select "user_id", Manager_Id 
from AddMaxDepth
where Depth = MaxDepth

where Depth < 100 可以防止出现最大递归错误。 减小这个数字,就会产生更少需要丢弃的记录。 如果它太小,员工将不会被返回,因此请确保它至少与所存储的组织结构图的深度一样大。 随着公司的发展,维护有点像噩梦。 如果需要更大,则将选项(maxrecursion ... number ...)添加到整个事物中以允许更多递归。

方法二:

; with Hierarchy as (
    select "User_ID"
        , Manager_ID 
        , '#' + cast("user_id" as varchar(max)) + '#' as user_id_list
    from "User" 
    WHERE "User_ID" = @UserID
    union all
    select U."User_ID"
        , U.Manager_ID
        , M.user_id_list + '#' + cast(U."user_id" as varchar(max)) + '#' as user_id_list
    from Hierarchy M
    inner join "User" U 
        on U.Manager_ID = M."user_id"
    where user_id_list not like '%#' + cast(U."User_id" as varchar(max)) + '#%')
select "user_id", Manager_Id 
from Hierarchy

I can think of two approaches.

1) Produce more rows than you want, but include a check to make sure it does not recurse too deep. Then remove duplicate User records.

2) Use a string to hold the Users already visited. Like the not in subquery idea that didn't work.

Approach 1:

; with TooMuchHierarchy as (
    select "User_ID"
        , Manager_ID 
        , 0 as Depth
    from "User" 
    WHERE "User_ID" = @UserID
    union all
    select U."User_ID"
        , U.Manager_ID
        , M.Depth + 1 as Depth
    from TooMuchHierarchy M
    inner join "User" U 
        on U.Manager_ID = M."user_id"
    where Depth < 100) -- Warning MAGIC NUMBER!!
, AddMaxDepth as (
    select "User_ID"
        , Manager_id
        , Depth
        , max(depth) over (partition by "User_ID") as MaxDepth
    from TooMuchHierarchy)
select "user_id", Manager_Id 
from AddMaxDepth
where Depth = MaxDepth

The line where Depth < 100 is what keeps you from getting the max recursion error. Make this number smaller, and less records will be produced that need to be thrown away. Make it too small and employees won't be returned, so make sure it is at least as large as the depth of the org chart being stored. Bit of a maintence nightmare as the company grows. If it needs to be bigger, then add option (maxrecursion ... number ...) to whole thing to allow more recursion.

Approach 2:

; with Hierarchy as (
    select "User_ID"
        , Manager_ID 
        , '#' + cast("user_id" as varchar(max)) + '#' as user_id_list
    from "User" 
    WHERE "User_ID" = @UserID
    union all
    select U."User_ID"
        , U.Manager_ID
        , M.user_id_list + '#' + cast(U."user_id" as varchar(max)) + '#' as user_id_list
    from Hierarchy M
    inner join "User" U 
        on U.Manager_ID = M."user_id"
    where user_id_list not like '%#' + cast(U."User_id" as varchar(max)) + '#%')
select "user_id", Manager_Id 
from Hierarchy
要走就滚别墨迹 2024-08-04 14:33:24

基本上,如果数据中有这样的循环,您就必须自己执行检索逻辑。
你可以使用一个 cte 来吸引下属,而使用另一个 cte 来吸引老板。

另一个想法是为两个公司所有者设立一个虚拟的老板,这样他们就不会成为彼此的老板,这是荒谬的。 这是我的首选。

basicaly if you have loops like this in data you'll have to do the retreival logic by yourself.
you could use one cte to get only subordinates and other to get bosses.

another idea is to have a dummy row as a boss to both company owners so they wouldn't be each others bosses which is ridiculous. this is my prefferd option.

栖竹 2024-08-04 14:31:43

我知道您不久前问过这个问题,但这里有一个解决方案可以用于检测无限递归循环。 我生成一个路径,并在 CTE 条件中检查用户 ID 是否在路径中,如果是,则不会再次处理它。 希望这可以帮助。

何塞

DECLARE @Table TABLE(
    USER_ID INT,
    MANAGER_ID INT )
INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 1, 2
INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 2, 1
INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 3, 1
INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 4, 3
INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 5, 2

DECLARE @UserID INT
SELECT @UserID = 1

;with
UserTbl as -- Selects an employee and his subordinates.
(
    select 
        '/'+cast( a.USER_ID as varchar(max)) as [path],
        a.[User_ID], 
        a.[Manager_ID] 
    from @Table a 
    where [User_ID] = @UserID
    union all
    select
        b.[path] +'/'+ cast( a.USER_ID as varchar(max)) as [path],
        a.[User_ID], 
        a.[Manager_ID] 
    from @Table a 
    inner join UserTbl b 
        on (a.[Manager_ID]=b.[User_ID])
    where charindex('/'+cast( a.USER_ID as varchar(max))+'/',[path]) = 0
)
select * from UserTbl

I know you asked this question a while ago, but here is a solution that may work for detecting infinite recursive loops. I generate a path and I checked in the CTE condition if the USER ID is in the path, and if it is it wont process it again. Hope this helps.

Jose

DECLARE @Table TABLE(
    USER_ID INT,
    MANAGER_ID INT )
INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 1, 2
INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 2, 1
INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 3, 1
INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 4, 3
INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 5, 2

DECLARE @UserID INT
SELECT @UserID = 1

;with
UserTbl as -- Selects an employee and his subordinates.
(
    select 
        '/'+cast( a.USER_ID as varchar(max)) as [path],
        a.[User_ID], 
        a.[Manager_ID] 
    from @Table a 
    where [User_ID] = @UserID
    union all
    select
        b.[path] +'/'+ cast( a.USER_ID as varchar(max)) as [path],
        a.[User_ID], 
        a.[Manager_ID] 
    from @Table a 
    inner join UserTbl b 
        on (a.[Manager_ID]=b.[User_ID])
    where charindex('/'+cast( a.USER_ID as varchar(max))+'/',[path]) = 0
)
select * from UserTbl
吻风 2024-08-04 14:29:52

您不必递归地执行此操作。 它可以在 WHILE 循环中完成。 我保证它会更快:每次我对这两种技术进行计时时,这对我来说都是如此。 这听起来效率低下,但事实并非如此,因为循环次数就是递归级别。 在每次迭代中,您都可以检查循环并纠正发生循环的位置。 您还可以对临时表施加约束,以便在发生循环时引发错误,尽管您似乎更喜欢更优雅地处理循环的方法。 当 while 循环迭代一定数量的级别时,您还可以触发错误(以捕获未检测到的循环? - 哦,天哪,有时会发生这种情况。

诀窍是重复插入到临时表中(该表由根条目填充) ),包括具有当前迭代编号的列,并在临时表中的最新结果和原始表中的子条目之间进行内连接,只需在 @@rowcount=0 时跳出循环即可!
简单吧?

You don't have to do it recursively. It can be done in a WHILE loop. I guarantee it will be quicker: well it has been for me every time I've done timings on the two techniques. This sounds inefficient but it isn't since the number of loops is the recursion level. At each iteration you can check for looping and correct where it happens. You can also put a constraint on the temporary table to fire an error if looping occurs, though you seem to prefer something that deals with looping more elegantly. You can also trigger an error when the while loop iterates over a certain number of levels (to catch an undetected loop? - oh boy, it sometimes happens.

The trick is to insert repeatedly into a temporary table (which is primed with the root entries), including a column with the current iteration number, and doing an inner join between the most recent results in the temporary table and the child entries in the original table. Just break out of the loop when @@rowcount=0!
Simple eh?

煮茶煮酒煮时光 2024-08-04 14:28:26

不是通用的解决方案,但可能适合您的情况:在您的选择查询中将其修改

select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID])

为:

select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID]) 
   and a.[User_ID] <> @UserID

Not a generic solution, but might work for your case: in your select query modify this:

select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID])

to become:

select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID]) 
   and a.[User_ID] <> @UserID
三生路 2024-08-04 14:27:03

这适用于初始递归链接,但可能不适用于较长的链接,

DECLARE @Table TABLE(
        ID INT,
        PARENTID INT
)

INSERT INTO @Table (ID,PARENTID) SELECT 1, 2

INSERT INTO @Table (ID,PARENTID) SELECT 2, 1

INSERT INTO @Table (ID,PARENTID) SELECT 3, 1

INSERT INTO @Table (ID,PARENTID) SELECT 4, 3

INSERT INTO @Table (ID,PARENTID) SELECT 5, 2


SELECT * FROM @Table

DECLARE @ID INT

SELECT @ID = 1

;WITH boss (ID,PARENTID) AS (
    SELECT  ID,
            PARENTID
    FROM    @Table
    WHERE   PARENTID = @ID
),
 bossChild (ID,PARENTID) AS (
    SELECT  ID,
            PARENTID
    FROM    boss
    UNION ALL
    SELECT  t.ID,
            t.PARENTID
    FROM    @Table t INNER JOIN
            bossChild b ON t.PARENTID = b.ID
    WHERE   t.ID NOT IN (SELECT PARENTID FROM boss)
)
SELECT  *
FROM    bossChild
OPTION (MAXRECURSION 0)

我建议使用 while 循环,并且仅在 id 不存在时才将链接插入到临时表中,从而消除无限循环。

this will work for the initial recursive link, but might not work for longer links

DECLARE @Table TABLE(
        ID INT,
        PARENTID INT
)

INSERT INTO @Table (ID,PARENTID) SELECT 1, 2

INSERT INTO @Table (ID,PARENTID) SELECT 2, 1

INSERT INTO @Table (ID,PARENTID) SELECT 3, 1

INSERT INTO @Table (ID,PARENTID) SELECT 4, 3

INSERT INTO @Table (ID,PARENTID) SELECT 5, 2


SELECT * FROM @Table

DECLARE @ID INT

SELECT @ID = 1

;WITH boss (ID,PARENTID) AS (
    SELECT  ID,
            PARENTID
    FROM    @Table
    WHERE   PARENTID = @ID
),
 bossChild (ID,PARENTID) AS (
    SELECT  ID,
            PARENTID
    FROM    boss
    UNION ALL
    SELECT  t.ID,
            t.PARENTID
    FROM    @Table t INNER JOIN
            bossChild b ON t.PARENTID = b.ID
    WHERE   t.ID NOT IN (SELECT PARENTID FROM boss)
)
SELECT  *
FROM    bossChild
OPTION (MAXRECURSION 0)

what i would recomend is to use a while loop, and only insert links into temp table if the id does not already exist, thus removing endless loops.

半﹌身腐败 2024-08-04 14:25:46

我知道已经有一段时间了,但我认为我应该分享我的经验,因为我尝试了每一个解决方案,这里是我的发现摘要(也许是这篇文章?):

  • 添加具有当前路径的列确实有效,但性能受到影响所以这对我来说不是一个选择。
  • 我找不到使用 CTE 的方法。
  • 我编写了一个递归 SQL 函数,它将employeeIds 添加到表中。 为了避免循环引用,需要进行检查以确保没有重复的 ID 添加到表中。 表现一般,但并不理想。

完成所有这些后,我想到了将[合格]员工的整个子集转储到代码(C#)中并使用递归方法对其进行过滤的想法。 然后,我将筛选后的员工列表写入数据表,并将其作为临时表导出到我的存储过程中。 令我难以置信的是,事实证明,对于小型表和相对较大的表(我尝试了最多 35,000 行的表)来说,这都是最快、最灵活的方法。

I know it has been a while but thought I should share my experience as I tried every single solution and here is a summary of my findings (an maybe this post?):

  • Adding a column with the current path did work but had a performance hit so not an option for me.
  • I could not find a way to do it using CTE.
  • I wrote a recursive SQL function which adds employeeIds to a table. To get around the circular referencing, there is a check to make sure no duplicate IDs are added to the table. The performance was average but was not desirable.

Having done all of that, I came up with the idea of dumping the whole subset of [eligible] employees to code (C#) and filter them there using a recursive method. Then I wrote the filtered list of employees to a datatable and export it to my stored procedure as a temp table. To my disbelief, this proved to be the fastest and most flexible method for both small and relatively large tables (I tried tables of up to 35,000 rows).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文