SQL - Detecting loops in parent-child relationships
I have parent-child data in Excel which gets loaded into a 3rd-party system running MS SQL Server. The data represents a directed (hopefully) acyclic graph. "3rd party" means I don't have a completely free hand in the schema. The Excel data is a concatenation of other files, and it is possible that someone has created a loop in the cross-references between the various files, i.e. X is a child of Y (X->Y) and then elsewhere Y->A->B->X. I can write VB, VBA etc. against the Excel file or the SQL Server database. The Excel file is almost 30k rows, so I'm worried about a combinatorial explosion as the data is set to grow, and techniques like creating a table with all the paths might be pretty unwieldy. I'm thinking of simply writing a program that, for each root, does a tree traversal to each leaf and flags the path if the depth gets greater than some nominal value.
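For illustration, here is a minimal sketch of the traversal idea in Python. It swaps the depth threshold for a depth-first search that tracks which nodes are on the current path, so it visits each node and edge once and never enumerates every path. The function name `find_cycle` and the `(child, parent)` tuple layout are assumptions made for this example, not part of the actual data or schema.

```python
from collections import defaultdict

def find_cycle(edges):
    """Return one cycle as a list of nodes, or None if the graph is acyclic.

    edges: iterable of (child, parent) pairs (hypothetical layout).
    """
    children = defaultdict(list)
    nodes = set()
    for child, parent in edges:
        children[parent].append(child)
        nodes.update((child, parent))

    # WHITE = not visited, GREY = on the current path, BLACK = fully explored
    WHITE, GREY, BLACK = 0, 1, 2
    state = dict.fromkeys(nodes, WHITE)

    for start in nodes:
        if state[start] != WHITE:
            continue
        state[start] = GREY
        path = [start]
        # Explicit stack of iterators avoids Python's recursion limit on deep data
        stack = [iter(children[start])]
        while stack:
            nxt = next(stack[-1], None)
            if nxt is None:
                # All children explored: leave this node
                state[path.pop()] = BLACK
                stack.pop()
            elif state[nxt] == GREY:
                # Reached a node that is already on the current path: a loop
                return path[path.index(nxt):] + [nxt]
            elif state[nxt] == WHITE:
                state[nxt] = GREY
                path.append(nxt)
                stack.append(iter(children[nxt]))
    return None
```

Called on the example from the question, `find_cycle([("X", "Y"), ("Y", "A"), ("A", "B"), ("B", "X")])` returns a list that starts and ends on the same node and contains X, Y, A and B. Unlike the per-root traversal, this also catches loops that are not reachable from any root.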
Better suggestions or pointers to previous discussion welcomed.
Comments (1)
You can use a recursive CTE to detect loops.
This does require you to specify a maximum recursion level: in this case the CTE runs to depth 55, and it flags as erroneous any row with more than 50 levels of children.
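As a rough sketch of the depth-capped recursive CTE described above, the following runs the idea against an in-memory SQLite database through Python's `sqlite3` module, so it can be tried without a SQL Server instance. The table name `links`, its `child` and `parent` columns, and the function name `rows_beyond_depth` are all assumptions for this example. On SQL Server the `RECURSIVE` keyword is omitted, and `OPTION (MAXRECURSION n)` (default 100) separately limits how deep a CTE may recurse.

```python
import sqlite3

def rows_beyond_depth(edges, max_depth=50):
    """Return children whose ancestor chain exceeds max_depth levels.

    edges: list of (child, parent) pairs (hypothetical layout).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE links (child TEXT, parent TEXT)")
    conn.executemany("INSERT INTO links VALUES (?, ?)", edges)
    query = """
        WITH RECURSIVE walk(start, node, depth) AS (
            -- Anchor: every row is a chain of length 1
            SELECT child, parent, 1 FROM links
            UNION ALL
            -- Step: climb one level from the current node to its parent
            SELECT w.start, l.parent, w.depth + 1
            FROM walk AS w
            JOIN links AS l ON l.child = w.node
            WHERE w.depth < :cap   -- hard stop so a loop cannot run forever
        )
        SELECT DISTINCT start FROM walk WHERE depth > :limit ORDER BY start
    """
    # The cap sits a little above the limit, mirroring the 55 / 50 split
    params = {"cap": max_depth + 5, "limit": max_depth}
    rows = conn.execute(query, params).fetchall()
    conn.close()
    return [r[0] for r in rows]
```

With the loop from the question, `rows_beyond_depth([("X", "Y"), ("Y", "A"), ("A", "B"), ("B", "X")])` flags all four nodes, while a plain chain shallower than the limit returns an empty list. A legitimately deep hierarchy would be flagged too, so the limit has to be chosen above the deepest chain expected in real data; the CTE also produces one row per path, so data where many paths share nodes will generate correspondingly more rows.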