用于检测有向图中循环的有效算法
是否有一种有效的算法来检测有向图中的循环?
我有一个有向图,表示需要执行的作业的时间表,作业是节点,依赖项是边。 我需要检测该图中导致循环依赖的循环错误情况。
最好检测所有周期,以便一次性修复它们。
Is there an efficient algorithm for detecting cycles within a directed graph?
I have a directed graph representing a schedule of jobs that need to be executed, a job being a node and a dependency being an edge. I need to detect the error case of a cycle within this graph leading to cyclic dependencies.
It would be better to detect all cycles so they could be fixed in one go.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(14)
Tarjan 的强连通分量算法 具有
O(|E| + |V|)
时间复杂度。对于其他算法,请参阅 Wikipedia 上的强连通分量。
Tarjan's strongly connected components algorithm has
O(|E| + |V|)
time complexity.For other algorithms, see Strongly connected components on Wikipedia.
鉴于这是一个作业计划,我怀疑在某个时候您会将它们按建议的执行顺序进行排序。
如果是这种情况,那么 拓扑排序 实现在任何情况下都可以检测周期。 UNIX
tsort
确实如此。 我认为,因此在排序的同时检测循环可能比在单独的步骤中更有效。因此,问题可能会变成“如何最有效地进行排序”,而不是“如何最有效地检测循环”。 答案可能是“使用库”,但以下维基百科文章却未能做到这一点:
具有一种算法的伪代码,以及来自 Tarjan 的另一个的简要描述。 两者的时间复杂度都是 O(|V| + |E|)。
Given that this is a schedule of jobs, I suspect that at some point you are going to sort them into a proposed order of execution.
If that's the case, then a topological sort implementation may in any case detect cycles. UNIX
tsort
certainly does. I think it is likely that it is therefore more efficient to detect cycles at the same time as tsorting, rather than in a separate step.So the question might become, "how do I most efficiently tsort", rather than "how do I most efficiently detect loops". To which the answer is probably "use a library", but failing that the following Wikipedia article:
has the pseudo-code for one algorithm, and a brief description of another from Tarjan. Both have
O(|V| + |E|)
time complexity.根据Cormen 等人,算法简介 (CLRS):
这在几个答案中都提到过; 这里我还将提供一个基于 CLRS 第 22 章的代码示例。 示例图如下所示。
CLRS 用于深度优先搜索的伪代码如下:
在 CLRS 图 22.4 的示例中,该图由两棵 DFS 树组成:一棵由节点 u 组成、v、x 和 y,以及节点 w 和 中的另一个z。 每棵树包含一个后边缘:一个从 x 到 v,另一个从 z 到 z(一个自环形)。
关键实现是,在
DFS-VISIT
函数中,迭代u
的邻居v
时,会遇到后沿,节点遇到GRAY
颜色。以下 Python 代码是 CLRS 伪代码的改编,添加了用于检测周期的
if
子句:请注意,在此示例中,未捕获 CLRS 伪代码中的
time
,因为我们只对检测周期感兴趣。 还有一些样板代码用于从边列表构建图的邻接列表表示。执行此脚本时,它会打印以下输出:
这些正是 CLRS 图 22.4 中示例中的后边缘。
According to Lemma 22.11 of Cormen et al., Introduction to Algorithms (CLRS):
This has been mentioned in several answers; here I'll also provide a code example based on chapter 22 of CLRS. The example graph is illustrated below.
CLRS' pseudo-code for depth-first search reads:
In the example in CLRS Figure 22.4, the graph consists of two DFS trees: one consisting of nodes u, v, x, and y, and the other of nodes w and z. Each tree contains one back edge: one from x to v and another from z to z (a self-loop).
The key realization is that a back edge is encountered when, in the
DFS-VISIT
function, while iterating over the neighborsv
ofu
, a node is encountered with theGRAY
color.The following Python code is an adaptation of CLRS' pseudocode with an
if
clause added which detects cycles:Note that in this example, the
time
in CLRS' pseudocode is not captured because we're only interested in detecting cycles. There is also some boilerplate code for building the adjacency list representation of a graph from a list of edges.When this script is executed, it prints the following output:
These are exactly the back edges in the example in CLRS Figure 22.4.
最简单的方法是对图进行深度优先遍历 (DFT)。
如果图有
n
个顶点,则这是一个O(n)
时间复杂度算法。 由于您可能必须从每个顶点开始进行 DFT,因此总复杂度变为O(n^2)
。您必须维护一个包含当前深度优先遍历中所有顶点的堆栈,其中第一个元素是根节点。 如果在 DFT 过程中遇到一个已经在堆栈中的元素,那么就会出现循环。
The simplest way to do it is to do a depth first traversal (DFT) of the graph.
If the graph has
n
vertices, this is aO(n)
time complexity algorithm. Since you will possibly have to do a DFT starting from each vertex, the total complexity becomesO(n^2)
.You have to maintain a stack containing all vertices in the current depth first traversal, with its first element being the root node. If you come across an element which is already in the stack during the DFT, then you have a cycle.
在我看来,检测有向图中循环的最容易理解的算法是图着色算法。
基本上,图着色算法以 DFS 方式遍历图(深度优先搜索,这意味着它在探索另一条路径之前完全探索一条路径)。 当它找到后边缘时,它将图形标记为包含循环。
有关图形着色算法的深入解释,请阅读这篇文章: http://www.geeksforgeeks.org/detect-cycle-direct-graph-using-colors/
此外,我还提供了 JavaScript 中图形着色的实现 https://github.com/dexcodeinc/graph_algorithm.js/blob/master/graph_algorithm.js
In my opinion, the most understandable algorithm for detecting cycle in a directed graph is the graph-coloring-algorithm.
Basically, the graph coloring algorithm walks the graph in a DFS manner (Depth First Search, which means that it explores a path completely before exploring another path). When it finds a back edge, it marks the graph as containing a loop.
For an in depth explanation of the graph coloring algorithm, please read this article: http://www.geeksforgeeks.org/detect-cycle-direct-graph-using-colors/
Also, I provide an implementation of graph coloring in JavaScript https://github.com/dexcodeinc/graph_algorithm.js/blob/master/graph_algorithm.js
从 DFS 开始:当且仅当在 DFS 期间发现后沿时,循环才存在。 这由白路定理得到证明。
Start with a DFS: a cycle exists if and only if a back-edge is discovered during DFS. This is proved as a result of white-path theorum.
如果无法向节点添加“已访问”属性,请使用集合(或映射)并将所有已访问节点添加到集合中,除非它们已经在集合中。 使用唯一的键或对象的地址作为“键”。
这还为您提供了有关循环依赖项的“根”节点的信息,当用户必须解决问题时,这将派上用场。
另一种解决方案是尝试找到下一个要执行的依赖项。 为此,您必须有一些堆栈,可以让您记住您现在所在的位置以及下一步需要做什么。 在执行之前检查依赖项是否已在此堆栈上。 如果是的话,你就找到了一个循环。
虽然这看起来复杂度为 O(N*M),但您必须记住,堆栈的深度非常有限(因此 N 很小),并且 M 会随着每个依赖项而变小,您可以将其检查为“已执行”加上当你找到叶子时,你可以停止搜索(所以你永远不必检查每个节点 - > M也会很小)。
在 MetaMake 中,我将图表创建为列表列表,然后在执行它们时删除每个节点,这自然减少了搜索量。 我实际上从未需要运行独立检查,这一切都在正常执行期间自动发生。
如果您需要“仅测试”模式,只需添加一个“空运行”标志即可禁用实际作业的执行。
If you can't add a "visited" property to the nodes, use a set (or map) and just add all visited nodes to the set unless they are already in the set. Use a unique key or the address of the objects as the "key".
This also gives you the information about the "root" node of the cyclic dependency which will come in handy when a user has to fix the problem.
Another solution is to try to find the next dependency to execute. For this, you must have some stack where you can remember where you are now and what you need to do next. Check if a dependency is already on this stack before you execute it. If it is, you've found a cycle.
While this might seem to have a complexity of O(N*M) you must remember that the stack has a very limited depth (so N is small) and that M becomes smaller with each dependency that you can check off as "executed" plus you can stop the search when you found a leaf (so you never have to check every node -> M will be small, too).
In MetaMake, I created the graph as a list of lists and then deleted every node as I executed them which naturally cut down the search volume. I never actually had to run an independent check, it all happened automatically during normal execution.
If you need a "test only" mode, just add a "dry-run" flag which disables the execution of the actual jobs.
没有算法可以在多项式时间内找到有向图中的所有环。 假设有向图有 n 个节点,每对节点之间都有连接,这意味着您有一个完整的图。 因此,这 n 个节点的任何非空子集都表示一个循环,并且这样的子集有 2^n-1 个。 所以不存在多项式时间算法。
因此,假设您有一个高效(非愚蠢)的算法,它可以告诉您图中有向循环的数量,您可以首先找到强连通分量,然后将您的算法应用于这些连通分量。 因为循环只存在于组件内部,而不存在于组件之间。
There is no algorithm which can find all the cycles in a directed graph in polynomial time. Suppose, the directed graph has n nodes and every pair of the nodes has connections to each other which means you have a complete graph. So any non-empty subset of these n nodes indicates a cycle and there are 2^n-1 number of such subsets. So no polynomial time algorithm exists.
So suppose you have an efficient (non-stupid) algorithm which can tell you the number of directed cycles in a graph, you can first find the strong connected components, then applying your algorithm on these connected components. Since cycles only exist within the components and not between them.
我已经在 sml (命令式编程)中实现了这个问题。 这是概要。 查找所有入度或出度为 0 的节点。 这些节点不能成为循环的一部分(因此将其删除)。 接下来删除这些节点的所有传入或传出边。
将此过程递归地应用于结果图。 如果最后没有留下任何节点或边,则该图没有任何循环,否则它有。
I had implemented this problem in sml ( imperative programming) . Here is the outline . Find all the nodes that either have an indegree or outdegree of 0 . Such nodes cannot be part of a cycle ( so remove them ) . Next remove all the incoming or outgoing edges from such nodes.
Recursively apply this process to the resulting graph. If at the end you are not left with any node or edge , the graph does not have any cycles , else it has.
https://mathoverflow.net/questions/16393/finding-a-cycle-of-固定长度 我最喜欢这个解决方案,特别是对于 4 长度:)
另外,物理向导说你必须做 O(V^2)。 我相信我们只需要 O(V)/O(V+E)。
如果图是连通的,那么 DFS 将访问所有节点。 如果图有连接的子图,那么每次我们在此子图的顶点上运行 DFS 时,我们都会找到连接的顶点,并且不必在下次运行 DFS 时考虑这些顶点。 因此每个顶点运行的可能性是不正确的。
https://mathoverflow.net/questions/16393/finding-a-cycle-of-fixed-length I like this solution the best specially for 4 length:)
Also phys wizard says u have to do O(V^2). I believe that we need only O(V)/O(V+E).
If the graph is connected then DFS will visit all nodes. If the graph has connected sub graphs then each time we run a DFS on a vertex of this sub graph we will find the connected vertices and wont have to consider these for the next run of the DFS. Therefore the possibility of running for each vertex is incorrect.
我这样做的方法是进行拓扑排序,计算访问的顶点数量。 如果该数字小于 DAG 中的顶点总数,则存在循环。
The way I do it is to do a Topological Sort, counting the number of vertices visited. If that number is less than the total number of vertices in the DAG, you have a cycle.
正如您所说,您有一组作业,需要按一定顺序执行。
拓扑排序
为您提供了作业调度所需的顺序(或者如果是直接无环图
,则用于依赖性问题)。 运行 dfs 并维护一个列表,并开始在列表的开头添加节点,如果遇到已经访问过的节点。 然后你在给定的图中找到了一个循环。As you said, you have set of jobs, it need to be executed in certain order.
Topological sort
given you required order for scheduling of jobs(or for dependency problems if it is adirect acyclic graph
). Rundfs
and maintain a list, and start adding node in the beginning of the list, and if you encountered a node which is already visited. Then you found a cycle in given graph.如果 DFS 找到一条指向已访问过的顶点的边,那么那里就有一个循环。
If DFS finds an edge that points to an already-visited vertex, you have a cycle there.
如果一个图满足这个属性
,那么该图至少包含一个循环。
If a graph satisfy this property
then the graph contains at least on cycle.