Python 中计算图连通分量的算法

发布于 2024-09-28 21:56:38 字数 1511 浏览 8 评论 0原文

我尝试编写一个脚本来计算图形的连接组件，但我无法得到正确的解决方案。我有一个带有 6 个节点（顶点）的简单图，节点 1 和 2 连接，节点 3 和 4 连接（6 个顶点；1-2,3-4,5,6）。因此该图包含 4 个连通分量。我使用以下脚本来计算连接的组件，但得到错误的结果 (2)。

nodes = [[1, [2], False], [2, [1], False], [3, [4], False], [4, [3], False], [5, [], False], [6, [], False]]
# 6 nodes, every node has an id, list of connected nodes and boolean whether the node has already been visited    

componentsCount = 0

def mark_nodes( list_of_nodes):
    global componentsCount
    componentsCount = 0
    for node in list_of_nodes:
      node[2] = False
      mark_node_auxiliary( node)

def mark_node_auxiliary( node): 
    global componentsCount
    if not node[2] == True: 
      node[2] = True
      for neighbor in node[1]:
        nodes[neighbor - 1][2] = True
        mark_node_auxiliary( nodes[neighbor - 1])
    else:
      unmarkedNodes = []
      for neighbor in node[1]:
        if not nodes[neighbor - 1][2] == True:  # This condition is never met. WHY???
          unmarkedNodes.append( neighbor)
          componentsCount += 1   
      for unmarkedNode in unmarkedNodes:
        mark_node_auxiliary( nodes[unmarkedNode - 1])

def get_connected_components_number( graph):
    result = componentsCount
    mark_nodes( graph)
    for node in nodes:
      if len( node[1]) == 0:      # For every vertex without neighbor...  
        result += 1               # ... increment number of connected components by 1.
    return result

print get_connected_components_number( nodes)

谁能帮我找出错误吗？

原文

I try to write a script that counts connected components of a graph and I can't get the right solution.
I have a simple graph with 6 nodes (vertexes), nodes 1 and 2 are connected, and nodes 3 and 4 are connected (6 vertexes; 1-2,3-4,5,6). So the graph contains 4 connected components. I use following script to count connected components, but I get wrong result (2).

nodes = [[1, [2], False], [2, [1], False], [3, [4], False], [4, [3], False], [5, [], False], [6, [], False]]
# 6 nodes, every node has an id, list of connected nodes and boolean whether the node has already been visited    

componentsCount = 0

def mark_nodes( list_of_nodes):
    global componentsCount
    componentsCount = 0
    for node in list_of_nodes:
      node[2] = False
      mark_node_auxiliary( node)

def mark_node_auxiliary( node): 
    global componentsCount
    if not node[2] == True: 
      node[2] = True
      for neighbor in node[1]:
        nodes[neighbor - 1][2] = True
        mark_node_auxiliary( nodes[neighbor - 1])
    else:
      unmarkedNodes = []
      for neighbor in node[1]:
        if not nodes[neighbor - 1][2] == True:  # This condition is never met. WHY???
          unmarkedNodes.append( neighbor)
          componentsCount += 1   
      for unmarkedNode in unmarkedNodes:
        mark_node_auxiliary( nodes[unmarkedNode - 1])

def get_connected_components_number( graph):
    result = componentsCount
    mark_nodes( graph)
    for node in nodes:
      if len( node[1]) == 0:      # For every vertex without neighbor...  
        result += 1               # ... increment number of connected components by 1.
    return result

print get_connected_components_number( nodes)

Can anyone please help me find the mistake?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

小糖芽 2024-10-05 21:56:38

不相交集数据结构确实可以帮助您在这里编写清晰的代码，请参阅维基百科。

基本思想是将一个集合与图中的每个节点相关联，并且对于每条边，合并其两个端点的集合。如果x.find() == y.find()，则两个集合x和y是相同的。

这是最简单的实现（它有糟糕的最坏情况复杂性），但是维基百科页面上有一些对 DisjointSet 类的优化，上面的几行额外的代码使得这种方法变得高效。为了清楚起见，我省略了它们。

nodes = [[1, [2]], [2, [1]], [3, [4]], [4, [3]], [5, []], [6, []]]

def count_components(nodes):
    sets = {}
    for node in nodes:
      sets[node[0]] = DisjointSet()
    for node in nodes:
        for vtx in node[1]:
            sets[node[0]].union(sets[vtx])
    return len(set(x.find() for x in sets.itervalues()))

class DisjointSet(object):
    def __init__(self):
        self.parent = None

    def find(self):
        if self.parent is None: return self
        return self.parent.find()

    def union(self, other):
        them = other.find()
        us = self.find()
        if them != us:
            us.parent = them

print count_components(nodes)

A disjoint-set datastructure will really help you to write clear code here, see Wikipedia.

The basic idea is that you associate a set with each node in your graph, and for each edge you merge the sets of its two endpoints. Two sets x and y are the same if x.find() == y.find()

Here's the most naive implementation (which has bad worst-case complexity), but there's a couple of optimisations of the DisjointSet class on the wikipedia page above which in a handful of extra lines of code make this efficient. I omitted them for clarity.

nodes = [[1, [2]], [2, [1]], [3, [4]], [4, [3]], [5, []], [6, []]]

def count_components(nodes):
    sets = {}
    for node in nodes:
      sets[node[0]] = DisjointSet()
    for node in nodes:
        for vtx in node[1]:
            sets[node[0]].union(sets[vtx])
    return len(set(x.find() for x in sets.itervalues()))

class DisjointSet(object):
    def __init__(self):
        self.parent = None

    def find(self):
        if self.parent is None: return self
        return self.parent.find()

    def union(self, other):
        them = other.find()
        us = self.find()
        if them != us:
            us.parent = them

print count_components(nodes)

回复收藏 0 原文

清眉祭 2024-10-05 21:56:38

有时编写代码比阅读代码更容易。

通过一些测试，我很确定只要每个连接都是双向的（例如在您的示例中），它就始终可以工作。

def recursivelyMark(nodeID, nodes):
    (connections, visited) = nodes[nodeID]
    if visited:
        return
    nodes[nodeID][1] = True
    for connectedNodeID in connections:
        recursivelyMark(connectedNodeID, nodes)

def main():
    nodes = [[[1], False], [[0], False], [[3], False], [[2], False], [[], False], [[], False]]
    componentsCount = 0
    for (nodeID, (connections, visited)) in enumerate(nodes):
        if visited == False:
            componentsCount += 1
            recursivelyMark(nodeID, nodes)
    print(componentsCount)

if __name__ == '__main__':
    main()

请注意，我从节点信息中删除了 ID，因为它在数组中的位置就是它的 ID。如果这个程序不能满足您的需要，请告诉我。

Sometimes it's easier to write code than to read it.

Put this through some tests, I'm pretty sure it'll always work as long as every connection is bidirectional (such as in your example).

def recursivelyMark(nodeID, nodes):
    (connections, visited) = nodes[nodeID]
    if visited:
        return
    nodes[nodeID][1] = True
    for connectedNodeID in connections:
        recursivelyMark(connectedNodeID, nodes)

def main():
    nodes = [[[1], False], [[0], False], [[3], False], [[2], False], [[], False], [[], False]]
    componentsCount = 0
    for (nodeID, (connections, visited)) in enumerate(nodes):
        if visited == False:
            componentsCount += 1
            recursivelyMark(nodeID, nodes)
    print(componentsCount)

if __name__ == '__main__':
    main()

Note that I removed the ID from the node information since its position in the array is its ID. Let me know if this program doesn't do what you need.

回复收藏 0 原文

影子的影子 2024-10-05 21:56:38

我将我的答案放在这里以缓存我的学习成果。我用深度优先搜索来解决。

给定一个邻接列表，其图表如下所示：

深度优先搜索，递归地触及图中的所有节点。这部分很简单：

    count=0
    # I touch all the nodes and if dfs returns True, count+=1
    for node in graph:
        if dfs(node):
            count+=1

现在我们应该在 dfs 中编写逻辑。如果我们从节点 0 开始，我们将其标记为已访问，然后访问它的邻居。当我们访问邻居时，最终我们访问节点 2，如果我们到达节点 2，则意味着图已连接，因此我们返回 True。

    def dfs(node):
        if node in visited:
            return False
        visited.add(node)
        for neighbor in graph[node]:
            dfs(neighbor)
        # If I get here that means i explored all
        return True

我们从节点0开始，访问到节点2，返回True。由于我编写了 for node in graph:，现在它将从节点 1 开始，但由于节点 1 已经访问过，它将返回 False。

这是完整的代码：

class Solution:
    def count_connected(self,graph):
        visited=set()
        count=0
        def dfs(node):
            if node in visited:
                return False
            visited.add(node)
            for neighbor in graph[node]:
                dfs(neighbor)
            # If I get here that means i explored all
            return True
        # explore all the neightbors of nodes
        for node in graph:
            if dfs(node):
                count+=1
        return count

I place my answer here to cache my learning. I solve with depth first search.

given an adjacency list, its graph looks like this:

Depth first search, is recursively touching all the nodes in graph. this part is simple:

    count=0
    # I touch all the nodes and if dfs returns True, count+=1
    for node in graph:
        if dfs(node):
            count+=1

Now we should write the logic inside dfs. If we start from node 0, we mark it visited, and then we visit its neighbor. As we visit the neighbors, eventually we visit node 2, if we reach node 2 that means graph is connected so we return True.

    def dfs(node):
        if node in visited:
            return False
        visited.add(node)
        for neighbor in graph[node]:
            dfs(neighbor)
        # If I get here that means i explored all
        return True

we started from node 0, we visited till node 2, we returned True. Since I wrote for node in graph:, now it will start from node 1, but since node 1 already visited it will return False.

Here is the full code:

class Solution:
    def count_connected(self,graph):
        visited=set()
        count=0
        def dfs(node):
            if node in visited:
                return False
            visited.add(node)
            for neighbor in graph[node]:
                dfs(neighbor)
            # If I get here that means i explored all
            return True
        # explore all the neightbors of nodes
        for node in graph:
            if dfs(node):
                count+=1
        return count

回复收藏 0 原文

~没有更多了~