Python 设置不检测重复节点

发布于 2025-01-11 17:34:18 字数 2061 浏览 0 评论 0原文

我创建了一个 Graph 类以及一个我已经测试过的 Node 类。它正在进行基本测试。我现在尝试使用自定义输入文件进行测试,但遇到了一些 Node 被重复的错误。在Graph类中,有一个名为Nodes的集合,其中添加了正在创建的Node。但是,在我的解析器文件中,我有一个检查器,用于检查之前是否已添加具有该值的节点,但它无法正常工作。
即使添加下面的哈希函数后,我仍然得到 set([Alpha, Beta, Hotel, Alpha])
我做错了什么?

测试文件:

directed unweighted
Alpha=Hotel
Alpha=Beta
Beta=Charlie
Charlie=Juliett
Charlie=Juliett
Delta=Foxtrot
Delta=Golf
Echo=Charlie
Echo=Delta
Foxtrot=Golf
Golf=Juliett
Golf=Alpha
Hotel=Echo
Hotel=India
India=Beta
India=Golf
Juliett=Golf

图和节点类:



class Graph:
    def __init__(self):
        self.Nodes = set()

    def addVertex(self, vertex):
        self.Nodes.add(vertex)
    
    def getVertices(self):
        return self.Nodes


    @staticmethod
    def addEdgeDirect(fromEdge, toEdge, cost=1):
        fromEdge.neighbors[toEdge] = cost


class Node():
    def __init__(self, val = None):
        self.value = val
        self.neighbors = {}

    def getEdges(self):
        return self.neighbors.keys()
    
    def __repr__(self):
        return str(self.value)

测试文件解析器:

from graph import Graph, Node

graph = Graph()

file1 = open('Graphs/du.gl', 'r')
Lines = file1.readlines()

direction = Lines[0].split(" ")[0].strip()
weight = Lines[0].split(" ")[1].strip() 

count = 0
if(direction == "directed"):
    if(weight == "unweighted"):
        for line in Lines:
            count += 1
            if(count == 1):
                continue
            node1 = Node(line.split("=")[0].strip())
            node2 = Node(line.split("=")[1].strip())
            if node1 not in graph.Nodes:
                print("not found, to be added")
                graph.addVertex(Node(node1))

            if node2 not in graph.Nodes:
                graph.addVertex(Node(node2))    

            print(graph.getVertices())
            graph.addEdgeDirect(node1,node2)
            # print("Line{}: {}".format(count, line.strip()))


I have created a Graph class along with a Node class that I've tested. It is working with basic tests. I'm now trying to test with a custom input file and I'm running into an error where some of the Nodes are being duplicated. In the Graph class, there is a set called Nodes where the Node being created is added. However, in my parser file I have a checker which checks to see if the Node with the value has been added before but it's not working properly.
Even after adding in the hash function below, I still get set([Alpha, Beta, Hotel, Alpha]).
What am I doing wrong?

Test File:

directed unweighted
Alpha=Hotel
Alpha=Beta
Beta=Charlie
Charlie=Juliett
Charlie=Juliett
Delta=Foxtrot
Delta=Golf
Echo=Charlie
Echo=Delta
Foxtrot=Golf
Golf=Juliett
Golf=Alpha
Hotel=Echo
Hotel=India
India=Beta
India=Golf
Juliett=Golf

Graph and Node Class:



class Graph:
    def __init__(self):
        self.Nodes = set()

    def addVertex(self, vertex):
        self.Nodes.add(vertex)
    
    def getVertices(self):
        return self.Nodes


    @staticmethod
    def addEdgeDirect(fromEdge, toEdge, cost=1):
        fromEdge.neighbors[toEdge] = cost


class Node():
    def __init__(self, val = None):
        self.value = val
        self.neighbors = {}

    def getEdges(self):
        return self.neighbors.keys()
    
    def __repr__(self):
        return str(self.value)

Test File Parser:

from graph import Graph, Node

graph = Graph()

file1 = open('Graphs/du.gl', 'r')
Lines = file1.readlines()

direction = Lines[0].split(" ")[0].strip()
weight = Lines[0].split(" ")[1].strip() 

count = 0
if(direction == "directed"):
    if(weight == "unweighted"):
        for line in Lines:
            count += 1
            if(count == 1):
                continue
            node1 = Node(line.split("=")[0].strip())
            node2 = Node(line.split("=")[1].strip())
            if node1 not in graph.Nodes:
                print("not found, to be added")
                graph.addVertex(Node(node1))

            if node2 not in graph.Nodes:
                graph.addVertex(Node(node2))    

            print(graph.getVertices())
            graph.addEdgeDirect(node1,node2)
            # print("Line{}: {}".format(count, line.strip()))


如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

盛夏尉蓝 2025-01-18 17:34:18

set 中,对象始终通过其对象 ID(引用)来区分。所以定义 __hash__ 也没有帮助。

我建议:

  • 使用字典而不是集合
  • 仅当您已经知道需要一个新实例时才创建 Node 实例 - 因此在检查了 Nodes 字典之后还没有
  • 结果:将字符串值传递给 addVertex 而不是 Node 实例。

通过其他一些小的更改,您的代码可能是:

class Graph:
    def __init__(self):
        self.nodes = {}

    def addVertex(self, key):
        if key not in self.nodes:
            self.nodes[key] = Node(key)
        return self.nodes[key]
    
    def getVertices(self):
        return self.nodes.values()

    @staticmethod
    def addEdgeDirect(fromEdge, toEdge, cost=1):
        fromEdge.neighbors[toEdge] = cost


class Node():
    def __init__(self, val=None):
        self.value = val
        self.neighbors = {}

    def getEdges(self):
        return self.neighbors.keys()
    
    def __repr__(self):
        return str(self.value)
    

graph = Graph()
file1 = open('du.gl', 'r')
firstline, *lines = file1.readlines()
direction, weight = firstline.split()
if direction == "directed":
    if weight == "unweighted":
        for line in lines:
            graph.addEdgeDirect(*(graph.addVertex(key.strip()) 
                                  for key in line.split("=")))
            print(graph.getVertices())

附录

在评论中,您询问 getEdges 以及它如何返回更多信息。

这是该方法的一个变体:

def getEdges(self):
    return { self: self.neighbors }

如果您执行以下操作:

print(graph.nodes['Alpha'].getEdges())

...它将输出:

{Alpha: {Hotel: 1, Beta: 1}}

In a set objects are always distinguished by their object ID (reference). So it doesn't help to define __hash__ either.

I would suggest to:

  • Use a dictionary instead of a set
  • Only create Node instances when you already know that you need a new instance -- so after having checked that the Nodes dictionary doesn't have it yet
  • By consequence: pass the string value to addVertex instead of a Node instance.

With some other little changes, your code could be:

class Graph:
    def __init__(self):
        self.nodes = {}

    def addVertex(self, key):
        if key not in self.nodes:
            self.nodes[key] = Node(key)
        return self.nodes[key]
    
    def getVertices(self):
        return self.nodes.values()

    @staticmethod
    def addEdgeDirect(fromEdge, toEdge, cost=1):
        fromEdge.neighbors[toEdge] = cost


class Node():
    def __init__(self, val=None):
        self.value = val
        self.neighbors = {}

    def getEdges(self):
        return self.neighbors.keys()
    
    def __repr__(self):
        return str(self.value)
    

graph = Graph()
file1 = open('du.gl', 'r')
firstline, *lines = file1.readlines()
direction, weight = firstline.split()
if direction == "directed":
    if weight == "unweighted":
        for line in lines:
            graph.addEdgeDirect(*(graph.addVertex(key.strip()) 
                                  for key in line.split("=")))
            print(graph.getVertices())

Addendum

In comments you ask about getEdges and how it could return more information.

Here is a variant of that method:

def getEdges(self):
    return { self: self.neighbors }

If you then do this:

print(graph.nodes['Alpha'].getEdges())

...it will output:

{Alpha: {Hotel: 1, Beta: 1}}
瀞厅☆埖开 2025-01-18 17:34:18

您期望该集合应包含具有唯一名称的节点,但您没有指定唯一性应取决于这些节点的属性value

您应该将以下方法添加到节点类中:


def __hash__(self):
    return hash(self.value)

You are expecting that the set should contain nodes with unique names, but nowhere do you specify that uniqueness should depend on the property value of those nodes.

You should add the following method to your node class:


def __hash__(self):
    return hash(self.value)

穿透光 2025-01-18 17:34:18

有一些问题,一些与缺乏类型检查有关,一些与类的设计有关。

首先,您有一个 Node 类,您隐含的实例可能应该具有唯一的 self.value,并且该 self.value code> 期望始终是一个字符串(尽管这些期望不包含在代码中)。

导致 set() 行为的一个问题是缺少 __hash__() 方法,这一问题已在另一条评论中解决。由于您似乎希望两个节点相等当且仅当它们的 value 参数是相同的字符串时,因此

def __hash__(self):
    return hash(self.value)

需要添加。但是,为了让 set() 按您想要的方式工作,您还需要添加一个 __eq__() 函数。

def __eq__(self, other):
    return isinstance(other, Node) and self.value == other.value

我不清楚节点的“邻居”属性是否对其身份重要,因为术语 nodegraph 不携带有关类实际是什么的信息或代表,所以也许 neighbors 也应该包含在 __eq__ 中。

添加这些方法后,循环体仍然不正确。问题线是graph.addVertex(Node(node1)),特别是Node(node1)。据说这是为了创建 node1 的副本,但它实际上初始化了一个新节点,其中 newnode.value 现在是 Node 的实例,而不是字符串。使用类型提示和/或显式类型检查有助于捕获这些问题,例如,向初始化主体添加对 isinstance(value, str) 的检查。

从循环体中替换这两个条件,正确的版本是:

if node1 not in graph.Nodes:
    graph.addVertex(node1)
if node2 not in graph.Nodes:
    graph.addVertex(node2)

There's a few issues, some related to a lack of type checking and some from the design of your class.

First, you have a Node class, instances of which you've kind of implied maybe should have a unique self.value, and that self.value is expected to always be a string (although these expectations are not contained in the code).

One problem causing the set() behavior, addressed in another comment, is the lack of a __hash__() method. Since it seems like you maybe want two Nodes to be equal if and only if their value parameter is the same string, adding

def __hash__(self):
    return hash(self.value)

is needed. However, for set() to work like you want, you also need to add an __eq__() function.

def __eq__(self, other):
    return isinstance(other, Node) and self.value == other.value

It's unclear to me whether the 'neighbors' attribute of a node should matter for its identity, as the terms node and graph don't carry information about what the classes actually are or represent, so maybe neighbors should be included in __eq__ too.

After adding those methods, the body of your loop is still incorrect. The problem line is graph.addVertex(Node(node1)) specifically the Node(node1). Supposedly that was intended to create a copy of node1, but it actually initializes a new node, where newnode.value is now an instance of Node, not a string. Using type hints and/or explicit type checking helps catch those problems, for example, adding a check for isinstance(value, str) to the initialization body.

Replacing those two conditionals from the loop body, the correct version would be:

if node1 not in graph.Nodes:
    graph.addVertex(node1)
if node2 not in graph.Nodes:
    graph.addVertex(node2)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文