Python 设置不检测重复节点
我创建了一个 Graph
类以及一个我已经测试过的 Node
类。它正在进行基本测试。我现在尝试使用自定义输入文件进行测试,但遇到了一些 Node
被重复的错误。在Graph
类中,有一个名为Nodes
的集合,其中添加了正在创建的Node
。但是,在我的解析器文件中,我有一个检查器,用于检查之前是否已添加具有该值的节点,但它无法正常工作。
即使添加下面的哈希函数后,我仍然得到 set([Alpha, Beta, Hotel, Alpha])
。
我做错了什么?
测试文件:
directed unweighted
Alpha=Hotel
Alpha=Beta
Beta=Charlie
Charlie=Juliett
Charlie=Juliett
Delta=Foxtrot
Delta=Golf
Echo=Charlie
Echo=Delta
Foxtrot=Golf
Golf=Juliett
Golf=Alpha
Hotel=Echo
Hotel=India
India=Beta
India=Golf
Juliett=Golf
图和节点类:
class Graph:
def __init__(self):
self.Nodes = set()
def addVertex(self, vertex):
self.Nodes.add(vertex)
def getVertices(self):
return self.Nodes
@staticmethod
def addEdgeDirect(fromEdge, toEdge, cost=1):
fromEdge.neighbors[toEdge] = cost
class Node():
def __init__(self, val = None):
self.value = val
self.neighbors = {}
def getEdges(self):
return self.neighbors.keys()
def __repr__(self):
return str(self.value)
测试文件解析器:
from graph import Graph, Node
graph = Graph()
file1 = open('Graphs/du.gl', 'r')
Lines = file1.readlines()
direction = Lines[0].split(" ")[0].strip()
weight = Lines[0].split(" ")[1].strip()
count = 0
if(direction == "directed"):
if(weight == "unweighted"):
for line in Lines:
count += 1
if(count == 1):
continue
node1 = Node(line.split("=")[0].strip())
node2 = Node(line.split("=")[1].strip())
if node1 not in graph.Nodes:
print("not found, to be added")
graph.addVertex(Node(node1))
if node2 not in graph.Nodes:
graph.addVertex(Node(node2))
print(graph.getVertices())
graph.addEdgeDirect(node1,node2)
# print("Line{}: {}".format(count, line.strip()))
I have created a Graph
class along with a Node
class that I've tested. It is working with basic tests. I'm now trying to test with a custom input file and I'm running into an error where some of the Node
s are being duplicated. In the Graph
class, there is a set called Nodes
where the Node
being created is added. However, in my parser file I have a checker which checks to see if the Node with the value has been added before but it's not working properly.
Even after adding in the hash function below, I still get set([Alpha, Beta, Hotel, Alpha])
.
What am I doing wrong?
Test File:
directed unweighted
Alpha=Hotel
Alpha=Beta
Beta=Charlie
Charlie=Juliett
Charlie=Juliett
Delta=Foxtrot
Delta=Golf
Echo=Charlie
Echo=Delta
Foxtrot=Golf
Golf=Juliett
Golf=Alpha
Hotel=Echo
Hotel=India
India=Beta
India=Golf
Juliett=Golf
Graph and Node Class:
class Graph:
def __init__(self):
self.Nodes = set()
def addVertex(self, vertex):
self.Nodes.add(vertex)
def getVertices(self):
return self.Nodes
@staticmethod
def addEdgeDirect(fromEdge, toEdge, cost=1):
fromEdge.neighbors[toEdge] = cost
class Node():
def __init__(self, val = None):
self.value = val
self.neighbors = {}
def getEdges(self):
return self.neighbors.keys()
def __repr__(self):
return str(self.value)
Test File Parser:
from graph import Graph, Node
graph = Graph()
file1 = open('Graphs/du.gl', 'r')
Lines = file1.readlines()
direction = Lines[0].split(" ")[0].strip()
weight = Lines[0].split(" ")[1].strip()
count = 0
if(direction == "directed"):
if(weight == "unweighted"):
for line in Lines:
count += 1
if(count == 1):
continue
node1 = Node(line.split("=")[0].strip())
node2 = Node(line.split("=")[1].strip())
if node1 not in graph.Nodes:
print("not found, to be added")
graph.addVertex(Node(node1))
if node2 not in graph.Nodes:
graph.addVertex(Node(node2))
print(graph.getVertices())
graph.addEdgeDirect(node1,node2)
# print("Line{}: {}".format(count, line.strip()))
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
在
set
中,对象始终通过其对象 ID(引用)来区分。所以定义 __hash__ 也没有帮助。我建议:
addVertex
而不是Node
实例。通过其他一些小的更改,您的代码可能是:
附录
在评论中,您询问
getEdges
以及它如何返回更多信息。这是该方法的一个变体:
如果您执行以下操作:
...它将输出:
In a
set
objects are always distinguished by their object ID (reference). So it doesn't help to define__hash__
either.I would suggest to:
Node
instances when you already know that you need a new instance -- so after having checked that theNodes
dictionary doesn't have it yetaddVertex
instead of aNode
instance.With some other little changes, your code could be:
Addendum
In comments you ask about
getEdges
and how it could return more information.Here is a variant of that method:
If you then do this:
...it will output:
您期望该集合应包含具有唯一名称的节点,但您没有指定唯一性应取决于这些节点的属性
value
。您应该将以下方法添加到节点类中:
You are expecting that the set should contain nodes with unique names, but nowhere do you specify that uniqueness should depend on the property
value
of those nodes.You should add the following method to your node class:
有一些问题,一些与缺乏类型检查有关,一些与类的设计有关。
首先,您有一个
Node
类,您隐含的实例可能应该具有唯一的self.value
,并且该self.value
code> 期望始终是一个字符串(尽管这些期望不包含在代码中)。导致
set()
行为的一个问题是缺少__hash__()
方法,这一问题已在另一条评论中解决。由于您似乎希望两个节点相等当且仅当它们的value
参数是相同的字符串时,因此需要添加。但是,为了让
set()
按您想要的方式工作,您还需要添加一个__eq__()
函数。我不清楚节点的“邻居”属性是否对其身份重要,因为术语
node
和graph
不携带有关类实际是什么的信息或代表,所以也许neighbors
也应该包含在__eq__
中。添加这些方法后,循环体仍然不正确。问题线是
graph.addVertex(Node(node1))
,特别是Node(node1)
。据说这是为了创建 node1 的副本,但它实际上初始化了一个新节点,其中 newnode.value 现在是 Node 的实例,而不是字符串。使用类型提示和/或显式类型检查有助于捕获这些问题,例如,向初始化主体添加对isinstance(value, str)
的检查。从循环体中替换这两个条件,正确的版本是:
There's a few issues, some related to a lack of type checking and some from the design of your class.
First, you have a
Node
class, instances of which you've kind of implied maybe should have a uniqueself.value
, and thatself.value
is expected to always be a string (although these expectations are not contained in the code).One problem causing the
set()
behavior, addressed in another comment, is the lack of a__hash__()
method. Since it seems like you maybe want two Nodes to be equal if and only if theirvalue
parameter is the same string, addingis needed. However, for
set()
to work like you want, you also need to add an__eq__()
function.It's unclear to me whether the 'neighbors' attribute of a node should matter for its identity, as the terms
node
andgraph
don't carry information about what the classes actually are or represent, so maybeneighbors
should be included in__eq__
too.After adding those methods, the body of your loop is still incorrect. The problem line is
graph.addVertex(Node(node1))
specifically theNode(node1)
. Supposedly that was intended to create a copy of node1, but it actually initializes a new node, where newnode.value is now an instance of Node, not a string. Using type hints and/or explicit type checking helps catch those problems, for example, adding a check forisinstance(value, str)
to the initialization body.Replacing those two conditionals from the loop body, the correct version would be: