Python:列表中的意外排序
我在使用 Python 中的列表时遇到了奇怪的行为。我已经实现了一个返回整数列表的方法;特别是,这些是图表中的循环,每个循环都包含三个节点:
simple_cycles = compute_cycles(graph)
这会返回类似这样的内容:
[[40000,20000,30000],[700,500,600],[600,500,700],..]
现在,我需要 (1) 对列表中的每个列表进行排序,之后,我需要 (2) 从列表中删除重复项整个列表,(3) 我需要再次对整个列表进行排序。所需的结果可能如下所示:
[[500,600,700],[20000,30000,40000]]
任务 (1) 是通过在通过compute_cycles 返回内部列表之前对内部列表进行排序来实现的。任务 (2) 和 (3) 是通过执行以下行获得的:
cycles = dict((x[0], x) for x in simple_cycles).values()
这适用于处理的第一个图形。下面的每个图表都会失败,因为内部列表中的顺序有时是错误的。我尝试了最后一行源代码两次,第二次结果与预期不同。例如,我在第二次运行中得到了 as x:
[29837921, 27629939, 27646591]
而不是
[27629939, 27646591, 29837921]
这导致选择 29837921 作为字典中的键而不是 27629939。因此,sorted(x) 的初始排序似乎已经是错误的。但为什么?
我尝试在程序之外重现该行为,但我做不到。在我的应用程序中,我像这样解析 XML 文档:
detector = MyParser()
handler = MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)
..
def parse(self, infile, handler):
parser = etree.XMLParser(target=handler)
etree.parse(infile, parser)
例如,在执行时,
detector = MyParser()
handler = MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)
detector.parse(filename, handler)
第二次运行的顺序是意外的。
我知道,我的源代码示例不好自己重现,但也许我在使用列表时缺少一些基本的 Python 内容。
更新
这是列表的创建:
from networkx import dfs_successors
def compute_cycles(graph):
cycles = []
for node in graph.nodes():
a = graph.successors(node);
for a_node in a:
b = graph.successors(a_node)
for next_node in b:
c = graph.successors(next_node);
if len(c) > 1:
if c[0] == node:
cycles.append(sorted([node, a_node, next_node]))
elif c[1] == node:
cycles.append(sorted([node, a_node, next_node]))
else:
if c == node:
cycles.append(sorted([node, a_node, next_node]))
#fi
#rof
#rof
#rof
return cycles
更新
如果犯了一个大错误:我已经覆盖了我的 Node 对象的 __repr__
函数图形,以便它返回一个整数。也许排序失败是因为我处理的是真实对象而不是整数。我以这种方式更改了对 sort
函数的调用:
cycles.append(sorted([node, a_node, next_node], key=lambda revision: revision.rev.revid))
我必须看看这是否会产生影响。节点类定义如下:
class Node(object):
def __init__(self, revision, revision_hash):
self.rev = revision
self.revhash = revision_hash
def __repr__(self):
return repr((self.rev.revid))
I've encountering a weird behavior while working with lists in Python. I've implemented a method that returns a list of lists of Integers; in particular, those are cycles within a graph each including three nodes:
simple_cycles = compute_cycles(graph)
That returns me something like this:
[[40000,20000,30000],[700,500,600],[600,500,700],..]
Now, I need to (1) order each list of the list, and after that, I need to (2) remove duplicates from the entire list, and (3) I need to sort that entire list, again. The desired result then might look as follows:
[[500,600,700],[20000,30000,40000]]
Task (1) is achieved by sorting the internal lists prior to returning them via compute_cycles. Tasks (2) and (3) are obtained by executing the following line:
cycles = dict((x[0], x) for x in simple_cycles).values()
This works for the first graph processed. Each following graph fails, because the ordering within the internal lists is sometimes wrong. I tried the last source code line twice, and the second result was other than expected. For example, I got as x in the second run:
[29837921, 27629939, 27646591]
instead of
[27629939, 27646591, 29837921]
This result in choosing 29837921 as the key in the dictionary instead of 27629939. Thus, the initial ordering with sorted(x) seems already to be false. But why?
I tried to reproduce that behavior outside of my program, but I can't. In my application, I am parsing an XML document like this:
detector = MyParser()
handler = MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)
..
def parse(self, infile, handler):
parser = etree.XMLParser(target=handler)
etree.parse(infile, parser)
When executing, for example,
detector = MyParser()
handler = MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)
detector.parse(filename, handler)
then the ordering of the second run is unexpected.
I know, my source code example is not good to reproduce it by yourself, but maybe I am missing some elemental Python stuff while working with lists.
Update
Here is the creation of the lists:
from networkx import dfs_successors
def compute_cycles(graph):
cycles = []
for node in graph.nodes():
a = graph.successors(node);
for a_node in a:
b = graph.successors(a_node)
for next_node in b:
c = graph.successors(next_node);
if len(c) > 1:
if c[0] == node:
cycles.append(sorted([node, a_node, next_node]))
elif c[1] == node:
cycles.append(sorted([node, a_node, next_node]))
else:
if c == node:
cycles.append(sorted([node, a_node, next_node]))
#fi
#rof
#rof
#rof
return cycles
Update
If made a big mistake: I've overwritten the __repr__
function of my Node object used within the graph, so that it returns an integer. Maybe, the sorting fails because I am dealing with real objects instead of integers. I changed my call to the sort
function this way:
cycles.append(sorted([node, a_node, next_node], key=lambda revision: revision.rev.revid))
I'll have to see if that makes a difference. The node class is defined as follows:
class Node(object):
def __init__(self, revision, revision_hash):
self.rev = revision
self.revhash = revision_hash
def __repr__(self):
return repr((self.rev.revid))
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
字典不一定保持顺序。他们被允许改变它。将其放入解释器中:
{'a': 1, 'b': 2, 'c': 3}
。我得到了{'a': 1, 'c': 3, 'b': 2}
。Dictionaries do not necessarily keep the order. They are allowed to change it. Put this in the interpreter:
{'a': 1, 'b': 2, 'c': 3}
. I got{'a': 1, 'c': 3, 'b': 2}
.我的问题终于解决了。因为我将对象放在列表中而不是简单的
Integers
中,所以我必须使用sort
方法,如下所示:在这里,我访问包含
Integer< 的成员变量/code>,已由
__str__
返回。但是,排序时的隐式转换并不稳定。My problem is finally solved. Because I put objects in lists instead of simple
Integers
, I had to use thesort
method as follows:Here, I am accessing the member variable containing the
Integer
, which was already returned by__str__
. However, the implicit conversion while sorting wasn't stable.我不明白你为什么使用
dict
。I don't understand why you're using
dict
.