Python:列表中的意外排序

发布于 2024-10-25 08:20:46 字数 2636 浏览 3 评论 0原文

我在使用 Python 中的列表时遇到了奇怪的行为。我已经实现了一个返回整数列表的方法;特别是,这些是图表中的循环,每个循环都包含三个节点:

simple_cycles = compute_cycles(graph)

这会返回类似这样的内容:

[[40000,20000,30000],[700,500,600],[600,500,700],..]

现在,我需要 (1) 对列表中的每个列表进行排序,之后,我需要 (2) 从列表中删除重复项整个列表,(3) 我需要再次对整个列表进行排序。所需的结果可能如下所示:

[[500,600,700],[20000,30000,40000]]

任务 (1) 是通过在通过compute_cycles 返回内部列表之前对内部列表进行排序来实现的。任务 (2) 和 (3) 是通过执行以下行获得的:

cycles = dict((x[0], x) for x in simple_cycles).values()

这适用于处理的第一个图形。下面的每个图表都会失败,因为内部列表中的顺序有时是错误的。我尝试了最后一行源代码两次,第二次结果与预期不同。例如,我在第二次运行中得到了 as x:

[29837921, 27629939, 27646591]

而不是

[27629939, 27646591, 29837921]

这导致选择 29837921 作为字典中的键而不是 27629939。因此,sorted(x) 的初始排序似乎已经是错误的。但为什么?

我尝试在程序之外重现该行为,但我做不到。在我的应用程序中,我像这样解析 XML 文档:

detector = MyParser()
handler = MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)

..

def parse(self, infile, handler):
  parser = etree.XMLParser(target=handler)
  etree.parse(infile, parser)

例如,在执行时,

detector = MyParser()
handler = MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)
detector.parse(filename, handler)

第二次运行的顺序是意外的。

我知道,我的源代码示例不好自己重现,但也许我在使用列表时缺少一些基本的 Python 内容。

更新

这是列表的创建:

from networkx import dfs_successors

def compute_cycles(graph):
  cycles = []
  for node in graph.nodes():
    a = graph.successors(node);
    for a_node in a:
      b = graph.successors(a_node)
      for next_node in b:
        c = graph.successors(next_node);
        if len(c) > 1:
          if c[0] == node:
            cycles.append(sorted([node, a_node, next_node]))
          elif c[1] == node:
            cycles.append(sorted([node, a_node, next_node]))
        else:
          if c == node:
            cycles.append(sorted([node, a_node, next_node]))
        #fi
      #rof
    #rof
  #rof
  return cycles

更新

如果犯了一个大错误:我已经覆盖了我的 Node 对象的 __repr__ 函数图形,以便它返回一个整数。也许排序失败是因为我处理的是真实对象而不是整数。我以这种方式更改了对 sort 函数的调用:

cycles.append(sorted([node, a_node, next_node], key=lambda revision: revision.rev.revid))

我必须看看这是否会产生影响。节点类定义如下:

class Node(object):
  def __init__(self, revision, revision_hash):
    self.rev = revision
    self.revhash = revision_hash

  def __repr__(self):
    return repr((self.rev.revid))

I've encountering a weird behavior while working with lists in Python. I've implemented a method that returns a list of lists of Integers; in particular, those are cycles within a graph each including three nodes:

simple_cycles = compute_cycles(graph)

That returns me something like this:

[[40000,20000,30000],[700,500,600],[600,500,700],..]

Now, I need to (1) order each list of the list, and after that, I need to (2) remove duplicates from the entire list, and (3) I need to sort that entire list, again. The desired result then might look as follows:

[[500,600,700],[20000,30000,40000]]

Task (1) is achieved by sorting the internal lists prior to returning them via compute_cycles. Tasks (2) and (3) are obtained by executing the following line:

cycles = dict((x[0], x) for x in simple_cycles).values()

This works for the first graph processed. Each following graph fails, because the ordering within the internal lists is sometimes wrong. I tried the last source code line twice, and the second result was other than expected. For example, I got as x in the second run:

[29837921, 27629939, 27646591]

instead of

[27629939, 27646591, 29837921]

This result in choosing 29837921 as the key in the dictionary instead of 27629939. Thus, the initial ordering with sorted(x) seems already to be false. But why?

I tried to reproduce that behavior outside of my program, but I can't. In my application, I am parsing an XML document like this:

detector = MyParser()
handler = MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)

..

def parse(self, infile, handler):
  parser = etree.XMLParser(target=handler)
  etree.parse(infile, parser)

When executing, for example,

detector = MyParser()
handler = MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)
detector.parse(filename, handler)

then the ordering of the second run is unexpected.

I know, my source code example is not good to reproduce it by yourself, but maybe I am missing some elemental Python stuff while working with lists.

Update

Here is the creation of the lists:

from networkx import dfs_successors

def compute_cycles(graph):
  cycles = []
  for node in graph.nodes():
    a = graph.successors(node);
    for a_node in a:
      b = graph.successors(a_node)
      for next_node in b:
        c = graph.successors(next_node);
        if len(c) > 1:
          if c[0] == node:
            cycles.append(sorted([node, a_node, next_node]))
          elif c[1] == node:
            cycles.append(sorted([node, a_node, next_node]))
        else:
          if c == node:
            cycles.append(sorted([node, a_node, next_node]))
        #fi
      #rof
    #rof
  #rof
  return cycles

Update

If made a big mistake: I've overwritten the __repr__ function of my Node object used within the graph, so that it returns an integer. Maybe, the sorting fails because I am dealing with real objects instead of integers. I changed my call to the sort function this way:

cycles.append(sorted([node, a_node, next_node], key=lambda revision: revision.rev.revid))

I'll have to see if that makes a difference. The node class is defined as follows:

class Node(object):
  def __init__(self, revision, revision_hash):
    self.rev = revision
    self.revhash = revision_hash

  def __repr__(self):
    return repr((self.rev.revid))

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

韵柒 2024-11-01 08:20:47

字典不一定保持顺序。他们被允许改变它。将其放入解释器中:{'a': 1, 'b': 2, 'c': 3}。我得到了{'a': 1, 'c': 3, 'b': 2}

Dictionaries do not necessarily keep the order. They are allowed to change it. Put this in the interpreter: {'a': 1, 'b': 2, 'c': 3}. I got {'a': 1, 'c': 3, 'b': 2}.

百合的盛世恋 2024-11-01 08:20:47

我的问题终于解决了。因为我将对象放在列表中而不是简单的 Integers 中,所以我必须使用 sort 方法,如下所示:

sorted([node, a_node, next_node], key=lambda revision: revision.rev.revid))

在这里,我访问包含 Integer< 的成员变量/code>,已由 __str__ 返回。但是,排序时的隐式转换并不稳定。

My problem is finally solved. Because I put objects in lists instead of simple Integers, I had to use the sort method as follows:

sorted([node, a_node, next_node], key=lambda revision: revision.rev.revid))

Here, I am accessing the member variable containing the Integer, which was already returned by __str__. However, the implicit conversion while sorting wasn't stable.

白鸥掠海 2024-11-01 08:20:46

我不明白你为什么使用dict

print sorted(set(tuple(sorted(x)) for x in L))

I don't understand why you're using dict.

print sorted(set(tuple(sorted(x)) for x in L))
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文