How to make heapq evaluate the heap off of a specific attribute?

Posted 2024-09-28 01:15:21 | 103 words | 9 views | 0 comments

I wish to hold a heap of objects, not just numbers. They will have an integer attribute in them that the heap can sort by. The easiest way to use heaps in python is heapq, but how do I tell it to sort by a specific attribute when using heapq?

Comments (9)

白云不回头 2024-10-05 01:15:21

According to the example from the documentation, you can use tuples, and it will sort by the first element of the tuple:

>>> from heapq import heappush, heappop
>>> h = []
>>> heappush(h, (5, 'write code'))
>>> heappush(h, (7, 'release product'))
>>> heappush(h, (1, 'write spec'))
>>> heappush(h, (3, 'create tests'))
>>> heappop(h)
(1, 'write spec')

So if you don't want to (or can't) define a comparison method such as __cmp__ (or __lt__ in Python 3), you can manually extract your sorting key at push time.

Note that if the first elements in a pair of tuples are equal, further elements will be compared. If this is not what you want, you need to ensure that each first element is unique.
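One common way to keep first-element ties from falling through to the payload is to put a unique counter between the key and the object. The sketch below assumes a hypothetical Task class with no ordering of its own; the push and pop helpers are illustrative names, not part of heapq.

```python
import heapq
from itertools import count

class Task:
    """Hypothetical payload class that defines no ordering of its own."""
    def __init__(self, name):
        self.name = name

heap = []
tie_breaker = count()  # yields 0, 1, 2, ... so every entry gets a unique number

def push(priority, task):
    # The counter sits between the key and the payload, so two entries with
    # equal priority are ordered by insertion and Task objects are never compared.
    heapq.heappush(heap, (priority, next(tie_breaker), task))

def pop():
    priority, _, task = heapq.heappop(heap)
    return priority, task

push(3, Task('create tests'))
push(1, Task('write spec'))
push(1, Task('review spec'))

first = pop()
second = pop()
print(first[1].name, second[1].name)
```

Without the counter, the two priority-1 entries would tie on the first element and Python would try to compare the Task objects, which raises TypeError in Python 3.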

提笔落墨 2024-10-05 01:15:21

heapq sorts objects the same way list.sort does, so just define a method __cmp__() within your class definition, which will compare itself to another instance of the same class:

def __cmp__(self, other):
    return cmp(self.intAttribute, other.intAttribute)

Works in Python 2.x.

In 3.x use:

def __lt__(self, other):
    return self.intAttribute < other.intAttribute
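As a minimal sketch of the Python 3 variant in context, the class below is hypothetical (Job and label are made-up names; intAttribute mirrors the attribute name used above). It only shows that heapq needs nothing beyond __lt__ to order instances.

```python
import heapq

class Job:
    """Hypothetical class; intAttribute mirrors the attribute name used above."""
    def __init__(self, label, intAttribute):
        self.label = label
        self.intAttribute = intAttribute

    def __lt__(self, other):
        # heapq only ever asks "is a < b?", so __lt__ is sufficient here.
        return self.intAttribute < other.intAttribute

jobs = [Job('c', 30), Job('a', 10), Job('b', 20)]
heapq.heapify(jobs)

# Pop everything; items come out in ascending intAttribute order.
order = [heapq.heappop(jobs).label for _ in range(3)]
print(order)
```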

拥醉 2024-10-05 01:15:21

According to the official documentation, a solution to this is to store entries as tuples (see Sections 8.4.1 and 8.4.2).

For example, suppose each object is stored as a tuple of the form
(key, value_1, value_2)

When you push such tuples onto the heap, the heap compares them by their first element (here, key). If there is a tie, it compares the next element (value_1), and so on.

For example:

import heapq

heap = []
heapq.heappush(heap, (0,'one', 1))
heapq.heappush(heap, (1,'two', 11))
heapq.heappush(heap, (1, 'two', 2))
heapq.heappush(heap, (1, 'one', 3))
heapq.heappush(heap, (1,'two', 3))
heapq.heappush(heap, (1,'one', 4))
heapq.heappush(heap, (1,'two', 5))
heapq.heappush(heap, (1,'one', 1))

show_tree(heap)

Output:

                                      (0, 'one', 1)                                       
                (1, 'one', 1)                                (1, 'one', 4)                
    (1, 'one', 3)         (1, 'two', 3)         (1, 'two', 2)         (1, 'two', 5)     
(1, 'two', 11)

show_tree() is a helper for pretty-printing a heap in Python (link updated); it is not part of heapq.
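Since show_tree() is not defined in the answer itself, here is a rough stand-in for inspecting the same heap. heap_levels is a hypothetical helper written for this sketch, not the linked show_tree(); it just slices the heap list into tree levels instead of centering them.

```python
import heapq

def heap_levels(heap):
    """Split a heapq list into tree levels: [root], [2 children], [4], ..."""
    levels, start, width = [], 0, 1
    while start < len(heap):
        levels.append(heap[start:start + width])
        start += width
        width *= 2  # each level of a binary heap holds twice as many nodes
    return levels

heap = []
for entry in [(0, 'one', 1), (1, 'two', 11), (1, 'two', 2), (1, 'one', 3),
              (1, 'two', 3), (1, 'one', 4), (1, 'two', 5), (1, 'one', 1)]:
    heapq.heappush(heap, entry)

for level in heap_levels(heap):
    print(level)
```

Each printed row corresponds to one row of the tree shown in the output above.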

魄砕の薆 2024-10-05 01:15:21

Python 3 Update

The other answers here are out of date:

  • Some are Python 2 specific. The __cmp__ method doesn't exist anymore.
  • Some do not reflect best practices and target only __lt__ instead of all the rich comparisons as recommended by PEP 8.
  • Some do not use modern tooling such as dataclasses, attrgetter, or total_ordering.

Modern solution with Dataclasses

With dataclasses, it is easy to make a data holder with customized ordering. For example, here is a Person class that excludes the name field from the comparison order:

from dataclasses import dataclass, field

@dataclass(order=True)
class Person:
    name: str = field(compare=False)
    age: int

actors = [
    Person('T Hanks', 65),
    Person('E Olson', 33),
    Person('A Tapping', 58),
]

This works perfectly with heaps:

>>> from heapq import heapify, heappop
>>> heapify(actors)
>>> heappop(actors)
Person(name='E Olson', age=33)
>>> heappop(actors)
Person(name='A Tapping', age=58)
>>> heappop(actors)
Person(name='T Hanks', age=65)

Handling Existing Classes

Sometimes you have to work with the data as provided and need to control the comparison order without changing the original class.

The solution is to add a wrapper with the new comparison. This leaves the original data and its class unchanged. Here is a modern recipe for adding such a wrapper:

from functools import total_ordering
from operator import attrgetter

def new_compare(*field_names):
    extract = attrgetter(*field_names)
    @total_ordering
    class ComparisonWrapper:
        def __init__(self, obj):
            self.obj = obj
        def __eq__(self, other):
            return extract(self.obj) == extract(other.obj)
        def __lt__(self, other):
            return extract(self.obj) < extract(other.obj)
    return ComparisonWrapper

For example, you may be given the following data and cannot alter it or its class directly:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def __repr__(self):
        return f'Person({self.name!r}, {self.age})'

actors = [
    Person('T Hanks', 65),
    Person('E Olson', 33),
    Person('A Tapping', 58),
]

The wrapper can be applied gracefully with map(). To unwrap the data, access the obj attribute:

>>> from heapq import heapify, heappop

>>> data = list(map(new_compare('age'), actors))
>>> heapify(data)
>>> heappop(data).obj
Person('E Olson', 33)
>>> heappop(data).obj
Person('A Tapping', 58)
>>> heappop(data).obj
Person('T Hanks', 65)

Wrappers versus Decorating Tuples

As noted in the modern documentation, the traditional solution with decorating tuples no longer works for some essential use cases. In particular, if the objects in the heap are functions, a tuple in the form of (priority, task) no longer works in Python 3 because functions cannot be compared.

The new suggestion is to use a wrapper such as:

from dataclasses import dataclass, field
from typing import Any

@dataclass(order=True)
class PrioritizedItem:
    priority: int
    item: Any = field(compare=False)

This will always work even if the item objects aren't comparable.
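To make the difference concrete, here is a small sketch that queues plain functions through the wrapper. The two task functions are hypothetical placeholders; PrioritizedItem is repeated so the block runs on its own.

```python
import heapq
from dataclasses import dataclass, field
from typing import Any

@dataclass(order=True)
class PrioritizedItem:
    priority: int
    item: Any = field(compare=False)  # excluded from ordering

def write_spec():   # hypothetical task
    return 'spec written'

def write_code():   # hypothetical task
    return 'code written'

queue = []
heapq.heappush(queue, PrioritizedItem(2, write_code))
heapq.heappush(queue, PrioritizedItem(1, write_spec))
heapq.heappush(queue, PrioritizedItem(1, write_code))  # ties on priority 1

# Only `priority` is compared, so the tie never touches the functions.
results = []
while queue:
    results.append(heapq.heappop(queue).item())
print(results)
```

With bare (priority, task) tuples, the priority-1 tie would make Python compare the two functions and raise TypeError. Note that the wrapper does not promise any particular order among entries with equal priority.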

对你而言 2024-10-05 01:15:21

I feel the simplest way is to override the existing cmp_lt function of the heapq module. A short example:

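import heapq

# Your custom function. Here, comparing tuples a and b based on their 2nd element.
# It is a plain module-level function, so it takes no `self` parameter.
def new_cmp_lt(a, b):
    return a[1] < b[1]

# Override the existing "cmp_lt" module function with your function.
# Caveat: Python 3's heapq has no cmp_lt, and the C-accelerated
# heappush/heappop do not consult it, so this override may have no effect there.
heapq.cmp_lt = new_cmp_lt

# Now use everything as you normally would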

Note: Somebody more qualified should comment if this conflicts with recommended coding practices. But it can still be useful for something "quick & dirty" e.g. in coding interviews with limited time and a lot more to do instead of spending time on subclassing correctly.

甜嗑 2024-10-05 01:15:21

I had the same question, but none of the above answers hit the spot; some were close but not elaborated enough. Anyway, I did some research and tried this piece of code, which will hopefully be sufficient for the next person looking for an answer.

The problem with using a tuple is that the ordering is driven by the first item (later items are only used to break ties), which is not very flexible. I wanted something similar to std::priority_queue in C++, like this:
std::priority_queue<pair<int, int>, vector<pair<int, int>>, comparator> pq;
where I could design my own comparator, which is more common in real-world applications.

Hopefully the below snippet helps:
https://repl.it/@gururajks/EvenAccurateCylinders

import heapq

class PQNode:

    def __init__(self, key, value):
        self.key = key
        self.value = value

    # orders nodes by value (the second field), not by key
    def __lt__(self, other):
        return self.value < other.value

    def __str__(self):
        return "{} : {}".format(self.key, self.value)

# named `nodes` rather than `input` to avoid shadowing the built-in
nodes = [PQNode(1, 4), PQNode(7, 4), PQNode(6, 9), PQNode(2, 5)]
hinput = []
for item in nodes:
    heapq.heappush(hinput, item)

# pop until the heap is empty; nodes come out in ascending value order
while hinput:
    print(heapq.heappop(hinput))

大姐,你呐 2024-10-05 01:15:21

Unfortunately, you can't, although this is an often requested feature.

One option would be to insert (key, value) tuples into the heap. However, that won't work if the values throw an exception when compared (they will be compared in the case of a tie between keys).

A second option would be to define a __lt__ (less-than) method in the class that will use the appropriate attribute to compare the elements for sorting. However, that might not be possible if the objects were created by another package or if you need them to compare differently elsewhere in the program.

A third option would be to use the sortedlist class from the blist module (disclaimer: I'm the author). The constructor for sortedlist takes a key parameter that lets you specify a function to return the sort key of an element, similar to the key parameter of list.sort and sorted.
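If a third-party package is not an option, a key function can be approximated on top of heapq with a thin wrapper. This is only a sketch: KeyedHeap and its method names are hypothetical, and it stores (key, counter, item) entries so that the items themselves are never compared.

```python
import heapq
from itertools import count

class KeyedHeap:
    """Hypothetical min-heap wrapper that orders items by key(item)."""
    def __init__(self, key):
        self._key = key
        self._entries = []
        self._counter = count()  # unique tie-breaker per pushed item

    def push(self, item):
        # The key is computed once, at push time.
        entry = (self._key(item), next(self._counter), item)
        heapq.heappush(self._entries, entry)

    def pop(self):
        # Return only the original item, dropping key and counter.
        return heapq.heappop(self._entries)[-1]

    def __len__(self):
        return len(self._entries)

people = [
    {'name': 'T Hanks', 'age': 65},
    {'name': 'E Olson', 'age': 33},
    {'name': 'A Tapping', 'age': 58},
]
heap = KeyedHeap(key=lambda person: person['age'])
for person in people:
    heap.push(person)

youngest = heap.pop()
print(youngest['name'])
```

Because the key is captured when an item is pushed, mutating an item afterwards does not reorder the heap; that trade-off is the same one the decorated-tuple approach makes.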

冰雪之触 2024-10-05 01:15:21

You could use a heapdict. Note the use of popitem() to get the lowest priority item.

import heapdict as hd
import string
import numpy as np

h = hd.heapdict()
keys = [char for char in string.ascii_lowercase[:10]]
vals = [i for i in np.random.randint(0,10, 10)]
for k,v in zip(keys,vals):
    h[k] = v
for i in range(len(vals)):
    print(h.popitem())

洒一地阳光 2024-10-05 01:15:21

There is a module called heaps. The GitHub address is https://github.com/gekco/heapy. You can supply your own key / sort function when instantiating the class or when creating the heap from an array, which is very useful because it saves you from passing it as an argument every time you perform an operation.

Example where I want the tuple with the smallest element in its last position to be on top of the heap:

>>> from heapy.heap import Heap 
>>> a = [(3, 5, 10), (-5, 3, 8), (7, 8, 9), (-4, 0, 2)]
>>> x = Heap.from_array(a, key=lambda t : t[-1])
>>> x.length
4
>>> x.top()
(-4, 0, 2)
>>> x.insert((-1, 0, 1))
>>> x.length
5
>>> x.top()
(-1, 0, 1)
>>> a
[(3, 5, 10), (-5, 3, 8), (7, 8, 9), (-4, 0, 2)]
 