How to implement an efficient bidirectional hash table?

Posted on 2024-09-11 03:03:51 · 207 words · 3 views · 0 comments

Python dict is a very useful data-structure:

d = {'a': 1, 'b': 2}

d['a'] # get 1

Sometimes you'd also like to index by values.

d[1] # get 'a'

What is the most efficient way to implement this data structure? Is there an officially recommended way to do it?

Comments (8)

厌倦 2024-09-18 03:03:51

Here is a class for a bidirectional dict, inspired by "Finding key from value in Python dictionary" and modified to allow points 2 and 3 below.

Note that:

    1. The inverse dictionary bd.inverse updates itself automatically when the standard dict bd is modified through item assignment or deletion.
    2. bd.inverse[value] is always a list of the keys such that bd[key] == value.
    3. Unlike the bidict module from https://pypi.python.org/pypi/bidict, here two keys can have the same value, which is very important.

Code:

class bidict(dict):
    def __init__(self, *args, **kwargs):
        super(bidict, self).__init__(*args, **kwargs)
        # inverse maps each value to the list of keys that hold it.
        self.inverse = {}
        for key, value in self.items():
            self.inverse.setdefault(value, []).append(key)

    def __setitem__(self, key, value):
        if key in self:
            # Detach the key from its old value and drop the list if it is now empty.
            old_value = self[key]
            self.inverse[old_value].remove(key)
            if not self.inverse[old_value]:
                del self.inverse[old_value]
        super(bidict, self).__setitem__(key, value)
        self.inverse.setdefault(value, []).append(key)

    def __delitem__(self, key):
        value = self[key]
        self.inverse[value].remove(key)
        if not self.inverse[value]:
            del self.inverse[value]
        super(bidict, self).__delitem__(key)

Usage example:

bd = bidict({'a': 1, 'b': 2})
print(bd)                     # {'a': 1, 'b': 2}
print(bd.inverse)             # {1: ['a'], 2: ['b']}
bd['c'] = 1                   # Now two keys have the same value (= 1)
print(bd)                     # {'a': 1, 'b': 2, 'c': 1}
print(bd.inverse)             # {1: ['a', 'c'], 2: ['b']}
del bd['c']
print(bd)                     # {'a': 1, 'b': 2}
print(bd.inverse)             # {1: ['a'], 2: ['b']}
del bd['a']
print(bd)                     # {'b': 2}
print(bd.inverse)             # {2: ['b']}
bd['b'] = 3
print(bd)                     # {'b': 3}
print(bd.inverse)             # {3: ['b']}
薔薇婲 2024-09-18 03:03:51

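You can use the same dict itself by also adding each key-value pair in reverse order.

d = {'a': 1, 'b': 2}
# Build the value -> key pairs, then merge them into the same dict.
# This only works if values are hashable and no value collides with a key.
revd = {value: key for key, value in d.items()}
d.update(revd)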
终难愈 2024-09-18 03:03:51

A poor man's bidirectional hash table would be to use just two dictionaries (these are already highly tuned data structures).

There is also a bidict package on the index: https://pypi.python.org/pypi/bidict

The source for bidict can be found on GitHub.
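
As a rough illustration of the two-dictionary approach, here is a minimal sketch. The class name `TwoWayDict` and its method names are hypothetical and are not part of the bidict package. It assumes that values are hashable and that the mapping should stay one-to-one, so a new pair evicts any older pair that shares its key or value.

```python
class TwoWayDict:
    """Hypothetical wrapper around two plain dicts (one-to-one mapping assumed)."""

    def __init__(self):
        self._forward = {}   # key -> value
        self._backward = {}  # value -> key

    def put(self, key, value):
        # Remove any pairs that would conflict, so both dicts stay in sync.
        if key in self._forward:
            del self._backward[self._forward[key]]
        if value in self._backward:
            del self._forward[self._backward[value]]
        self._forward[key] = value
        self._backward[value] = key

    def by_key(self, key):
        return self._forward[key]

    def by_value(self, value):
        return self._backward[value]

    def __len__(self):
        return len(self._forward)


t = TwoWayDict()
t.put('a', 1)
t.put('b', 2)
print(t.by_key('a'))    # 1
print(t.by_value(2))    # b
```

Both lookups are ordinary dict lookups; the cost is roughly double the memory of a single dict.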

嘿哥们儿 2024-09-18 03:03:51

The below snippet of code implements an invertible (bijective) map:

class BijectionError(Exception):
    """Must set a unique value in a BijectiveMap."""

    def __init__(self, value):
        self.value = value
        msg = 'The value "{}" is already in the mapping.'
        super().__init__(msg.format(value))


class BijectiveMap(dict):
    """Invertible map."""

    def __init__(self, inverse=None):
        super().__init__()
        if inverse is None:
            inverse = self.__class__(inverse=self)
        self.inverse = inverse

    def __setitem__(self, key, value):
        if value in self.inverse:
            raise BijectionError(value)

        # When an existing key is reassigned, drop its stale reverse entry.
        if key in self:
            self.inverse._del_item(self[key])

        self.inverse._set_item(value, key)
        self._set_item(key, value)

    def __delitem__(self, key):
        self.inverse._del_item(self[key])
        self._del_item(key)

    # The helpers below bypass the checks above so each map can update its partner.
    def _del_item(self, key):
        super().__delitem__(key)

    def _set_item(self, key, value):
        super().__setitem__(key, value)

The advantage of this implementation is that the inverse attribute of a BijectiveMap is again a BijectiveMap. Therefore you can do things like:

>>> foo = BijectiveMap()
>>> foo['steve'] = 42
>>> foo.inverse
{42: 'steve'}
>>> foo.inverse.inverse
{'steve': 42}
>>> foo.inverse.inverse is foo
True
花桑 2024-09-18 03:03:51

Something like this, maybe:

import itertools

class BidirDict(dict):
    def __init__(self, iterable=(), **kwargs):
        self.update(iterable, **kwargs)
    def update(self, iterable=(), **kwargs):
        # Accept either a mapping or an iterable of (key, value) pairs.
        if hasattr(iterable, 'items'):
            iterable = iterable.items()
        for (key, value) in itertools.chain(iterable, kwargs.items()):
            self[key] = value
    def __setitem__(self, key, value):
        # Evict any existing pair that involves either the key or the value.
        if key in self:
            del self[key]
        if value in self:
            del self[value]
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)
    def __delitem__(self, key):
        # Remove both directions of the pair.
        value = self[key]
        dict.__delitem__(self, key)
        if value != key:
            dict.__delitem__(self, value)
    def __repr__(self):
        return '%s(%s)' % (type(self).__name__, dict.__repr__(self))

You have to decide what you want to happen if more than one key has a given value; the bidirectionality of a given pair could easily be clobbered by some later pair you inserted. I implemented one possible choice.


Example:

bd = BidirDict({'a': 'myvalue1', 'b': 'myvalue2', 'c': 'myvalue2'})
print(bd['myvalue1'])   # a
print(bd['myvalue2'])   # c (the later pair 'c' replaced 'b')
苏辞 2024-09-18 03:03:51

First, you have to make sure the key-to-value mapping is one-to-one; otherwise, it is not possible to build a bidirectional map.

Second, how large is the dataset? If there is not much data, just use two separate maps and update both whenever you make a change. Or better, use an existing solution like bidict, which is just a wrapper around two dicts with updating and deletion built in.

But if the dataset is large and maintaining two dicts is not desirable:

  • If both key and value are numeric, consider using interpolation
    to approximate the mapping. If the vast majority of the
    key-value pairs can be covered by the mapping function (and its
    reverse function), then you only need to record the outliers in maps.

  • If most accesses are unidirectional (key->value), then it is totally
    OK to build the reverse map incrementally, trading time for
    space.

Code:

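d = {1: "one", 2: "two"}
reverse = {}

def get_key_by_value(v):
    # Scan d only on a cache miss, then remember the answer in reverse.
    if v not in reverse:
        for _k, _v in d.items():
            if _v == v:
                reverse[_v] = _k
                break
    return reverse[v]  # raises KeyError if v is not a value in d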
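
The first bullet above (a mapping function plus a table of outliers) might look like the following sketch. The linear rule `value = 2 * key + 1`, the function names, and the sample outliers are all made up for illustration. It assumes integer keys and values, and that no outlier value coincides with a value the rule produces for another key.

```python
# Hypothetical rule that covers most pairs: value = 2 * key + 1.
def rule(key):
    return 2 * key + 1

def inverse_rule(value):
    return (value - 1) // 2

# Only the pairs that break the rule are stored explicitly.
outliers = {10: 0, 20: 8}
outliers_reverse = {v: k for k, v in outliers.items()}

def get_value(key):
    return outliers.get(key, rule(key))

def get_key(value):
    if value in outliers_reverse:
        return outliers_reverse[value]
    key = inverse_rule(value)
    # Reject values the rule cannot produce, or whose key is an outlier.
    if rule(key) != value or key in outliers:
        raise KeyError(value)
    return key

print(get_value(3))   # 7
print(get_key(7))     # 3
print(get_value(10))  # 0
print(get_key(0))     # 10
```

Memory use then grows with the number of outliers rather than with the size of the whole mapping.
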
若有似无的小暗淡 2024-09-18 03:03:51

A better way is to convert the dictionary to a list of tuples and then sort on a specific tuple field:

def convert_to_list(dictionary):
    # Each item becomes a (key, value) tuple.
    return list(dictionary.items())

def sort_list(list_of_tuples, field):
    # field is the tuple index to sort on: 0 for key, 1 for value.
    return sorted(list_of_tuples, key=lambda x: x[field])

dictionary = {'a': 9, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
list_of_tuples = convert_to_list(dictionary)
print(sort_list(list_of_tuples, 1))

Output:

[('b', 2), ('c', 3), ('d', 4), ('e', 5), ('a', 9)]
自由如风 2024-09-18 03:03:51

Unfortunately, the highest-rated answer, bidict, does not work.

There are three options:

  1. Subclass dict: You can create a subclass of dict, but beware. You need to write custom implementations of update, pop, the initializer, and setdefault. The dict implementations do not call __setitem__. This is why the highest-rated answer has issues.

  2. Inherit from UserDict: This is just like a dict, except all the routines are made to call correctly. It uses a dict under the hood, in an attribute called data. You can read the Python documentation, or use a simple implementation of a bidirectional dict that works in Python 3. Sorry for not including it verbatim: I'm unsure of its copyright.

  3. Inherit from abstract base classes: Inheriting from collections.abc will help you get all the correct protocols and implementations for a new class. This is overkill for a bidirectional dictionary, unless it can also encrypt and cache to a database.

TL;DR -- Use this for your code. Read Trey Hunner's article for details.
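
To make option 2 concrete, here is a minimal sketch built on `collections.UserDict`. It is not the implementation the answer links to. The class name `InvertibleDict` is hypothetical, and the sketch assumes a one-to-one mapping with hashable values, where a later pair silently replaces an earlier pair that has the same value.

```python
from collections import UserDict


class InvertibleDict(UserDict):
    """Hypothetical one-to-one dict that keeps an inverse lookup in sync."""

    def __init__(self, *args, **kwargs):
        self.inverse = {}  # value -> key; must exist before UserDict fills data
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        if key in self.data:
            # The key is being reassigned: forget its old value.
            del self.inverse[self.data[key]]
        if value in self.inverse:
            # Another key already owns this value: evict that pair.
            del self.data[self.inverse[value]]
        self.data[key] = value
        self.inverse[value] = key

    def __delitem__(self, key):
        del self.inverse[self.data[key]]
        del self.data[key]


d = InvertibleDict({'a': 1, 'b': 2})
d.update({'c': 3})      # goes through __setitem__, unlike a dict subclass
print(d.inverse)        # {1: 'a', 2: 'b', 3: 'c'}
d.pop('a')              # goes through __delitem__
print(d.inverse)        # {2: 'b', 3: 'c'}
```

Because update, pop and setdefault are inherited from the mutable-mapping mixin and are written in terms of item assignment and deletion, overriding those two methods is enough to keep inverse consistent for them.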
