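How to implement an efficient bidirectional hash table?
Python dict is a very useful data structure:
d = {'a': 1, 'b': 2}
d['a'] # get 1
Sometimes you'd also like to index by values.
d[1] # get 'a'
Which is the most efficient way to implement this data structure? Is there an officially recommended way to do it?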
Here is a class for a bidirectional dict, inspired by "Finding key from value in Python dictionary" and modified to allow the following 2) and 3). Note that:
1) bd.inverse auto-updates itself when the standard dict bd is modified.
2) bd.inverse[value] is always a list of the keys such that bd[key] == value.
3) Unlike the bidict module from https://pypi.python.org/pypi/bidict, here two keys can have the same value, which is very important.
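A class with the three properties listed above could be sketched roughly as follows. This is a minimal illustration, not necessarily the answerer's code: the class name `BiDict` and the helper `_unlink` are assumptions made for this sketch, and only item assignment and deletion keep `inverse` in sync.

```python
class BiDict(dict):
    """dict that also maintains `inverse`, mapping each value to a list of keys."""

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.inverse = {}
        # Insert the initial pairs one by one so that `inverse` is filled too.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def _unlink(self, key):
        # Hypothetical helper: remove `key` from the key list of its current value.
        value = self[key]
        keys = self.inverse[value]
        keys.remove(key)
        if not keys:
            del self.inverse[value]

    def __setitem__(self, key, value):
        if key in self:
            self._unlink(key)  # drop the stale reverse entry first
        super().__setitem__(key, value)
        self.inverse.setdefault(value, []).append(key)

    def __delitem__(self, key):
        self._unlink(key)  # raises KeyError if the key is absent
        super().__delitem__(key)


bd = BiDict({'a': 1, 'b': 2})
bd['c'] = 1    # two keys may share a value
bd['a'] = 5    # reassigning a key moves it within the inverse
del bd['b']
print(bd.inverse)  # {1: ['c'], 5: ['a']}
```

Inherited methods such as `update`, `pop`, and `setdefault` are not overridden in this sketch, so calling them would leave `inverse` out of date.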
You can use the same dict for both directions by also adding each key, value pair in reverse order.
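As a small sketch of that idea, reusing the dict `d` from the question: the reversed pairs are built first and then merged into the same dict. It assumes that the values are hashable and unique and that no value is also used as a key, since such collisions would silently overwrite entries.

```python
d = {'a': 1, 'b': 2}

# Build the reversed pairs first, then merge them into the same dict.
d.update({value: key for key, value in d.items()})

print(d['a'])  # 1
print(d[1])    # a
```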
A poor man's bidirectional hash table would be to use just two dictionaries (these are highly tuned data structures already).
There is also a bidict package on the index, and its source can be found on GitHub.
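The two-dictionary approach might look like the sketch below. The names `forward`, `backward`, and `add_pair` are made up for illustration, and refusing to overwrite an existing key or value is just one simple way to keep the two dicts consistent.

```python
forward = {'a': 1, 'b': 2}
backward = {value: key for key, value in forward.items()}

# If two keys shared a value, the reversed dict would come out shorter.
assert len(backward) == len(forward)


def add_pair(key, value):
    """Add a new pair to both dicts, refusing to overwrite either side."""
    if key in forward or value in backward:
        raise ValueError(f"{key!r} or {value!r} is already mapped")
    forward[key] = value
    backward[value] = key


add_pair('c', 3)
print(forward['c'], backward[3])  # 3 c
```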
An invertible (bijective) map can be implemented as a BijectiveMap class. The advantage of this implementation is that the inverse attribute of a BijectiveMap is again a BijectiveMap, so the inverse can be used, and inverted again, just like the original map.
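One way to get an `inverse` attribute that is itself a BijectiveMap is to create two linked objects that write into each other's storage. The sketch below is an assumption about how such a class could be built, not the answerer's snippet; the private `_data` attribute, the `_inverse` constructor argument, and raising `ValueError` on a duplicate value are all choices made for this illustration.

```python
class BijectiveMap:
    """One-to-one map whose `inverse` is another, linked BijectiveMap."""

    def __init__(self, pairs=(), _inverse=None):
        self._data = {}
        if _inverse is None:
            # Create the partner object and link the two to each other.
            self.inverse = BijectiveMap(_inverse=self)
            for key, value in dict(pairs).items():
                self[key] = value
        else:
            self.inverse = _inverse

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        other = self.inverse._data
        if value in other and other[value] != key:
            raise ValueError(f"value {value!r} is already mapped")
        if key in self._data:
            del other[self._data[key]]  # drop the stale reverse entry
        self._data[key] = value
        other[value] = key

    def __delitem__(self, key):
        value = self._data.pop(key)
        del self.inverse._data[value]

    def __len__(self):
        return len(self._data)


m = BijectiveMap({'a': 1, 'b': 2})
print(m['a'], m.inverse[1])    # 1 a
print(m.inverse.inverse is m)  # True
m.inverse[3] = 'c'             # writing through the inverse updates m too
print(m['c'])                  # 3
```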
You have to decide what you want to happen if more than one key has a given value; the bidirectionality of a given pair could easily be clobbered by some later pair you inserted. I implemented one possible choice.
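To make the decision concrete, here is a sketch of one possible policy, "last write wins": inserting a pair evicts any earlier pair that used the same key or the same value. This is only an illustration of the trade-off; whether it matches the choice the answerer made is unknown, and the names `LastWinsBiMap`, `by_key`, and `by_value` are hypothetical.

```python
_MISSING = object()  # sentinel, so that None can be stored as a key or value


class LastWinsBiMap:
    """Two-way map where a new pair evicts any pair it conflicts with."""

    def __init__(self):
        self.by_key = {}
        self.by_value = {}

    def add(self, key, value):
        # Evict the pair that currently uses this key, if any.
        old_value = self.by_key.pop(key, _MISSING)
        if old_value is not _MISSING:
            del self.by_value[old_value]
        # Evict the pair that currently uses this value, if any.
        old_key = self.by_value.pop(value, _MISSING)
        if old_key is not _MISSING:
            del self.by_key[old_key]
        self.by_key[key] = value
        self.by_value[value] = key


m = LastWinsBiMap()
m.add('a', 1)
m.add('b', 2)
m.add('a', 2)    # clobbers both ('a', 1) and ('b', 2)
print(m.by_key)    # {'a': 2}
print(m.by_value)  # {2: 'a'}
```

The alternative policy would be to raise an error on conflict, which keeps earlier pairs intact at the cost of making the caller handle the failure.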
First, you have to make sure the key-to-value mapping is one-to-one; otherwise it is not possible to build a bidirectional map.
Second, how large is the dataset? If there is not much data, just use two separate maps and update both of them on every change. Or better, use an existing solution like bidict, which is just a wrapper around two dicts with updating and deletion built in.
But if the dataset is large and maintaining two dicts is not desirable:
If both key and value are numeric, consider using interpolation to approximate the mapping. If the vast majority of the key-value pairs can be covered by the mapping function (and its inverse function), then you only need to record the outliers in maps.
If most accesses are unidirectional (key -> value), then it is perfectly fine to build the reverse map incrementally, trading time for space.
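The incremental reverse map from the last point could be sketched like this: reverse entries are created only as lookups require them, and each lookup resumes scanning where the previous one stopped. The class name `LazyReverse` and its attributes are hypothetical, and the sketch assumes the forward dict is one-to-one and is not modified after the wrapper is created.

```python
class LazyReverse:
    """Builds the value -> key map on demand while scanning the forward dict."""

    def __init__(self, forward):
        self.forward = forward
        self._reverse = {}
        # Assumption: `forward` does not change after this point.
        self._unscanned = iter(forward.items())

    def key_for(self, value):
        if value in self._reverse:
            return self._reverse[value]
        # Continue the scan from where the previous lookup stopped.
        for key, val in self._unscanned:
            self._reverse[val] = key
            if val == value:
                return key
        raise KeyError(value)


lookup = LazyReverse({'a': 1, 'b': 2, 'c': 3})
print(lookup.key_for(2))  # b
print(lookup._reverse)    # {1: 'a', 2: 'b'}
```

Only the part of the dict that has actually been scanned takes up space in the reverse map; a miss costs a full scan once, after which every lookup is a plain dict access.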
A better way is to convert the dictionary to a list of tuples and then sort on a specific tuple field.
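A minimal sketch of that suggestion follows. The sorting step is what the answer describes; the binary-search lookup with `bisect` is an addition made here to show how the sorted list could serve lookups by value, and the names `pairs`, `values`, and `key_for` are made up. It assumes the values are mutually comparable.

```python
from bisect import bisect_left
from operator import itemgetter

d = {'a': 1, 'b': 2, 'c': 0}

# List of (key, value) tuples sorted on the value field.
pairs = sorted(d.items(), key=itemgetter(1))
print(pairs)  # [('c', 0), ('a', 1), ('b', 2)]

values = [value for _, value in pairs]


def key_for(value):
    """Find the key for `value` by binary search over the sorted values."""
    i = bisect_left(values, value)
    if i < len(values) and values[i] == value:
        return pairs[i][0]
    raise KeyError(value)


print(key_for(2))  # b
```

Lookups by value take O(log n) here rather than the O(1) of a second dict, and the list has to be rebuilt or kept sorted whenever the dictionary changes.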
Unfortunately, the highest rated answer, bidict, does not work.
There are three options:
Subclass dict: You can create a subclass of dict, but beware. You need to write custom implementations of update, pop, the initializer (__init__), and setdefault, because the dict implementations do not call __setitem__. This is why the highest rated answer has issues.
Inherit from UserDict: This is just like a dict, except that all the routines are made to call correctly. It uses a dict under the hood, in an attribute called data. You can read the Python documentation, or use a simple implementation of a bidirectional dictionary that works in Python 3. Sorry for not including it verbatim: I'm unsure of its copyright.
Inherit from abstract base classes: Inheriting from collections.abc will help you get all the correct protocols and implementations for a new class. This is overkill for a bidirectional dictionary, unless it can also encrypt and cache to a database.
TL;DR -- Use this for your code. Read Trey Hunner's article for details.
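For illustration, the UserDict option could look roughly like the sketch below. It is not the implementation the answer links to; the class name `BiUserDict`, the `inverse` attribute, and the last-write-wins handling of duplicate values are assumptions made for this example.

```python
from collections import UserDict


class BiUserDict(UserDict):
    """One-to-one mapping that keeps `inverse` (value -> key) in sync."""

    def __init__(self, *args, **kwargs):
        self.inverse = {}  # must exist before UserDict.__init__ inserts items
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        if key in self.data:
            del self.inverse[self.data[key]]  # forget the key's old value
        if value in self.inverse:
            del self.data[self.inverse[value]]  # evict the value's old key
        self.data[key] = value
        self.inverse[value] = key

    def __delitem__(self, key):
        del self.inverse[self.data[key]]
        del self.data[key]


bd = BiUserDict({'a': 1})
bd.update(b=2)         # routed through __setitem__
bd.setdefault('c', 3)  # routed through __setitem__
bd.pop('a')            # routed through __delitem__
print(bd.inverse)      # {2: 'b', 3: 'c'}
```

Because UserDict takes `update`, `setdefault`, and `pop` from the mapping mixins, overriding `__setitem__` and `__delitem__` is enough to keep both directions consistent, which is the point the answer makes against subclassing dict directly.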