如何进行高级 Python 哈希自动生存？

发布于 2024-08-25 16:42:22 字数 672 浏览 14 评论 0原文

这个问题是关于在 Python 中实现完整的 Perl 自动生存。我知道以前有人问过类似的问题，到目前为止，最好的答案是“在 Python 中实现嵌套字典的最佳方法是什么？" .但是，我希望这样做：

a['x']['y'].append('z')

首先不声明 a['x']['y'] = []，或者更确切地说，不声明 a['x'] = {} 两者之一。（请注意，在 Perl 中，您可以执行 push @{$a->{x}{y}}, 'z';。）

我知道 dict 和 list 类不能混合，所以这很难，但我有兴趣看看是否有人有一个巧妙的解决方案，可能涉及从 dict 创建一个继承类，但定义了一个新的 < code>append 方法就可以了？

我也知道这可能会让一些 Python 纯粹主义者感到厌烦，他们会要求我坚持使用 Perl。但是，即使只是为了挑战，我也想看到一些东西。

原文

This question is about implementing the full Perl autovivification in Python. I know similar questions were asked before and so far the best answer is in "What is the best way to implement nested dictionaries in Python?" . However, I'm looking to do this:

a['x']['y'].append('z')

without declaring a['x']['y'] = [] first, or rather, not declaring a['x'] = {} either. (Note in Perl you can do push @{$a->{x}{y}}, 'z';.)

I know dict and list classes sorta don't mix, so this is hard, but I'm interested in seeing if someone has an ingenious solution probably involving creating an inherited class from dict but defined a new append method on it?

I also know this might throw off some Python purists who will ask me to stick with Perl. But, even just for a challenge, I'd like to see something.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

后来的我们 2024-09-01 16:42:22

a = collections.defaultdict(lambda: collections.defaultdict(list))

a = collections.defaultdict(lambda: collections.defaultdict(list))

回复收藏 0 原文

云柯 2024-09-01 16:42:22

也许这可以解决您对字典中任意数量“维度”的需求：

a= collections.defaultdict(list)

代码中唯一的变化是：

a['x', 'y'].append('z')

当然，这个解决方案是您想要的解决方案取决于两个条件：

您是否需要使用例如'轻松访问所有列表x' 在“第一维度”中，
无论您是否坚持 Perl 神奇地让您更满意的方式:)

如果这两个条件中的任何一个为真，我的解决方案都不会帮助您。

Perhaps this solves your need for any number of “dimensions” in your dictionary:

a= collections.defaultdict(list)

the only change in your code being:

a['x', 'y'].append('z')

Of course, this solution being the one you want depends on two conditions:

whether you need to easily access all lists with e.g. 'x' in the “first dimension“
whether you are stuck to the way Perl magically pleases you more :)

If either of these two conditions is true, my solution won't help you.

回复收藏 0 原文

飘逸的'云 2024-09-01 16:42:22

由于我们事先不知道是否需要字典或列表，所以不能将自动生存与列表结合起来。除非根据链接问题中 Nosklo 的答案，将列表“功能”添加到基础字典中。基本上假设键的“排序”顺序，并始终将其与列表方法一起使用。我已经这样做了作为示例：

class AutoVivification(dict):
    """Implementation of perl's autovivification feature. Has features from both dicts and lists,
    dynamically generates new subitems as needed, and allows for working (somewhat) as a basic type.
    """
    def __getitem__(self, item):
        if isinstance(item, slice):
            d = AutoVivification()
            items = sorted(self.iteritems(), reverse=True)
            k,v = items.pop(0)
            while 1:
                if (item.start < k < item.stop):
                    d[k] = v
                elif k > item.stop:
                    break
                if item.step:
                    for x in range(item.step):
                        k,v = items.pop(0)
                else:
                    k,v = items.pop(0)
            return d
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value

    def __add__(self, other):
        """If attempting addition, use our length as the 'value'."""
        return len(self) + other

    def __radd__(self, other):
        """If the other type does not support addition with us, this addition method will be tried."""
        return len(self) + other

    def append(self, item):
        """Add the item to the dict, giving it a higher integer key than any currently in use."""
        largestKey = sorted(self.keys())[-1]
        if isinstance(largestKey, str):
            self.__setitem__(0, item)
        elif isinstance(largestKey, int):
            self.__setitem__(largestKey+1, item)

    def count(self, item):
        """Count the number of keys with the specified item."""
        return sum([1 for x in self.items() if x == item])

    def __eq__(self, other):
        """od.__eq__(y) <==> od==y. Comparison to another AV is order-sensitive
        while comparison to a regular mapping is order-insensitive. """
        if isinstance(other, AutoVivification):
            return len(self)==len(other) and self.items() == other.items()
        return dict.__eq__(self, other)

    def __ne__(self, other):
        """od.__ne__(y) <==> od!=y"""
        return not self == other

这遵循为无用密钥动态生成自身的基本自动生存功能。不过，它还实现了此处列出的一些方法。这使得它的行为就像一个准列表/字典的东西。

对于列表中的其余功能，请添加列出的方法。我将其视为具有列表方法的字典。如果调用列表方法，则它会对所保存的项目的顺序做出假设，即字符串排序低于整数，并且键始终按“排序”顺序。

它还支持添加，作为这些方法的示例。这来自我自己的用例。我需要从 AutoVivified 字典中添加项目，但如果它不存在，则会创建并返回一个新的 AutoVivification 对象。它们没有整数“值”，所以你不能这样做：

rp = AutoVivification()
rp['a']['b'] = 3
rp['a']['b'] + rp['q']

这违背了目的，因为我不知道是否会有某些东西存在，但无论如何我想要一个默认值。所以我向其中添加了 __add__ 和 __radd__ 方法。它们使用底层字典的长度作为整数值，因此新创建的 AV 对象的加法值为零。如果一个键中除了 AV 对象之外还有其他东西，那么我们会得到该东西的添加方法（如果实现了）。

Since we don't know beforehand if we need a dictionary, or a list, then you can't combine autovivification with lists. Unless, Working off of Nosklo's answer from the linked question, you add list "features" to the underlying dictionary. Basically assuming a "sort" order for keys, and always using it with the list methods. I've done this as an example:

class AutoVivification(dict):
    """Implementation of perl's autovivification feature. Has features from both dicts and lists,
    dynamically generates new subitems as needed, and allows for working (somewhat) as a basic type.
    """
    def __getitem__(self, item):
        if isinstance(item, slice):
            d = AutoVivification()
            items = sorted(self.iteritems(), reverse=True)
            k,v = items.pop(0)
            while 1:
                if (item.start < k < item.stop):
                    d[k] = v
                elif k > item.stop:
                    break
                if item.step:
                    for x in range(item.step):
                        k,v = items.pop(0)
                else:
                    k,v = items.pop(0)
            return d
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value

    def __add__(self, other):
        """If attempting addition, use our length as the 'value'."""
        return len(self) + other

    def __radd__(self, other):
        """If the other type does not support addition with us, this addition method will be tried."""
        return len(self) + other

    def append(self, item):
        """Add the item to the dict, giving it a higher integer key than any currently in use."""
        largestKey = sorted(self.keys())[-1]
        if isinstance(largestKey, str):
            self.__setitem__(0, item)
        elif isinstance(largestKey, int):
            self.__setitem__(largestKey+1, item)

    def count(self, item):
        """Count the number of keys with the specified item."""
        return sum([1 for x in self.items() if x == item])

    def __eq__(self, other):
        """od.__eq__(y) <==> od==y. Comparison to another AV is order-sensitive
        while comparison to a regular mapping is order-insensitive. """
        if isinstance(other, AutoVivification):
            return len(self)==len(other) and self.items() == other.items()
        return dict.__eq__(self, other)

    def __ne__(self, other):
        """od.__ne__(y) <==> od!=y"""
        return not self == other

This follows the basic autovivification feature of dynamically generating itself for dud keys. However, it also implements some of the methods listed here. This allows it to act like a quasi list/dict thing.

For the rest of the list features, add in the listed methods. I'm treating it as a dictionary with the list methods. If a list method is called, then it makes an assumption about the order of the items held, namely that strings sort lower than integers, and that keys are always in "sorted" order.

It also supports adding, as an example of these methods. This comes from my own use case. I needed to add items from a AutoVivified dictionary, but if it doesn't exist, a new AutoVivification object is created and returned. They have no integer "value" and so you can't do this:

rp = AutoVivification()
rp['a']['b'] = 3
rp['a']['b'] + rp['q']

That defeats the purpose, since I don't know if some thing is going to be there, but I want a default anyway. So I've added the __add__ and __radd__ methods to it. They use the length of the underlying dictionary as the integer value, so a newly created AV object has a value of zero for addition. If a key has something besides an AV object in it, then we get that thing's addition method, if implemented.

回复收藏 0 原文

软糖 2024-09-01 16:42:22

只是扩展 Ignacio 的答案，提供一些额外的选项，允许您从 Python 字典中明确请求更多神奇的行为。以这种方式编写的代码的可维护性仍然值得怀疑，但我想清楚地表明这是一个“这种数据结构可维护吗？”的问题。（我有疑问）而不是“Python 可以这样做吗？” （当然可以）。

为了支持命名空间方面的任意级别的嵌套，您所要做的就是命名函数（而不是使用 lambda）并使其自引用：

>>> from collections import defaultdict
>>> def autodict(): return defaultdict(autodict)
...
>>> a = autodict()
>>> a[1][2][3] = []
>>> type(a[1])
<class 'collections.defaultdict'>
>>> type(a[1][2])
<class 'collections.defaultdict'>
>>> type(a[1][2][3])
<class 'list'>
>>> a[1][2][3]
[]

但是，这引入了您必须在之前显式设置列表的“问题”您可以附加到它。 Python 的答案在于 setdefault 方法，该方法实际上比 collections.defaultdict 存在的时间更长：

>>> a.setdefault(3, []).append(10)
>>> a.setdefault(3, []).append(11)
>>> a[3]
[10, 11]
>>> a[2].setdefault(3, []).append(12)
>>> a[2].setdefault(3, []).append(13)
>>> a[2][3]
[12, 13]
>>> a[1][2].setdefault(3, []).append(14)
>>> a[1][2].setdefault(3, []).append(15)
>>> a[1][2][3]
[14, 15]

collections.defaultdict 真正做的就是 make在常见情况下，您总是将相同的第二个参数传递给 dict.setdefault 更容易使用。对于更复杂的情况，例如本例，您仍然可以直接使用 dict.setdefault 来处理 collections.defaultdict 无法处理的方面。

Just expanding on Ignacio's answer to present some additional options that allow you to explicitly request more magical behaviour from Python dictionaries. The maintainability of code written this way remains dubious, but I wanted to make it crystal clear that it is a question of "Is this kind of data structure maintainable?" (I have my doubts) not "Can Python be made to behave this way?" (it certainly can).

To support arbitrary levels of nesting for the namespace aspect, all you have to do is name the function (instead of using lambda) and make it self-referential:

>>> from collections import defaultdict
>>> def autodict(): return defaultdict(autodict)
...
>>> a = autodict()
>>> a[1][2][3] = []
>>> type(a[1])
<class 'collections.defaultdict'>
>>> type(a[1][2])
<class 'collections.defaultdict'>
>>> type(a[1][2][3])
<class 'list'>
>>> a[1][2][3]
[]

However, this introduces the "problem" that you have to set a list explicitly before you can append to it. Python's answer to that lies in the setdefault method, which has actually been around longer than collections.defaultdict:

>>> a.setdefault(3, []).append(10)
>>> a.setdefault(3, []).append(11)
>>> a[3]
[10, 11]
>>> a[2].setdefault(3, []).append(12)
>>> a[2].setdefault(3, []).append(13)
>>> a[2][3]
[12, 13]
>>> a[1][2].setdefault(3, []).append(14)
>>> a[1][2].setdefault(3, []).append(15)
>>> a[1][2][3]
[14, 15]

All collections.defaultdict really does is make the common case where you always pass the same second parameter to dict.setdefault much easier to use. For more complex cases, such as this one, you can still use dict.setdefault directly for the aspects that collections.defaultdict can't handle.

回复收藏 0 原文

一萌ing 2024-09-01 16:42:22

这是我的版本，我将其称为 ArrayHash。它的行为就像一个数组，除非您使用字符串作为键，然后它开始表现得像一个散列。它支持初始化、与 + 和 += 连接以及在 LHS 和 RHS 上切片。在数组模式下，如果引用或分配给 arrayhash[N]，它将用 None 值填充 0..N-1。在哈希模式下，它会停止这样做。它以 True 回答 isinstance(arrayhash, dict) ，即使处于哈希模式， isinstance(arrayhash, collections.abc.Sequence) 也给出 True 。我很想得到您的反馈！

from collections import defaultdict
import collections.abc

class _ArrayHash(defaultdict, collections.abc.Sequence):
    def append(self, value):
        self[len(self)] = value

    def extend(self, lst):
        ln = len(self)
        for item in lst:
            self[ln] = item
            ln += 1

    def update(self, values):
        self.isHash = True
        for k, v in values.items():
            self[k] = v

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self[i] for i in range(*index.indices(len(self)))]
        elif isinstance(index, int):
            if index < 0:
                index += len(self)
            return super().__getitem__(index)
        else:
            self.isHash = True
            return super().__getitem__(index)

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            try:
                if not hasattr(self, 'isHash'):
                    for i in range(len(self), index.start):
                        super().__setitem__(i, None)
                value = iter(value)
                ndx = index.start
                for i in range(*index.indices(len(self))):
                    super().__setitem__(i, next(value))
                    ndx += 1
                rest = list(value)
                lr = len(rest)
                if lr:
                    for i in range(len(self)-1,ndx-1,-1):  # Move everything else up
                        super().__setitem__(i+lr, super().__getitem__(i))
                for i in range(lr):
                    super().__setitem__(i+ndx, rest[i])
            except StopIteration:
                pass
        elif isinstance(index, int):
            if index < 0:
                index += len(self)
            if not hasattr(self, 'isHash'):
                for i in range(len(self), index):
                    super().__setitem__(i, None)
            super().__setitem__(index, value)
        else:
            self.isHash = True
            super().__setitem__(index, value)

    def __iter__(self):
        if hasattr(self, 'isHash'):
            for i in self.keys():
                yield i
        else:
            for i in range(len(self)):
                yield self[i]

    def __add__(self, other):
        result = ArrayHash(self)
        if hasattr(self, 'isHash') or (isinstance(other, dict) and not isinstance(other, _ArrayHash)) or hasattr(other, 'isHash'):
            result.update(other)
        else:
            result.extend(other)
        return result

    def __iadd__(self, other):
        if hasattr(self, 'isHash') or (isinstance(other, dict) and not isinstance(other, _ArrayHash)) or hasattr(other, 'isHash'):
            self.update(other)
        else:
            self.extend(other)
        return self

    def __radd__(self, other):
        result = ArrayHash()
        if hasattr(self, 'isHash') or (isinstance(other, dict) and not isinstance(other, _ArrayHash)) or hasattr(other, 'isHash'):
            result.update(other)
            result.update(self)
        else:
            result.extend(other)
            result.extend(self)
        return result


def ArrayHash(init=None):
    """Acts like an array with autovivification, unless you use a string as a key, then it becomes a hash"""
    result = _ArrayHash(ArrayHash)
    if init is not None:
        if isinstance(init, collections.abc.Sequence) and not isinstance(init, str):
            result.extend(init)
        elif isinstance(init, _ArrayHash):
            if(hasattr(init, 'isHash')):
                result.update(init)
            else:
                result.extend(init)
        elif isinstance(init, dict):
            result.update(init)
        else:
            result.append(init)
    return result

Here is my version, which I call ArrayHash. It behaves like an array unless you use a string as a key, then it starts behaving like a hash. It supports initialization, concat with + and +=, and slicing both on the LHS and RHS. In array mode, if you reference or assign to arrayhash[N], it fills in from 0..N-1 with None values. In hash mode it stops doing that. It answers isinstance(arrayhash, dict) with True, and isinstance(arrayhash, collections.abc.Sequence) gives True even if it's in hash mode. I'd love to get your feedback!

from collections import defaultdict
import collections.abc

class _ArrayHash(defaultdict, collections.abc.Sequence):
    def append(self, value):
        self[len(self)] = value

    def extend(self, lst):
        ln = len(self)
        for item in lst:
            self[ln] = item
            ln += 1

    def update(self, values):
        self.isHash = True
        for k, v in values.items():
            self[k] = v

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self[i] for i in range(*index.indices(len(self)))]
        elif isinstance(index, int):
            if index < 0:
                index += len(self)
            return super().__getitem__(index)
        else:
            self.isHash = True
            return super().__getitem__(index)

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            try:
                if not hasattr(self, 'isHash'):
                    for i in range(len(self), index.start):
                        super().__setitem__(i, None)
                value = iter(value)
                ndx = index.start
                for i in range(*index.indices(len(self))):
                    super().__setitem__(i, next(value))
                    ndx += 1
                rest = list(value)
                lr = len(rest)
                if lr:
                    for i in range(len(self)-1,ndx-1,-1):  # Move everything else up
                        super().__setitem__(i+lr, super().__getitem__(i))
                for i in range(lr):
                    super().__setitem__(i+ndx, rest[i])
            except StopIteration:
                pass
        elif isinstance(index, int):
            if index < 0:
                index += len(self)
            if not hasattr(self, 'isHash'):
                for i in range(len(self), index):
                    super().__setitem__(i, None)
            super().__setitem__(index, value)
        else:
            self.isHash = True
            super().__setitem__(index, value)

    def __iter__(self):
        if hasattr(self, 'isHash'):
            for i in self.keys():
                yield i
        else:
            for i in range(len(self)):
                yield self[i]

    def __add__(self, other):
        result = ArrayHash(self)
        if hasattr(self, 'isHash') or (isinstance(other, dict) and not isinstance(other, _ArrayHash)) or hasattr(other, 'isHash'):
            result.update(other)
        else:
            result.extend(other)
        return result

    def __iadd__(self, other):
        if hasattr(self, 'isHash') or (isinstance(other, dict) and not isinstance(other, _ArrayHash)) or hasattr(other, 'isHash'):
            self.update(other)
        else:
            self.extend(other)
        return self

    def __radd__(self, other):
        result = ArrayHash()
        if hasattr(self, 'isHash') or (isinstance(other, dict) and not isinstance(other, _ArrayHash)) or hasattr(other, 'isHash'):
            result.update(other)
            result.update(self)
        else:
            result.extend(other)
            result.extend(self)
        return result


def ArrayHash(init=None):
    """Acts like an array with autovivification, unless you use a string as a key, then it becomes a hash"""
    result = _ArrayHash(ArrayHash)
    if init is not None:
        if isinstance(init, collections.abc.Sequence) and not isinstance(init, str):
            result.extend(init)
        elif isinstance(init, _ArrayHash):
            if(hasattr(init, 'isHash')):
                result.update(init)
            else:
                result.extend(init)
        elif isinstance(init, dict):
            result.update(init)
        else:
            result.append(init)
    return result

回复收藏 0 原文

~没有更多了~