Nested dictionaries in Python that implicitly create missing intermediate containers?

Posted on 2024-09-28 12:42:22

I want to create a polymorphic structure that can be created on the fly with minimum typing effort and be very readable. For example:

a.b = 1
a.c.d = 2
a.c.e = 3
a.f.g.a.b.c.d = cucu
a.aaa = bau

I do not want to create an intermediate container such as:

a.c = subobject()
a.c.d = 2
a.c.e = 3

My question is similar to this one:

What is the best way to implement nested dictionaries?

But I am not happy with the solution there because I think it has a bug:
items are created even when you don't want them. Suppose you want to compare two polymorphic structures: any attribute that exists in the first and is merely checked in the second gets created in the second. For example:

a = {1:2, 3: 4}
b = {5:6}

# now compare them:

if b[1] == a[1]:
    # whoops, we just created b[1] = {} !
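
To make the complaint concrete, here is a minimal Python 3 sketch of the usual autovivifying dictionary built on collections.defaultdict. The helper name `tree` is hypothetical and not taken from the linked question. It shows that reading a missing key by indexing inserts it, while `in` and `.get()` leave the structure untouched.

```python
from collections import defaultdict

def tree():
    """Return a dict whose missing keys spring into existence as nested trees."""
    return defaultdict(tree)

a = tree()
a[1] = 2
a[3] = 4

b = tree()
b[5] = 6

# Indexing a missing key calls the default factory and stores the new subtree.
equal = (b[1] == a[1])
keys_after_index = sorted(b.keys())  # key 1 now exists in b

# Membership tests and .get() do not trigger the default factory.
c = tree()
c[5] = 6
has_one = 1 in c
got = c.get(1)
keys_after_get = sorted(c.keys())    # still only key 5
```

So a comparison written with `in` or `.get()` avoids the side effect, at the cost of losing the short `b[1]` notation.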

I also want to get the simplest possible notation:

a.b.c.d = 1
    # neat
a[b][c][d] = 1
    # yuck

I did try to derive from the object class, but I couldn't avoid the same bug as above, where attributes are born just by trying to read them: a simple dir() tries to create attributes like "__methods__". This example is obviously broken:

class KeyList(object):
    def __setattr__(self, name, value):
        print "__setattr__ Name:", name, "value:", value
        object.__setattr__(self, name, value)
    def __getattribute__(self, name):
        print "__getattribute__ called for:", name
        return object.__getattribute__(self, name)
    def __getattr__(self, name):
        print "__getattr__ Name:", name
        try:
            ret = object.__getattribute__(self, name)
        except AttributeError:
            print "__getattr__ not found, creating..."
            object.__setattr__(self, name, KeyList())
            ret = object.__getattribute__(self, name)
        return ret

>>> cucu = KeyList()
>>> dir(cucu)
__getattribute__ called for: __dict__
__getattribute__ called for: __members__
__getattr__ Name: __members__
__getattr__ not found, creating...
__getattribute__ called for: __methods__
__getattr__ Name: __methods__
__getattr__ not found, creating...
__getattribute__ called for: __class__

Thanks, really!

p.s.: the best solution I found so far is:

class KeyList(dict):
    def keylset(self, path, value):
        attr = self
        path_elements = path.split('.')
        for i in path_elements[:-1]:
            try:
                attr = attr[i]
            except KeyError:
                attr[i] = KeyList()
                attr = attr[i]
        attr[path_elements[-1]] = value

# test
>>> a = KeyList()
>>> a.keylset("a.b.d.e", "ferfr")
>>> a.keylset("a.b.d", {})
>>> a
{'a': {'b': {'d': {}}}}

# shallow copy
>>> import copy
>>> b = copy.copy(a)
>>> b
{'a': {'b': {'d': {}}}}
>>> b.keylset("a.b.d", 3)
>>> b
{'a': {'b': {'d': 3}}}
>>> a
{'a': {'b': {'d': 3}}}

# deep copy
>>> a.keylset("a.b.d", 2)
>>> a
{'a': {'b': {'d': 2}}}
>>> b
{'a': {'b': {'d': 2}}}
>>> b = copy.deepcopy(a)
>>> b.keylset("a.b.d", 4)
>>> b
{'a': {'b': {'d': 4}}}
>>> a
{'a': {'b': {'d': 2}}}
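
One way to address the read-side problem with this approach is a read-only companion to keylset. The sketch below is written for Python 3; the method name `keylget` and its `default` parameter are assumptions for illustration, not part of the original code. It walks the dotted path and returns a default instead of creating anything.

```python
class KeyList(dict):
    def keylset(self, path, value):
        """Set a value at a dotted path, creating intermediate KeyLists."""
        node = self
        *parents, last = path.split('.')
        for key in parents:
            if key not in node:
                node[key] = KeyList()
            node = node[key]
        node[last] = value

    def keylget(self, path, default=None):
        """Hypothetical read-only lookup: never creates containers."""
        node = self
        for key in path.split('.'):
            if not isinstance(node, dict) or key not in node:
                return default
            node = node[key]
        return node

a = KeyList()
a.keylset("a.b.d", 2)
found = a.keylget("a.b.d")
missing = a.keylget("a.x.y", default="n/a")
```

After the failed lookup, `a` still contains only the path that was set explicitly.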

Comments (2)

尘世孤行 2024-10-05 12:42:22

If you're looking for something that's not as dynamic as your original post, but more like your best solution so far, you might check whether the variabledecode module in Ian Bicking's formencode meets your needs. The package itself is intended for web forms and validation, but a few of its functions seem pretty close to what you're looking for.
If nothing else, it might serve as an example for your own implementation.

A small example:

>>> from formencode.variabledecode import variable_decode, variable_encode
>>>
>>> d={'a.b.c.d.e': 1}
>>> variable_decode(d)
{'a': {'b': {'c': {'d': {'e': 1}}}}}
>>>
>>> d['a.b.x'] = 3
>>> variable_decode(d)
{'a': {'b': {'c': {'d': {'e': 1}}, 'x': 3}}}
>>>
>>> d2 = variable_decode(d)
>>> variable_encode(d2) == d
True
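
If pulling in formencode is not an option, the same idea can be sketched in a few lines of plain Python 3. This is not formencode's implementation: the function names `decode_dotted` and `encode_dotted` are hypothetical, only dict nesting is handled, and the keys are assumed not to conflict (for example, 'a.b' and 'a.b.c' in the same input).

```python
def decode_dotted(flat):
    """Expand {'a.b': 1} into {'a': {'b': 1}} (dict nesting only)."""
    nested = {}
    for dotted_key, value in flat.items():
        node = nested
        *parents, last = dotted_key.split('.')
        for key in parents:
            # setdefault creates the intermediate dict only while decoding.
            node = node.setdefault(key, {})
        node[last] = value
    return nested

def encode_dotted(nested, prefix=''):
    """Flatten nested dicts back into dotted keys (empty dicts are dropped)."""
    flat = {}
    for key, value in nested.items():
        full_key = prefix + key
        if isinstance(value, dict):
            flat.update(encode_dotted(value, full_key + '.'))
        else:
            flat[full_key] = value
    return flat

d = {'a.b.c.d.e': 1, 'a.b.x': 3}
decoded = decode_dotted(d)
round_trip = encode_dotted(decoded)
```

The decoded result is an ordinary nested dict, so reading from it never creates keys.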
┼── 2024-10-05 12:42:22

I think at a minimum you need to check in __getattr__ that the requested attribute doesn't start and end with __. Attributes that match that description implement established Python APIs, so you shouldn't be instantiating them. Even so, you'll still end up creating some API attributes, such as next (the iterator method in Python 2; Python 3 uses __next__). In that case an exception is thrown if you pass the object to some function that uses duck typing to see whether it's an iterator.

It would really be better to create a "whitelist" of valid attribute names, either as a literal set or with a simple formula: e.g. name.isalpha() and len(name) == 1 would work for the one-letter attributes you're using in the example. For a more realistic implementation you'd probably want to define a set of names appropriate to the domain your code is working in.
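
The two checks described so far can be sketched as follows, in Python 3. The class name `Node` and the helper `_allowed` are hypothetical; the whitelist uses the one-letter formula from the paragraph above, which is only an assumption suited to the question's example.

```python
class Node:
    """Attribute container that auto-creates children only for allowed names."""

    @staticmethod
    def _allowed(name):
        # Assumed whitelist for this sketch: single alphabetic letters only.
        return name.isalpha() and len(name) == 1

    def __getattr__(self, name):
        # __getattr__ runs only after normal attribute lookup has failed.
        is_dunder = name.startswith('__') and name.endswith('__')
        if is_dunder or not self._allowed(name):
            raise AttributeError(name)
        child = Node()
        object.__setattr__(self, name, child)
        return child

a = Node()
a.c.d = 2            # 'c' is created on the fly, then 'd' is assigned
a.f.g.h = "cucu"     # 'f' and 'g' are created, 'h' is assigned

probe_iter = hasattr(a, '__iter__')  # rejected by the dunder check
probe_next = hasattr(a, 'next')      # rejected by the whitelist
```

Assignment is left unrestricted here, so any name can still be set explicitly; only implicit creation on read is limited.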

I guess the alternative is to make sure that you're not dynamically creating any of the various attribute names that are part of some protocol, as next is part of the iteration protocol. The methods of the ABCs in the collections module comprise a partial list, but I don't know where to find a full one.

You are also going to have to keep track of whether or not the object has created any such child nodes, so that you will know how to do comparisons against other such objects.

If you want comparisons to avoid autovivification, you will have to implement a __cmp__ method, or rich comparison methods, in the class that checks the __dict__s of the objects being compared.

I have a sneaking feeling that there are a few complications I didn't think of, which wouldn't be surprising since this is not really how Python is supposed to work. Tread carefully, and think about whether the added complexity of this approach is worth what it will get you.
