How does automatic full qualification of class names work in Python? [relevant to object pickling]

Posted on 2024-11-07 12:21:55

(You can skip the introduction and jump directly to the question further down.)

There is a common difficulty with pickling Python objects from user-defined classes:

# This is program dumper.py
import pickle

class C(object):
    pass

with open('obj.pickle', 'wb') as f:
    pickle.dump(C(), f)

Trying to load the object back from another program, loader.py, with

# This is program loader.py
import pickle

with open('obj.pickle', 'rb') as f:
    obj = pickle.load(f)

results in

AttributeError: 'module' object has no attribute 'C'

The reason is that the class is pickled by name ("C"), and the loader.py program does not know anything about C. A common solution is to import the class in loader.py:

import pickle
from dumper import C  # Objects of class C can now be unpickled

with open('obj.pickle', 'rb') as f:
    obj = pickle.load(f)

However, this solution has a few drawbacks, including the fact that all the classes referenced by the pickled objects have to be imported (there can be many); furthermore, the local namespace becomes polluted by names from the dumper.py program.

Now, a solution to this consists of fully qualifying objects prior to pickling:

# New dumper.py program:
import pickle
import dumper  # This is this very program!

class C(object):
    pass

with open('obj.pickle', 'wb') as f:
    pickle.dump(dumper.C(), f)  # Fully qualified class

Unpickling with the original loader.py program above now works directly (no need to do from dumper import C).

Question: Now, other classes from dumper.py seem to be automatically fully qualified upon pickling, and I would love to know how this works, and whether this is a reliable, documented behavior:

import pickle
import dumper  # This is this very program!

class D(object):  # New class!
    pass

class C(object):
    def __init__(self):
        self.d = D()  # *NOT* fully qualified

with open('obj.pickle', 'wb') as f:
    pickle.dump(dumper.C(), f)  # Fully qualified pickle class

Now, unpickling with the original loader.py program also works (no need to do from dumper import C); print obj.d gives a fully qualified class, which I find surprising:

<dumper.D object at 0x122e130>

This behavior is very convenient, since only the top-level pickled object has to be fully qualified with the module name (dumper.C()). But is this behavior reliable and documented? How come classes are pickled by name ("D"), yet unpickling decides that the pickled self.d attribute is of class dumper.D (and not of some local D class)?

PS: The question, refined: I just noticed a few interesting details that might point to an answer to this question:

In the pickling program dumper.py, print self.d prints <__main__.D object at 0x2af450>, with the first dumper.py program (the one without import dumper). On the other hand, doing import dumper and creating the object with dumper.C() in dumper.py makes print self.d print <dumper.D object at 0x2af450>: the self.d attribute is automatically qualified by Python! So, it appears that the pickle module has no role in the nice unpickling behavior described above.

The question is thus really: why does Python convert D() into the fully qualified dumper.D in the second case? Is this documented somewhere?

神回复 2024-11-14 12:21:55

When your classes are defined in your main module, that's where pickle expects to find them when they are unpickled. In your first case, the classes were defined in the main module, so when loader runs, loader is the main module and pickle can't find the classes. If you look at the content of obj.pickle, you'll see the name __main__ recorded as the namespace of your C and D classes.
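One way to look inside a pickle is the standard pickletools module. The sketch below is only an illustration: the helper name recorded_globals is hypothetical, and fractions.Fraction stands in for the question's classes so that the snippet runs on its own. It also assumes a pickle protocol of 3 or lower (as produced by the Python 2 programs in the question), where class references are written as GLOBAL opcodes; newer protocols use a different opcode that this helper does not handle.

```python
import pickle
import pickletools
from fractions import Fraction


def recorded_globals(payload):
    """Return the "module name" strings stored by GLOBAL opcodes in a pickle.

    Hypothetical helper: it only understands protocols 0 to 3.
    """
    return [
        arg
        for opcode, arg, _pos in pickletools.genops(payload)
        if opcode.name == "GLOBAL"
    ]


# Protocol 2 is forced so the class reference is written as a GLOBAL opcode.
payload = pickle.dumps(Fraction(1, 2), protocol=2)
refs = recorded_globals(payload)
print(refs)  # each entry has the form "module name"
```

Applied to the bytes of an obj.pickle written by the first dumper.py, the same helper would be expected to show entries in the __main__ namespace instead.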

In your second case, dumper.py imports itself. Now you actually have two separate sets of C and D classes defined: one set in the __main__ namespace and one set in the dumper namespace. You serialize the one in the dumper namespace (look in obj.pickle to verify).
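The "two separate sets of classes" can be reproduced without a self-importing script. The following sketch is an illustration under assumptions: it writes a throwaway file named dumper_demo.py (a hypothetical name) to a temporary directory, runs it once as a script with runpy and once as a regular import, and compares the resulting classes.

```python
import importlib
import os
import runpy
import sys
import tempfile

MODULE_NAME = "dumper_demo"  # hypothetical stand-in for dumper.py
SOURCE = '''
class D(object):
    pass

class C(object):
    def __init__(self):
        self.d = D()
'''

# Write the stand-in file and make its directory importable.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, MODULE_NAME + ".py")
with open(path, "w") as f:
    f.write(SOURCE)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# Execute the file as a script (its __name__ is "__main__") ...
script_namespace = runpy.run_path(path, run_name="__main__")
# ... and import the very same file as a module (its __name__ is "dumper_demo").
module = importlib.import_module(MODULE_NAME)

main_C = script_namespace["C"]
module_C = module.C
nested_module = type(module_C().d).__module__

print(main_C.__module__)    # namespace of the script's copy of C
print(module_C.__module__)  # namespace of the imported copy of C
print(main_C is module_C)   # the two copies are distinct class objects
print(nested_module)        # D() created inside the imported C
```

The last line is the point of interest for the question: D() is written without any qualification, yet the instance created by the imported C belongs to the imported module's D.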

pickle will attempt to dynamically import a namespace if it is not found, so when loader.py runs, pickle itself imports dumper.py and with it the dumper.C and dumper.D classes.
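The dynamic import can be observed in a single process. This is a rough sketch, not the poster's code: common_demo is a hypothetical module written to a temporary directory, and deleting it from sys.modules is used to imitate a fresh loader program that never imported it.

```python
import importlib
import os
import pickle
import sys
import tempfile

MODULE_NAME = "common_demo"  # hypothetical shared module
SOURCE = '''
class D(object):
    pass

class C(object):
    def __init__(self):
        self.d = D()
'''

# Create the module file and make its directory importable.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, MODULE_NAME + ".py"), "w") as f:
    f.write(SOURCE)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# "Dumper" side: import the module and pickle an instance of its class C.
common_demo = importlib.import_module(MODULE_NAME)
payload = pickle.dumps(common_demo.C())

# Imitate a separate loader program in which the module is not loaded yet.
del sys.modules[MODULE_NAME]
loaded_before = MODULE_NAME in sys.modules

# "Loader" side: no explicit import, pickle imports the module on its own.
obj = pickle.loads(payload)
loaded_after = MODULE_NAME in sys.modules

print(loaded_before, loaded_after)
print(type(obj).__module__, type(obj.d).__module__)
```

Note that the import only succeeds because the module's directory is on sys.path; a loader program running elsewhere would need the module to be importable in the same way.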

Since you have two separate scripts, dumper.py and loader.py, it only makes sense to define the classes they share in a common import module:

common.py

class D(object):
    pass

class C(object):
    def __init__(self):
        self.d = D()

loader.py

import pickle

with open('obj.pickle','rb') as f:
    obj = pickle.load(f)

print(obj)

dumper.py

import pickle
from common import C

with open('obj.pickle','wb') as f:
    pickle.dump(C(),f)

Note that even though dumper.py dumps C() in this case, pickle knows that it is a common.C object (see obj.pickle). When loader.py runs, it will dynamically import common.py and successfully load the object.

忘你却要生生世世 2024-11-14 12:21:55

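Here is what happens: when dumper is imported (or from dumper import C is executed) from within dumper.py, the whole program is parsed and executed again (this can be seen by inserting a print in the module). This behavior is expected, because dumper is not an already loaded module (__main__, however, is considered loaded): it is not in sys.modules.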
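The double execution can be made visible with a tiny self-importing script. The sketch below is an assumption-laden illustration: selfimport_demo.py is a hypothetical file written to a temporary directory, and it is run in a child interpreter so that it really is the main program.

```python
import os
import subprocess
import sys
import tempfile

SCRIPT_NAME = "selfimport_demo"  # hypothetical self-importing script
SOURCE = (
    "print('executing as ' + __name__)\n"
    "import selfimport_demo\n"
)

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, SCRIPT_NAME + ".py")
with open(path, "w") as f:
    f.write(SOURCE)

# Run the file as a main program; its own directory is put on sys.path,
# so the import of itself finds the same file.
output = subprocess.check_output([sys.executable, path], universal_newlines=True)
lines = output.splitlines()
print(lines)
```

The file body runs once under the name __main__ and once more under its module name; the import statement reached during the second run finds the module already registered in sys.modules, so there is no third run.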

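As Mark's answer illustrates, importing a module naturally qualifies all the names defined in it, so self.d = D() is interpreted as being of class dumper.D when the file dumper.py is re-evaluated (this is equivalent to parsing common.py in Mark's answer).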
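The mechanism behind this qualification is that a class records, at definition time, the __name__ of the namespace it is defined in. A minimal sketch, assuming nothing beyond the built-in exec: the same source is executed in two hypothetical namespaces whose __name__ values imitate a script run and an import of dumper.

```python
SOURCE = '''
class D(object):
    pass

class C(object):
    def __init__(self):
        self.d = D()  # not qualified in the source
'''

# Two illustrative namespaces: one named like a script, one like a module.
as_script = {"__name__": "__main__"}
as_module = {"__name__": "dumper"}
exec(SOURCE, as_script)
exec(SOURCE, as_module)

script_d = as_script["C"]().d
module_d = as_module["C"]().d

print(type(script_d).__module__)
print(type(module_d).__module__)
print(repr(module_d).startswith("<dumper.D object at"))
```

Nothing rewrites D() into dumper.D; the name D is simply looked up in the namespace where C was defined, and the class found there already carries that namespace's name in its __module__ attribute.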

This explains the import dumper (or from dumper import C) trick: pickling fully qualifies not only class C but also class D. This makes unpickling from an external program easier!

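This also shows that import dumper, executed inside dumper.py, forces the Python interpreter to parse the program twice, which is neither efficient nor elegant. Pickling classes in one program and unpickling them in another is therefore probably best done with the approach outlined in Mark's answer: pickled classes should live in a separate module.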