How to find the source of an error in Python pickle on a massive object

Posted on 2024-11-19 03:28:31 · 411 words · 1 view · 0 comments

I've taken over somebody's code for a fairly large project. I'm trying to save program state, and there's one massive object which stores pretty much all the other objects. I'm trying to pickle this object, but I get this error:

pickle.PicklingError: Can't pickle <type 'module'>: it's not found as __builtin__.module

From what I can find on Google, this is because somewhere I'm importing something outside of Python's __init__, or because a class attribute is referencing a module. So, I've got two questions:

  1. Can anybody confirm that that's why this error is being given? Am I looking for the right things in my code?

  2. Is there a way to find what line of code/object member is causing the difficulties in pickle? The traceback only gives the line in pickle where the error occurs, not the line of the object being pickled.

Comments (4)

梦里兽 2024-11-26 03:28:31

2) You can subclass pickle.Pickler and override its save method so that it logs every object it pickles. This should make it easier to trace where the problem is.

# Python 2.x
import pickle
class MyPickler(pickle.Pickler):
    def save(self, obj):
        # Log each object just before it is pickled.
        print 'pickling object', obj, 'of type', type(obj)
        pickle.Pickler.save(self, obj)

This only works with the pure-Python implementation of pickle.Pickler. In Python 3.x, the pickle module uses the C implementation by default, and the pure-Python version of Pickler is called _Pickler.

# Python 3.x
import pickle
class MyPickler(pickle._Pickler):
    def save(self, obj, save_persistent_id=True):
        # Log each object just before it is pickled.
        print('pickling object {0} of type {1}'.format(obj, type(obj)))
        pickle._Pickler.save(self, obj, save_persistent_id)
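To go one step further than logging, the same subclassing idea can record the chain of objects that were being pickled when the failure happened. The following is only a sketch for Python 3: it relies on pickle._Pickler, which is a private CPython name, and the class name TracingPickler, the helper find_failure, and the sample data are all made up for illustration.

```python
import io
import pickle
import types


class TracingPickler(pickle._Pickler):
    """Pure-Python pickler that remembers the chain of objects being saved."""

    def __init__(self, file, protocol=None):
        super().__init__(file, protocol)
        self.stack = []          # objects currently being pickled, outermost first
        self.failed_path = None  # type names along the path to the first failure

    def save(self, obj, save_persistent_id=True):
        self.stack.append(obj)
        try:
            super().save(obj, save_persistent_id)
        except Exception:
            # Only the innermost failing call records the path.
            if self.failed_path is None:
                self.failed_path = [type(o).__name__ for o in self.stack]
            raise
        finally:
            self.stack.pop()


def find_failure(obj):
    """Return the path of type names to the failing object, or None."""
    pickler = TracingPickler(io.BytesIO())
    try:
        pickler.dump(obj)
    except Exception:
        return pickler.failed_path
    return None


# Hypothetical stand-in for the "massive object": a nested structure
# that holds the module *type* somewhere deep inside.
state = {"config": {"loader": types.ModuleType}}
print(find_failure(state))
```

The last entry of the returned list is the type of the innermost object that could not be pickled; replacing type(o).__name__ with a truncated repr(o) would give more detail at the cost of noisier output.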
城歌 2024-11-26 03:28:31

Something like this exists in dill. Let's look at a list of objects, and see what we can do:

>>> import dill
>>> f = open('whatever', 'w')
>>> f.close()
>>> 
>>> l = [iter([1,2,3]), xrange(5), open('whatever', 'r'), lambda x:x]
>>> dill.detect.trace(False)
>>> dill.pickles(l)
False

Ok, dill fails to pickle the list. So what's the problem?

>>> dill.detect.trace(True)
>>> dill.pickles(l)
T4: <type 'listiterator'>
False

Ok, the first item in the list fails to pickle. What about the rest?

>>> map(dill.pickles, l)
T4: <type 'listiterator'>
Si: xrange(5)
F2: <function _eval_repr at 0x106991cf8>
Fi: <open file 'whatever', mode 'r' at 0x10699c810>
F2: <function _create_filehandle at 0x106991848>
B2: <built-in function open>
F1: <function <lambda> at 0x1069f6848>
F2: <function _create_function at 0x1069916e0>
Co: <code object <lambda> at 0x105a0acb0, file "<stdin>", line 1>
F2: <function _unmarshal at 0x106991578>
D1: <dict object at 0x10591d168>
D2: <dict object at 0x1069b1050>
[False, True, True, True]

Hm. The other objects pickle just fine. So, let's replace the first object.

>>> dill.detect.trace(False)
>>> l[0] = xrange(1,4)
>>> dill.pickles(l)
True
>>> _l = dill.loads(dill.dumps(l))

Now our object pickles. However, we could be benefiting from some built-in object sharing that happens when pickling on Linux/Unix/Mac, so what about a stronger check, such as actually pickling across a subprocess (as happens on Windows)?

>>> dill.check(l)        
[xrange(1, 4), xrange(5), <open file 'whatever', mode 'r' at 0x107998810>, <function <lambda> at 0x1079ec410>]
>>> 

Nope, the list still works… so this is an object that could be sent to another process successfully.

Now, with regard to your error, which everyone seemed to ignore…

The ModuleType object is not pickleable, and that's causing your error.

>>> import types
>>> types.ModuleType 
<type 'module'>
>>>
>>> import pickle
>>> pickle.dumps(types.ModuleType)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 748, in save_global
    (obj, module, name))
pickle.PicklingError: Can't pickle <type 'module'>: it's not found as __builtin__.module

However, if we import dill, it magically works.

>>> import dill
>>> pickle.dumps(types.ModuleType)
"cdill.dill\n_load_type\np0\n(S'ModuleType'\np1\ntp2\nRp3\n."
>>> 
与之呼应 2024-11-26 03:28:31

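As a quick-and-dirty way to find which attribute/member of the object is causing the problem, you could try the following (Python 2). The last name printed before the traceback is the attribute that failed to pickle: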

for k, v in massiveobject.__dict__.iteritems():
    print k
    pickle.dumps(v)
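The same idea can be sketched for Python 3 so that it keeps going after the first failure and collects every offending attribute. The function name unpicklable_attributes and the sample object are hypothetical and used only for illustration.

```python
import pickle
import types


def unpicklable_attributes(obj):
    """Map each attribute name of obj that fails to pickle to its exception."""
    failures = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle raises several exception types
            failures[name] = exc
    return failures


# Hypothetical stand-in for the massive object.
massiveobject = types.SimpleNamespace(
    numbers=[1, 2, 3],
    helper=types.ModuleType,  # the module type, which pickle rejects
)

for name, exc in unpicklable_attributes(massiveobject).items():
    print(name, "->", type(exc).__name__)
```

This only inspects the top level of the object; applying it again to a failing attribute narrows the search one level at a time.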
ゞ记忆︶ㄣ 2024-11-26 03:28:31

1) There's a slight difference from what you've found. This is a problem caused by some variable (class attribute, list or dict item, it could be anything) that is referencing the module type (not a module directly). This code should reproduce the issue:

import pickle
pickle.dumps(type(pickle))
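For reference, here is a small Python 3 sketch of the same reproduction. It assumes nothing beyond the standard library; the helper name pickling_error is made up. In Python 3 the wording of the message differs from the Python 2 one quoted in the question, so the sketch only reports the exception type instead of asserting a specific message.

```python
import pickle


def pickling_error(obj):
    """Return the exception raised by pickle.dumps(obj), or None on success."""
    try:
        pickle.dumps(obj)
    except Exception as exc:
        return exc
    return None


# The module type (what the answer above describes) cannot be pickled ...
print(type(pickling_error(type(pickle))).__name__)

# ... and neither can a module object itself.
print(type(pickling_error(pickle)).__name__)

# Ordinary data pickles without error.
print(pickling_error([1, 2, 3]))
```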