Python: can't pickle module objects error

Posted on 2024-08-31 21:53:06 · 173 words · 6 views · 0 comments

I'm trying to pickle a big class and am getting

TypeError: can't pickle module objects

Despite looking around the web, I can't figure out exactly what this means, and I'm not sure which module object is causing the trouble. Is there a way to find the culprit? The stack trace doesn't seem to indicate anything.

Comments (5)

野心澎湃 2024-09-07 21:53:06

Python's inability to pickle module objects is the real problem. Is there a good reason? I don't think so. Having module objects unpicklable contributes to the frailty of Python as a parallel/asynchronous language. If you want to pickle module objects, or almost anything in Python, then use dill.

Python 3.2.5 (default, May 19 2013, 14:25:55) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> import os
>>> dill.dumps(os)
b'\x80\x03cdill.dill\n_import_module\nq\x00X\x02\x00\x00\x00osq\x01\x85q\x02Rq\x03.'
>>>
>>>
>>> # and for parlor tricks...
>>> class Foo(object):
...   x = 100
...   def __call__(self, f):
...     def bar(y):
...       return f(self.x) + y
...     return bar
... 
>>> @Foo()
... def do_thing(x):
...   return x
... 
>>> do_thing(3)
103 
>>> dill.loads(dill.dumps(do_thing))(3)
103
>>> 

Get dill here: https://github.com/uqfoundation/dill

农村范ル 2024-09-07 21:53:06

I can reproduce the error message this way (Python 2, as the use of cPickle and file() indicates):

import cPickle

class Foo(object):
    def __init__(self):
        self.mod=cPickle

foo=Foo()
with file('/tmp/test.out', 'w') as f:
    cPickle.dump(foo, f) 

# TypeError: can't pickle module objects

Do you have an attribute (like self.mod above) that references a module?
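For readers on Python 3, the same situation can be reproduced with the standard pickle module. This is only a sketch: json is an arbitrary stand-in for whatever module ends up stored on the instance, and the exact wording of the message differs between Python versions.

```python
import pickle
import json  # arbitrary stand-in; any module object behaves the same way


class Foo:
    def __init__(self):
        # Storing a module on the instance places it in __dict__,
        # which pickle then tries to serialize along with the object.
        self.mod = json


try:
    pickle.dumps(Foo())
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The message names only the type that failed (a module), not the attribute that holds it, which is why the traceback is of little help in locating the culprit.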

∞觅青森が 2024-09-07 21:53:06

Recursively Find Pickle Failure

Inspired by wump's comment:
Python: can't pickle module objects error

Here is some quick code that helped me find the culprit recursively.

It checks whether the object in question fails to pickle.

If it does, it iterates over the values in the object's __dict__, tries to pickle each one recursively, and returns only the ones that fail.

Code Snippet

import pickle

def pickle_trick(obj, max_depth=10):
    output = {}

    if max_depth <= 0:
        return output

    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError) as e:
        failing_children = []

        if hasattr(obj, "__dict__"):
            for k, v in obj.__dict__.items():
                result = pickle_trick(v, max_depth=max_depth - 1)
                if result:
                    failing_children.append(result)

        output = {
            "fail": obj, 
            "err": e, 
            "depth": max_depth, 
            "failing_children": failing_children
        }

    return output

Example Program

import redis

import pickle
from pprint import pformat as pf


def pickle_trick(obj, max_depth=10):
    output = {}

    if max_depth <= 0:
        return output

    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError) as e:
        failing_children = []

        if hasattr(obj, "__dict__"):
            for k, v in obj.__dict__.items():
                result = pickle_trick(v, max_depth=max_depth - 1)
                if result:
                    failing_children.append(result)

        output = {
            "fail": obj, 
            "err": e, 
            "depth": max_depth, 
            "failing_children": failing_children
        }

    return output


if __name__ == "__main__":
    r = redis.Redis()
    print(pf(pickle_trick(r)))

Example Output

$ python3 pickle-trick.py
{'depth': 10,
 'err': TypeError("can't pickle _thread.lock objects"),
 'fail': Redis<ConnectionPool<Connection<host=localhost,port=6379,db=0>>>,
 'failing_children': [{'depth': 9,
                       'err': TypeError("can't pickle _thread.lock objects"),
                       'fail': ConnectionPool<Connection<host=localhost,port=6379,db=0>>,
                       'failing_children': [{'depth': 8,
                                             'err': TypeError("can't pickle _thread.lock objects"),
                                             'fail': <unlocked _thread.lock object at 0x10bb58300>,
                                             'failing_children': []},
                                            {'depth': 8,
                                             'err': TypeError("can't pickle _thread.RLock objects"),
                                             'fail': <unlocked _thread.RLock object owner=0 count=0 at 0x10bb58150>,
                                             'failing_children': []}]},
                      {'depth': 9,
                       'err': PicklingError("Can't pickle <function Redis.<lambda> at 0x10c1e8710>: attribute lookup Redis.<lambda> on redis.client failed"),
                       'fail': {'ACL CAT': <function Redis.<lambda> at 0x10c1e89e0>,
                                'ACL DELUSER': <class 'int'>,
0x10c1e8170>,
                                .........
                                'ZSCORE': <function float_or_none at 0x10c1e5d40>},
                       'failing_children': []}]}
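In the output above, the dictionary of callbacks is reported with no failing children, because the helper only follows __dict__ and does not look inside containers. The variant below is a sketch, not part of the original answer: the name find_unpicklable and its path argument are assumptions made for illustration. It also descends into dicts, lists, tuples and sets, and returns attribute paths rather than nested dictionaries. Modules are treated as leaves so that the search does not wander through a module's globals.

```python
import pickle
import threading
import types


def find_unpicklable(obj, path="obj", max_depth=10):
    """Return (path, error) pairs for the deepest values that fail to pickle."""
    if max_depth <= 0:
        return []

    try:
        pickle.dumps(obj)
        return []
    except Exception as exc:  # pickle can raise several exception types
        error = exc

    # A module is reported as-is; walking its globals would be slow and noisy.
    if isinstance(obj, types.ModuleType):
        return [(path, error)]

    # Gather children from containers as well as from instance attributes.
    children = []
    if isinstance(obj, dict):
        children = [(f"{path}[{k!r}]", v) for k, v in obj.items()]
    elif isinstance(obj, (list, tuple, set, frozenset)):
        children = [(f"{path}[{i}]", v) for i, v in enumerate(obj)]
    elif hasattr(obj, "__dict__"):
        children = [(f"{path}.{k}", v) for k, v in vars(obj).items()]

    failures = []
    for child_path, child in children:
        failures.extend(find_unpicklable(child, child_path, max_depth - 1))

    # If no child explains the failure, this object itself is the culprit.
    return failures or [(path, error)]


class Holder:
    """Hypothetical class with two unpicklable attributes."""

    def __init__(self):
        self.name = "ok"
        self.resources = {"lock": threading.Lock(), "items": [1, 2, 3]}
        self.mod = threading


failures = find_unpicklable(Holder(), path="holder")
for failing_path, failing_error in failures:
    print(failing_path, "->", failing_error)
```

Here the two reported paths are holder.resources['lock'] and holder.mod, which point directly at the attributes to fix.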

Root Cause - Redis can't pickle _thread.lock

In my case, creating an instance of Redis that I saved as an attribute of an object broke pickling.

When you create an instance of Redis, it also creates a connection_pool, and the thread locks held by that pool cannot be pickled.

I had to create and clean up Redis within the multiprocessing.Process before it was pickled.
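A different and commonly used option, when the object itself has to carry such a resource, is to leave the resource out of the pickled state and rebuild it on load. The sketch below is not what the answer above did. It assumes a hypothetical Worker class and uses a threading.Lock as a stand-in for the Redis client, so that it runs without extra dependencies.

```python
import pickle
import threading


class Worker:
    """Hypothetical class holding an unpicklable resource (a lock here)."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()  # stand-in for a Redis client

    def __getstate__(self):
        # Copy the instance state and drop the unpicklable entry.
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the picklable state, then rebuild the resource.
        self.__dict__.update(state)
        self._lock = threading.Lock()


clone = pickle.loads(pickle.dumps(Worker("w1")))
print(clone.name)
```

The rebuilt lock is a new object, so this only fits resources that can be recreated from scratch in the receiving process.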

Testing

In my case, the class that I was trying to pickle must remain picklable, so I added a unit test that creates an instance of the class and pickles it. That way, if anyone modifies the class so that it can no longer be pickled, breaking its ability to be used in multiprocessing (and pyspark), we will detect the regression straight away.

import pickle

def test_can_pickle():
    # Given
    obj = MyClassThatMustPickle()  # the class under test

    # When / Then
    # pickle.dumps raises if the object is no longer picklable,
    # which makes this test fail
    pkl = pickle.dumps(obj)

傲世九天 2024-09-07 21:53:06

According to the documentation:

What can be pickled and unpickled?

The following types can be pickled:

  • None, True, and False
  • integers, floating point numbers, complex numbers
  • strings, bytes, bytearrays
  • tuples, lists, sets, and dictionaries containing only picklable objects
  • functions defined at the top level of a module (using def, not lambda)
  • built-in functions defined at the top level of a module
  • classes that are defined at the top level of a module
  • instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).

As you can see, modules are not part of this list. Note that this also applies to deepcopy, not only to the pickle module, as stated in the documentation of deepcopy:

This module does not copy types like module, method, stack trace, stack frame, file, socket, window, array, or any similar types. It does “copy” functions and classes (shallow and deeply), by returning the original object unchanged; this is compatible with the way these are treated by the pickle module.

A possible workaround is using the @property decorator instead of an attribute.
For example, this should work:

    import numpy as np
    import pickle
    
    class Foo():
        @property
        def module(self):
            return np
    
    foo = Foo()
    with open('test.out', 'wb') as f:
        pickle.dump(foo, f)
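The property workaround above hard-codes the module. If the module has to be chosen per instance, one possible variant (a sketch, with module_name as an assumed attribute name and json as a stand-in module) stores only the module's name, which is a plain string, and resolves it on access:

```python
import importlib
import pickle


class Foo:
    def __init__(self, module_name="json"):
        # Keep only the module's name; a string pickles without trouble.
        self.module_name = module_name

    @property
    def module(self):
        # import_module returns the already-imported module after the first call.
        return importlib.import_module(self.module_name)


foo = pickle.loads(pickle.dumps(Foo()))
print(foo.module.dumps({"a": 1}))
```

The unpickling side must be able to import a module of the same name, which is the same constraint pickle already places on classes and functions.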


 
云淡月浅 2024-09-07 21:53:06

For Flask 2.x users seeing "TypeError: can't pickle module objects":

If you get this error when trying to display your model using the @dataclass decorator, make sure you are using lazy='joined' in your db.relationship().
