Python pickling after changing a module's directory
I've recently changed my program's directory layout: before, I had all my modules inside the "main" folder. Now, I've moved them into a directory named after the program, and placed an __init__.py there to make a package.
Now I have a single .py file in my main directory that is used to launch my program, which is much neater.
Anyway, trying to load pickled files from previous versions of my program is failing. I'm getting "ImportError: No module named tools", which I guess is because my module was previously in the main folder and is now whyteboard.tools rather than plain tools. However, the code that imports the tools module lives in the same directory as it, so I doubt there's a need to specify a package.
So, my program directory looks something like this:
whyteboard-0.39.4
-->whyteboard.py
-->README.txt
-->CHANGELOG.txt
---->whyteboard/
---->whyteboard/__init__.py
---->whyteboard/gui.py
---->whyteboard/tools.py
whyteboard.py launches a block of code from whyteboard/gui.py, that fires up the GUI. This pickling problem definitely wasn't happening before the directory re-organizing.
Comments (8)
As pickle's docs say, in order to save and restore a class instance (actually a function, too), you must respect certain constraints: the class definition must be importable and live in the same module as when the object was stored.
whyteboard.tools is not "the same module as" tools (even though it can be imported by import tools from other modules in the same package, it ends up in sys.modules as sys.modules['whyteboard.tools']: this is absolutely crucial, otherwise the same module imported by one module in the same package vs. one in another package would end up with multiple and possibly conflicting entries!).
If your pickle files are in a good/advanced format (as opposed to the old ASCII format that's the default only for compatibility reasons), migrating them once you perform such changes may in fact not be quite as trivial as "editing the file" (which is binary etc.!), despite what another answer suggests. I suggest that, instead, you make a little "pickle-migrating script": let it patch sys.modules by temporarily registering the whyteboard.tools module under the old key, sys.modules['tools'], and then cPickle.load each file, del sys.modules['tools'], and cPickle.dump each loaded object back to file: that temporary extra entry in sys.modules should let the pickles load successfully, then dumping them again should use the right module name for the instances' classes (removing that extra entry should make sure of that).
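The migration described above can be sketched roughly as follows. This is only an illustration under some assumptions: it uses Python 3's pickle module in place of cPickle, and the helper names module_alias and migrate_pickle are made up for the example.

```python
import contextlib
import pickle
import sys


@contextlib.contextmanager
def module_alias(old_name, module):
    """Temporarily register `module` in sys.modules under `old_name`."""
    missing = object()
    previous = sys.modules.get(old_name, missing)
    sys.modules[old_name] = module
    try:
        yield
    finally:
        # Remove the extra entry again (or restore whatever was there before).
        if previous is missing:
            del sys.modules[old_name]
        else:
            sys.modules[old_name] = previous


def migrate_pickle(path, old_name, module):
    """Load `path` with the alias active, then re-dump it without the alias."""
    with module_alias(old_name, module):
        with open(path, "rb") as f:
            obj = pickle.load(f)
    # The alias is gone here, so classes are written under their real
    # module name (their __module__ attribute, e.g. "whyteboard.tools").
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return obj
```

With the question's layout, a call would look something like migrate_pickle(some_path, "tools", whyteboard.tools) after import whyteboard.tools, where some_path stands for one of the old save files. Back up the files first, since each one is overwritten in place.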
This can be done with a custom "unpickler" that overrides find_class() to map the old module name to the new one.
You'd then need to use renamed_load() instead of pickle.load() and renamed_loads() instead of pickle.loads().
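A minimal sketch of such an unpickler, assuming Python 3 and the question's rename from tools to whyteboard.tools. The class name RenameUnpickler and the RENAMED_MODULES table are illustrative assumptions; renamed_load() and renamed_loads() are the names used in the answer.

```python
import io
import pickle

# Old module path -> new module path. The single entry mirrors the
# question's layout and is an assumption; extend it as needed.
RENAMED_MODULES = {"tools": "whyteboard.tools"}


class RenameUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect lookups for relocated modules, leave everything else alone.
        module = RENAMED_MODULES.get(module, module)
        return super().find_class(module, name)


def renamed_load(file_obj):
    """Drop-in replacement for pickle.load() that applies the renames."""
    return RenameUnpickler(file_obj).load()


def renamed_loads(pickled_bytes):
    """Drop-in replacement for pickle.loads() that applies the renames."""
    return renamed_load(io.BytesIO(pickled_bytes))
```

Unlike a one-off migration, this leaves the old files untouched and translates the module name every time they are read.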
This happened to me; I solved it by adding the new location of the module to sys.path before loading the pickle.
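A sketch of that workaround, assuming the directory that now contains tools.py (the whyteboard/ package directory in the question) is passed in by the caller. The function name load_with_legacy_path is hypothetical.

```python
import os
import pickle
import sys


def load_with_legacy_path(pickle_path, package_dir):
    """Unpickle a file whose classes were saved under top-level module names."""
    package_dir = os.path.abspath(package_dir)
    # With the package directory itself on sys.path, "import tools"
    # resolves to package_dir/tools.py again.
    if package_dir not in sys.path:
        sys.path.append(package_dir)
    with open(pickle_path, "rb") as f:
        return pickle.load(f)
```

One thing to keep in mind: the module loaded this way is registered as tools, a separate module object from whyteboard.tools, so the loaded instances belong to tools' copies of the classes and isinstance checks against the classes in whyteboard.tools will not match.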
pickle serializes classes by reference, so if you change where the class lives, it will not unpickle because the class will not be found. If you use dill instead of pickle, then you can serialize classes by reference or directly (by serializing the class itself instead of its import path). You can simulate this pretty easily by just changing the class definition after a dump and before a load.
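To see what "by reference" means in practice, the standard pickletools module can list the names a pickle will try to import when loaded. This is a small sketch; the helper names are made up, and it only handles protocols 0 to 3, where class references are stored with the GLOBAL opcode.

```python
import pickle
import pickletools


def referenced_globals(data):
    """List the (module, name) pairs that a protocol 0-3 pickle looks up on load."""
    found = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            found.append((module, name))
    return found


def class_references(obj):
    """Pickle `obj` with protocol 2 and report which globals it refers to."""
    return referenced_globals(pickle.dumps(obj, protocol=2))
```

For an instance of a class defined in the old tools module, the result would contain a pair whose first element is the string tools. The class body is not stored in the file, which is why the module has to be importable under that exact name at load time.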
This is the normal behavior of pickle: unpickled objects need to have their defining module importable.
You should be able to change the module path (i.e. from tools to whyteboard.tools) by editing the pickled files, as they are normally simple text files.
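Editing by hand is only practical for the old text-based protocol 0. For protocols 0 to 3 the same edit can be done programmatically, because class references are stored as a GLOBAL opcode followed by the module name in plain text. The sketch below is an illustration, not a general tool: the function name is hypothetical, and it does not handle protocol 4 and later, which store the module name as a length-prefixed string inside frames, so such input comes back unchanged.

```python
import pickletools


def rewrite_global_module(data, old_module, new_module):
    """Return pickle bytes with GLOBAL references to old_module renamed."""
    old = old_module.encode("utf-8")
    new = new_module.encode("utf-8")
    # Positions of GLOBAL opcodes whose module part equals old_module.
    hits = [
        pos
        for opcode, arg, pos in pickletools.genops(data)
        if opcode.name == "GLOBAL" and arg.split(" ", 1)[0] == old_module
    ]
    out = bytearray(data)
    # Patch from the end so earlier positions stay valid as lengths change.
    for pos in reversed(hits):
        start = pos + 1  # skip the one-byte "c" opcode itself
        out[start:start + len(old)] = new
    return bytes(out)
```

Going through pickletools rather than a plain search-and-replace avoids touching string data that merely happens to contain the old module name.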
For people like me needing to update lots of pickle dumps, here's a function implementing @Alex Martelli's excellent advice:
In my case, the dumps were PyTorch model checkpoints. Hence the commented-out torch.load/save().
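A function along those lines might look like the sketch below. It is a reconstruction under assumptions, not the answerer's code: the function name and parameters are hypothetical, it uses the standard pickle module, and the torch calls are kept as comments for the checkpoint case the answer mentions.

```python
import importlib
import pickle
import sys


def update_module_path_in_pickles(pickle_paths, old_module_path, new_module_path):
    """Re-save each pickle so it references new_module_path instead of old_module_path."""
    new_module = importlib.import_module(new_module_path)
    for pickle_path in pickle_paths:
        # Make the old name resolve to the relocated module while loading.
        sys.modules[old_module_path] = new_module
        try:
            with open(pickle_path, "rb") as f:
                obj = pickle.load(f)
            # obj = torch.load(pickle_path, map_location="cpu")
        finally:
            del sys.modules[old_module_path]
        # Dump without the alias so the new module path is what gets recorded.
        with open(pickle_path, "wb") as f:
            pickle.dump(obj, f)
        # torch.save(obj, pickle_path)
```

For the question's layout, a call could be update_module_path_in_pickles(paths, "tools", "whyteboard.tools"), where paths is whatever list of old save files needs converting.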
When you try to load a pickle file that contains a class reference, you must keep the same module structure that was in place when you saved the pickle. If you want to use the pickle somewhere else, you have to tell Python where this class or other object lives; doing the following can save the day:
I know this has been a while, but this fixed it for me:
Essentially, use the full import path (e.g. concurrent.run_concurrent) instead of just the module name (e.g. run_concurrent).
Shared code:
Original (bad):
Replace with the following (remove all references to module_name):