Why can't generators be pickled?

Posted 2024-12-01 06:10:07

Python's pickle (I'm talking about standard Python 2.5/2.6/2.7 here) cannot pickle locks, file objects, etc.

It also cannot pickle generators and lambda expressions (or any other anonymous code), because pickle really only stores references by name.
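To make the failure concrete, here is a minimal sketch, assuming a current CPython 3 rather than the 2.x versions named above (the exception types and messages differ slightly between versions). The helper `try_pickle` and the generator `count_up` are hypothetical names used only for illustration.

```python
import pickle


def count_up(limit):
    """A plain generator function; its state lives in a suspended frame."""
    for value in range(limit):
        yield value


def try_pickle(obj):
    """Return None if pickling succeeds, otherwise the exception raised."""
    try:
        pickle.dumps(obj)
    except (TypeError, AttributeError, pickle.PicklingError) as exc:
        return exc
    return None


gen = count_up(3)
next(gen)  # advance once so the generator's frame holds live state

gen_error = try_pickle(gen)              # generator object
lambda_error = try_pickle(lambda x: x)   # anonymous function with no importable name

print(type(gen_error).__name__, gen_error)
print(type(lambda_error).__name__, lambda_error)
```

Both calls should fail: the generator because its type has no pickle support, the lambda because pickle cannot find it again by name.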

In case of locks and OS-dependent features, the reason why you cannot pickle them is obvious and makes sense.

But why can't you pickle generators?


Note: just for clarity -- I'm interested in the fundamental reason (or assumptions and choices that went into that design decision) why, not in "because it gives you a Pickle error".

I realize the question is a bit broad, so here's a rule of thumb for whether you've answered it: "If these assumptions were lifted, or the type of allowed generator were somehow more restricted, would pickling generators work again?"

Comments (2)

一指流沙 2024-12-08 06:10:07

There is a lot of information available about this. For the "official word" on the issue, read the (closed) Python bugtracker issue.

The core reasoning, by one of the people who made the decision, is detailed on this blog:

Since a generator is essentially a souped-up function, we would need to save its bytecode, which is not guaranteed to be backward-compatible between Python's versions, and its frame, which holds the state of the generator such as local variables, closures and the instruction pointer. And the latter is rather cumbersome to accomplish, since it basically requires making the whole interpreter picklable. So, any support for pickling generators would require a large number of changes to CPython's core.

Now if an object unsupported by pickle (e.g., a file handle, a socket, a database connection, etc) occurs in the local variables of a generator, then that generator could not be pickled automatically, regardless of any pickle support for generators we might implement. So in that case, you would still need to provide custom __getstate__ and __setstate__ methods. This problem renders any pickling support for generators rather limited.

And two suggested workarounds are mentioned:

Anyway, if you need such a feature, then look into Stackless Python, which does all of the above. And since Stackless's interpreter is picklable, you also get process migration for free. This means you can interrupt a tasklet (the name for Stackless's green threads), pickle it, send the pickle to another machine, unpickle it, resume the tasklet, and voilà, you've just migrated a process. This is a freaking cool feature!

But in my humble opinion, the best solution to this problem is to rewrite the generators as simple iterators (i.e., ones with a __next__ method). Iterators are easy and space-efficient to pickle because their state is explicit. You would still need to handle objects representing some external state explicitly, however; you cannot get around this.
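As a rough illustration of that last point, the sketch below shows an iterator that wraps a file handle and uses __getstate__ and __setstate__ to save the path and offset instead of the handle itself. It assumes Python 3 (in Python 2 the method would be named `next`); the class `LineReader` and its attributes are hypothetical and not taken from the blog post.

```python
import os
import pickle
import tempfile


class LineReader:
    """Iterator over the lines of a text file whose position survives pickling."""

    def __init__(self, path):
        self.path = path
        self._file = open(path, "r")

    def __iter__(self):
        return self

    def __next__(self):
        line = self._file.readline()
        if not line:
            raise StopIteration
        return line.rstrip("\n")

    def close(self):
        self._file.close()

    def __getstate__(self):
        # The file object cannot be pickled, so record how to reopen it:
        # the path plus the current read position.
        return {"path": self.path, "offset": self._file.tell()}

    def __setstate__(self, state):
        # Reopen the file and seek back to where the original left off.
        self.path = state["path"]
        self._file = open(self.path, "r")
        self._file.seek(state["offset"])


# Write a small throwaway file to read from.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")

reader = LineReader(tmp.name)
first = next(reader)

restored = pickle.loads(pickle.dumps(reader))
second = next(restored)  # continues after the line already consumed

print(first, second)

reader.close()
restored.close()
os.remove(tmp.name)
```

The restored iterator only works where the same path is readable, which is exactly the external state the answer says you cannot get around.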

红尘作伴 2024-12-08 06:10:07

You actually can, depending on the implementation. PyPy and Stackless Python both allow this (to some degree anyway):

Python 2.7.1 (dcae7aed462b, Aug 17 2011, 09:46:15)
[PyPy 1.6.0 with GCC 4.0.1] on darwin
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``Not your usual analyses.''
>>>> import pickle
>>>> gen = (x for x in range(100))
>>>> next(gen)
0
>>>> pickled = pickle.dumps(gen)
>>>> next(pickle.loads(pickled))
1

In CPython it's also possible to create an iterator object to simulate a picklable generator.
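A minimal sketch of such an iterator, assuming Python 3 and mirroring the generator expression from the PyPy session above; the class name `RangeIterator` is made up for this example.

```python
import pickle


class RangeIterator:
    """Hand-written stand-in for the generator expression (x for x in range(stop))."""

    def __init__(self, stop):
        self.stop = stop
        self.current = 0  # all iteration state is a plain attribute

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


gen = RangeIterator(100)
first = next(gen)

pickled = pickle.dumps(gen)
second = next(pickle.loads(pickled))

print(first, second)  # prints: 0 1
```

Because the position is stored in an ordinary instance attribute, the default pickling of the instance dictionary is enough and no custom state methods are needed.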
