Python: Pickling a dict with some unpicklable items
I have an object gui_project which has an attribute .namespace, which is a namespace dict (i.e. a dict from strings to objects).
(This is used in an IDE-like program to let the user define their own objects in a Python shell.)
I want to pickle this gui_project, along with the namespace. The problem is that some objects in the namespace (i.e. values of the .namespace dict) are not picklable. For example, some of them refer to wxPython widgets.
I'd like to filter out the unpicklable objects, that is, exclude them from the pickled version.
How can I do this?
(One thing I tried is to go through the values one by one and try to pickle each of them, but some infinite recursion happened, and I need to be safe from that.)
(I do implement a GuiProject.__getstate__ method right now, to get rid of other unpicklable stuff besides namespace.)
I would use the pickler's documented support for persistent object references. Persistent object references are objects that are referenced by the pickle but not stored in the pickle.
http://docs.python.org/library/pickle.html#pickling-and-unpickling-external-objects
ZODB has used this API for years, so it's very stable. When unpickling, you can replace the object references with anything you like. In your case, you would want to replace the object references with markers indicating that the objects could not be pickled.
You could start with something like this (untested):
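A minimal Python 3 sketch of this idea, not the answer's original code, might look like the following. The names dump_filtered, load_filtered and FilteredObject come from the answer; should_filter, FilteringPickler, FilteringUnpickler and the "filtered:" prefix are hypothetical, and an io.IOBase check stands in for a wxPython widget check so the example runs without wx installed.

```python
import io
import pickle


class FilteredObject:
    """Placeholder left where a filtered-out object used to be."""

    def __init__(self, about):
        self.about = about

    def __repr__(self):
        return "FilteredObject(%r)" % (self.about,)


def should_filter(obj):
    # Hypothetical predicate. For the question's case this might be an
    # isinstance check against a wx class; here file-like objects stand in.
    return isinstance(obj, io.IOBase)


class FilteringPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Returning a non-None value makes pickle store this ID
        # instead of the object itself.
        if should_filter(obj):
            return "filtered:" + type(obj).__name__
        return None  # pickle the object normally


class FilteringUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Called for every persistent ID found in the stream.
        if isinstance(pid, str) and pid.startswith("filtered:"):
            return FilteredObject(pid[len("filtered:"):])
        raise pickle.UnpicklingError("unsupported persistent id: %r" % (pid,))


def dump_filtered(obj, file):
    FilteringPickler(file, protocol=pickle.HIGHEST_PROTOCOL).dump(obj)


def load_filtered(file):
    return FilteringUnpickler(file).load()
```

Because persistent_id is consulted for every object the pickler visits, the filter also applies to objects nested inside lists, dicts and instance attributes, not only to top-level namespace values.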
Then just call dump_filtered() and load_filtered() instead of pickle.dump() and pickle.load(). wxPython objects will be pickled as persistent IDs, to be replaced with FilteredObjects at unpickling time.
You could make the solution more generic by filtering out objects that are not of the built-in types and have no __getstate__ method.
Update (15 Nov 2010): Here is a way to achieve the same thing with wrapper classes. Using wrapper classes instead of subclasses, it's possible to stay within the documented API.
This is how I would do this (I did something similar before and it worked):
Make a list of strings, where each string is legal Python code, such that when all these strings are executed in order, you get the desired variable.
Now, when you unpickle, you get back all the variables that were originally pickleable. For all variables that were not pickleable, you now have a list of strings (legal python code) that when executed in order, gives you the desired variable.
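One way this could look in Python 3 is sketched below. Everything here is an assumption made for illustration: the helper names (is_picklable, dump_with_recipes, load_with_recipes) are hypothetical, and the recipes argument, a dict mapping a variable name to its list of code strings, has to be supplied by the caller. Unpicklable values that have no recipe are simply dropped.

```python
import pickle


def is_picklable(obj):
    """Return True if obj can be pickled on its own."""
    try:
        pickle.dumps(obj)
    except Exception:  # includes PicklingError, TypeError, RecursionError
        return False
    return True


def dump_with_recipes(namespace, recipes):
    """Pickle the picklable values plus code strings for the rest.

    recipes maps a variable name to a list of Python source lines that
    recreate that variable when executed in order.
    """
    values = {k: v for k, v in namespace.items() if is_picklable(v)}
    needed = {k: recipes[k] for k in namespace
              if k not in values and k in recipes}
    return pickle.dumps({"values": values, "recipes": needed})


def load_with_recipes(data):
    """Unpickle the values, then run the stored code to rebuild the rest."""
    payload = pickle.loads(data)
    namespace = dict(payload["values"])
    for lines in payload["recipes"].values():
        for line in lines:
            exec(line, namespace)  # runs stored code; trusted input only
    namespace.pop("__builtins__", None)  # exec() adds this key
    return namespace
```

Executing stored source carries the same trust requirement as unpickling: it should only be done with data you wrote yourself.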
Hope this helps
I ended up coding my own solution to this, using Shane Hathaway's approach.
Here's the code. (Look for CutePickler and CuteUnpickler.) Here are the tests. It's part of GarlicSim, so you can use it by installing garlicsim and doing from garlicsim.general_misc import pickle_tools.
If you want to use it on Python 3 code, use the Python 3 fork of garlicsim.
One approach would be to inherit from pickle.Pickler and override the save_dict() method. Copy it from the base class, which reads like this:
However, in the call to _batch_setitems, pass an iterator that filters out all items that you don't want to be dumped, e.g.
As save_dict isn't an official API, you need to check for each new Python version whether this override is still correct.
The filtering part is indeed tricky. Using simple tricks, you can easily get the pickle to work. However, you might end up filtering out too much and losing information that you could have kept if the filter looked a little deeper. But the vast range of things that can end up in the .namespace makes building a good filter difficult.
However, we could leverage pieces that are already part of Python, such as deepcopy in the copy module.
I made a copy of the stock copy module, and did the following things:
- Created a new type named LostObject to represent an object that will be lost in pickling.
- Changed _deepcopy_atomic to make sure x is picklable. If it's not, return an instance of LostObject.
- Objects can define the methods __reduce__ and/or __reduce_ex__ to provide hints about whether and how to pickle them. We make sure these methods do not throw an exception, which would indicate that the object cannot be pickled.
The following is the diff:
Now back to the pickling part. You simply make a deepcopy using this new deepcopy function and then pickle the copy. The unpicklable parts have been removed during the copying process.
Here is the output:
You see that 1) mutual pointers (between x and xx) are preserved and we do not run into an infinite loop; 2) the unpicklable file object is converted to a LostObject instance; and 3) no new copy of the large object is created, since it is picklable.
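The same idea can be approximated without patching the copy module. The sketch below is a simplified stand-in, not the diff from this answer: filtered_copy and _pickles_alone are hypothetical names, only plain dict, list and tuple containers are rebuilt, and any other object is treated as a leaf that is either reused as-is (if it pickles on its own) or replaced by a LostObject. One consequence is that an instance with a single unpicklable attribute is lost as a whole.

```python
import pickle


class LostObject:
    """Marks a value that was dropped because it could not be pickled."""

    def __init__(self, description):
        self.description = description

    def __repr__(self):
        return "LostObject(%r)" % (self.description,)


def _pickles_alone(obj):
    try:
        pickle.dumps(obj)
    except Exception:  # PicklingError, TypeError, RecursionError, ...
        return False
    return True


def filtered_copy(obj, memo=None):
    """Copy containers, swapping unpicklable leaves for LostObject."""
    if memo is None:
        memo = {}
    key = id(obj)
    if key in memo:  # already handled: keeps shared and cyclic references
        return memo[key]
    if type(obj) is dict:
        result = memo[key] = {}  # register before recursing (cycles)
        for k, v in obj.items():
            result[filtered_copy(k, memo)] = filtered_copy(v, memo)
        return result
    if type(obj) is list:
        result = memo[key] = []
        result.extend(filtered_copy(item, memo) for item in obj)
        return result
    if type(obj) is tuple:
        result = tuple(filtered_copy(item, memo) for item in obj)
        # A cycle through this tuple may have registered it already.
        return memo.setdefault(key, result)
    # Leaf: reuse picklable objects without copying them.
    result = obj if _pickles_alone(obj) else LostObject(type(obj).__name__)
    memo[key] = result
    return result
```

The result of filtered_copy(namespace) can then be handed to pickle.dump() in place of the original dict. Each leaf is test-pickled once, so large picklable leaves end up being pickled twice overall; that cost is the price of the simplification.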