我目前正在用 Python 编写一个序列化模块,可以序列化用户定义的类。为了做到这一点,我需要获取对象的完整名称空间并将其写入文件。然后我可以使用该字符串重新创建该对象。
例如,假设我们
class B:
class C:
pass
现在在名为 A.py
的文件中有以下类结构,并假设 my_klass_string
是字符串 "A::B:: C"
klasses = my_klass_string.split("::")
if globals().has_key(klasses[0]):
klass = globals()[klasses[0]]
else:
raise TypeError, "No class defined: %s} " % klasses[0]
if len(klasses) > 1:
for klass_string in klasses:
if klass.__dict__.has_key(klass_string):
klass = klass.__dict__[klass_string]
else:
raise TypeError, "No class defined: %s} " % klass_string
klass_obj = klass.__new__(klass)
我可以创建类 C 的实例,即使它位于模块 A
中的类 B
下。
上面的代码相当于调用eval(klass_obj = ABC__new__(ABC))
注意:
我在这里使用 __new__() 是因为我正在重构一个序列化对象,并且我不想初始化该对象,因为我不知道该类的 __init__ 参数是什么> 方法采取。我想在不调用 init 的情况下创建对象,然后稍后为其分配属性。
我可以通过任何方式从字符串创建 ABC 类的对象。我该如何走另一条路?即使该类是嵌套的,如何从该类的实例获取描述该类的完整路径的字符串?
I'm currently writing a serialization module in Python that can serialize user defined classes. in order to do this I need to get the full name space of the object and write it to a file. I can then use that string to recreate the object.
for example assume that we have the following class structure in a file named A.py
class B:
class C:
pass
now with the assumption that my_klass_string
is the string "A::B::C"
klasses = my_klass_string.split("::")
if globals().has_key(klasses[0]):
klass = globals()[klasses[0]]
else:
raise TypeError, "No class defined: %s} " % klasses[0]
if len(klasses) > 1:
for klass_string in klasses:
if klass.__dict__.has_key(klass_string):
klass = klass.__dict__[klass_string]
else:
raise TypeError, "No class defined: %s} " % klass_string
klass_obj = klass.__new__(klass)
I can create an instance of the class C even though it lies under class B
in the module A
.
the above code is equivalent to calling eval(klass_obj = A.B.C.__new__(A.B.C))
note:
I'm using __new__()
here because I'm reconstituting a serialized object and I don't want to init the object as I don't know what parameters the class's __init__
methods takes. I want to create the object with out calling init and then assign attributes to it later.
any way I can create an object of class A.B.C
from a string. bout how do I go the other way? how to I get a string that describes the full path to the class from an instance of that class even if the class is nested?
发布评论
评论(4)
您无法获得“给定实例的类的完整路径”
class”,因为Python中没有这样的东西。对于
例如,以您的示例为基础:
x
的“类路径”应该是什么?D
还是BC
?两者都是同样有效,因此 Python 没有给你任何告诉你的方法
来自另一个。
事实上,即使是 Python 的
pickle
模块在腌制对象x
时也会遇到麻烦:因此,一般来说,除了添加属性之外,我没有其他选择
到你的所有类(例如,
_class_path
),你的序列化代码会查找它将类名记录为序列化格式:
您甚至可以使用 中的其他评论如果您执行上面的
D=BC
操作,则可能适用相同的 SO 帖子)。也就是说,如果您可以将序列化代码限制为 (1) 个实例
的新式类,并且(2)这些类是在
模块的顶层,那么你可以复制
pickle
所做的事情(来自 Python 的 pickle.py 中第 730--768 行的函数
save_global
2.6)。
这个想法是每个新式类都定义属性
__name__
和
__module__
,它们是扩展为类名的字符串(如在源中找到)和模块名称(如在
sys.modules
);通过保存这些,您可以稍后导入模块并获取该类的一个实例:
You cannot get the "full path to the class given an instance of the
class", for the reason that there is no such thing in Python. For
instance, building on your example:
What should the "class path" of
x
be?D
orB.C
? Both areequally valid, and thus Python does not give you any means of telling one
from the other.
Indeed, even Python's
pickle
module has troubles pickling the objectx
:So, in general, I see no other option than adding an attribute
to all your classes (say,
_class_path
), and your serialization code would look it up forrecording the class name into the serialized format:
You can even do this automatically with some metaclass magic (but also read the other comments in the same SO post for caveats that may apply if you do the
D=B.C
above).That said, if you can limit your serialization code to (1) instances
of new-style classes, and (2) these classes are defined at the
top-level of a module, then you can just copy what
pickle
does(function
save_global
at lines 730--768 in pickle.py from Python2.6).
The idea is that every new-style class defines attributes
__name__
and
__module__
, which are strings that expand to the class name (asfound in the sources) and the module name (as found in
sys.modules
); by saving these you can later import the module andget an instance of the class:
你不能,以任何合理的、非疯狂的方式。我想您可以找到类名和模块,然后对于每个类名验证它是否存在于模块中,如果不存在,则以分层方式遍历模块中确实存在的所有类,直到找到它。
但由于没有理由有这样的类层次结构,所以这不是问题。 :-)
另外,我知道您现在不想在工作中听到这个,但是:
跨平台序列化是一个有趣的主题,但是使用这样的对象进行操作不太可能非常有用,因为目标系统必须安装完全相同的对象层次结构。因此,您必须有两个用两种不同语言编写的系统,并且它们完全相同。这几乎是不可能的,而且可能不值得这么麻烦。
例如,您将无法使用 Python 标准库中的任何对象,因为 Ruby 中不存在这些对象。最终结果是您必须创建自己的对象层次结构,最终仅使用字符串和数字等基本类型。在这种情况下,您的对象刚刚成为基本原语的包含,然后您也可以使用 JSON 或 XML 序列化所有内容。
You can't, in any reasonable non-crazy way. I guess you could find the class name and the module, and then for each class name verify that it exist in the module, and if not, go through all classes that does exist in the module in a hierarchical way until you find it.
But since there is no reason to ever have class hierarchy like that, it's a non-problem. :-)
Also, I know you don't want to hear this at this point in your work, but:
Cross-platform serialization is an interesting subject, but doing it with objects like this is unlikely to be very useful, as the target system must have the exact same object hierarchy installed. You must therefore have two systems written in two different languages that are exactly equivalent. That's almost impossible and likely to not be worth the trouble.
You would for example not be able to use any object from Pythons standard library, as those don't exist in Ruby. The end result is that you must make your own object hierarchy that in the end use only basic types like strings and numbers. And in that case, your objects have just become containment for basic primitives, and then you can just as well serialize everything with JSON or XML anyway.
不要。标准库已经包含一个。实际上,根据您的计算方式,它至少包括两个(
pickle
和shelve
)。Don't. The standard library already includes one. Depending on how you count, actually, it includes at least two (
pickle
andshelve
).有两种方法可以做到这一点。
解决方案 1
第一个解决方案通过垃圾收集器。
这是代码:
算法:
__module__
的类字典,解决方案 2
使用源文件。使用检查来获取类的行并向上扫描嵌套它的新类。
注意:我不知道 python 2 中没有干净的方法,但 python 3 提供了一种。
There are two ways of doing this.
Solution 1
The first one goes via the garbage-collector.
this is the code:
Algorithm:
__module__
as CSolution 2
use the source file. use inspect to get the lines of the class and scan upwards for new classes that nest it.
Note: I know no clean way in python 2, but python 3 provides one.