How to determine the struct used to declare the instance layout of a PyObject?
I'm writing Python 3 extensions in C++ and I'm trying to find a way to check whether a PyObject is related to a type (struct) defining its instance layout. I'm only interested in static-size PyObject, not PyVarObject. The instance layout is defined by a struct with a well-defined layout: the mandatory PyObject header and (optional) user-defined members.

Below is an example of a PyObject extension based on the well-known Noddy example in Defining New Types:
// Noddy struct specifies PyObject instance layout
struct Noddy {
    PyObject_HEAD
    int number;
};
// type object corresponding to Noddy instance layout
PyTypeObject NoddyType = {
    PyVarObject_HEAD_INIT(NULL, 0)   /* object header and ob_size (Python 3 form) */
    "noddy.Noddy",                   /*tp_name*/
    sizeof(Noddy),                   /*tp_basicsize*/
    0,                               /*tp_itemsize*/
    ...
    Noddy_new,                       /* tp_new */
};
It is important to notice that Noddy is a type, a compile-time entity, but NoddyType is an object present in memory at run-time. The only obvious relation between Noddy and NoddyType seems to be the value of sizeof(Noddy) stored in the tp_basicsize member.

The hand-written inheritance implemented in Python specifies rules which allow casting between PyObject and the type used to declare the instance layout of that particular PyObject:
PyObject* Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    // When a Python object is a Noddy instance,
    // its PyObject* pointer can be safely cast to Noddy*
    Noddy *self = reinterpret_cast<Noddy*>(type->tp_alloc(type, 0));
    self->number = 0; // initialise Noddy members
    return reinterpret_cast<PyObject*>(self);
}
In circumstances like the various slot functions, it is safe to assume "a Python object is a Noddy" and cast without any checks.
However, sometimes it is necessary to cast in other situations, and then it feels like a blind conversion:
void foo(PyObject* obj)
{
    // How to perform safety checks?
    Noddy* noddy = reinterpret_cast<Noddy*>(obj);
    ...
}
It is possible to check sizeof(Noddy) == Py_TYPE(obj)->tp_basicsize, but that is an insufficient solution for two reasons:

1) If a user derives from Noddy

class BabyNoddy(Noddy):
    pass

and obj in foo points to an instance of BabyNoddy, Py_TYPE(obj)->tp_basicsize is different. But it is still safe to use reinterpret_cast<Noddy*>(obj) to get a pointer to the instance layout part.

2) There can be another struct declaring an instance layout of the same size as Noddy:

struct NeverSeenNoddy {
    PyObject_HEAD
    short word1;
    short word2;
};

In fact, at the C language level, the NeverSeenNoddy struct is compatible with the NoddyType type object: it can fit into NoddyType. So the cast could be perfectly fine.

So, my big question is this:

Is there any Python policy which could be used to determine if a PyObject is compatible with the Noddy instance layout?

Is there any way to check if a PyObject* points to the object part which is embedded in Noddy?

If there is no such policy, is any hack possible?

EDIT: There are a few questions which seem to be similar, but in my opinion they are different to the one I have asked. For example: Accessing the underlying struct of a PyObject
EDIT2: In order to understand why I marked Sven Marnach's response as the answer, see comments below that answer.

Comments (2)

In Python, you can check if obj is of type Noddy or a derived type by using the test isinstance(obj, Noddy). The test in the C-API whether some PyObject *obj is of type NoddyType or a derived type is basically the same: you use PyObject_IsInstance(), passing obj and a pointer to NoddyType.

As for your second question, there is no way to achieve this, and if you think you need this, your design has severe shortcomings. It would be better to derive NeverSeenNoddyType from NoddyType in the first place; then the above check will also recognize an object of the derived type as an instance of NoddyType.

Because every object starts with PyObject_HEAD, it is always safe to access the fields defined by this header. One of the fields is ob_type (usually accessed using the Py_TYPE macro). If this points to NoddyType or any other type derived from NoddyType (which is what PyObject_IsInstance tells you), then you can assume the object's layout is that of struct Noddy.

In other words, an object is compatible with the Noddy instance layout if its Py_TYPE points to NoddyType or any of its subclasses.

In the second question, the cast wouldn't be fine. The layouts of Noddy and NeverSeenNoddy are different, even though the size might be the same.

Assuming that NeverSeenNoddy is the layout of a NeverSeenNoddy_Type type, you should never cast to NeverSeenNoddy if PyObject_IsInstance(obj, (PyObject *)&NeverSeenNoddy_Type) is false.

If you want to have two C-level types with common fields, you should derive both types from a common base that has only the common fields in the instance layout.

The subtypes should then include the base layout at the top of their layouts, that is, as their first member, before any fields of their own.

Then, if PyObject_IsInstance(obj, (PyObject *)&SubNoddy_Type) returns true, you can cast to SubNoddy and access the extra_field field.

If PyObject_IsInstance(obj, (PyObject *)&Noddy_Type) returns true, you can cast to Noddy and access the common fields.