How to determine the struct used to declare the instance layout of a PyObject?

Posted 2024-12-20 17:37:47

I'm writing Python 3 extensions in C++ and I'm trying to find a way to check whether a PyObject is related to the type (struct) that defines its instance layout. I'm only interested in static-size PyObject, not PyVarObject. The instance layout is defined by a struct with a well-defined layout: the mandatory PyObject header followed by (optional) user-defined members.

Below is an example of a PyObject extension based on the well-known Noddy example in Defining New Types:

// Noddy struct specifies PyObject instance layout
struct Noddy {
    PyObject_HEAD
    int number;
};

// type object corresponding to Noddy instance layout
PyTypeObject NoddyType = {
    PyVarObject_HEAD_INIT(NULL, 0) /* Python 3: initialises the header and ob_size */
    "noddy.Noddy",             /*tp_name*/
    sizeof(Noddy),             /*tp_basicsize*/
    0,                         /*tp_itemsize*/
    ...
    Noddy_new,                 /* tp_new */
};

It is important to note that Noddy is a type, a compile-time entity, whereas NoddyType is an object that exists in memory at run time. The only obvious relation between Noddy and NoddyType seems to be the value of sizeof(Noddy) stored in the tp_basicsize member.

The hand-written inheritance implemented in Python specifies rules that allow casting between PyObject* and a pointer to the struct used to declare the instance layout of that particular PyObject:

PyObject* Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    // When a Python object is a Noddy instance,
    // its PyObject* pointer can be safely cast to Noddy*
    Noddy *self = reinterpret_cast<Noddy*>(type->tp_alloc(type, 0));
    if (self == NULL)
        return NULL; // allocation failed; tp_alloc is expected to have set the exception

    self->number = 0; // initialise Noddy members

    return reinterpret_cast<PyObject*>(self);
}

In contexts such as the various slot functions, it is safe to assume that "a Python object is a Noddy" and cast without any checks. However, sometimes a cast is needed in other situations, and then it feels like a blind conversion:

void foo(PyObject* obj)
{
    // How to perform safety checks?
    Noddy* noddy = reinterpret_cast<Noddy*>(obj);
    ...
}

It is possible to check sizeof(Noddy) == Py_TYPE(obj)->tp_basicsize, but that is an insufficient solution for two reasons:

1) If a user derives from Noddy

class BabyNoddy(Noddy):
    pass

and obj in foo points to an instance of BabyNoddy, then Py_TYPE(obj)->tp_basicsize is different. It is nevertheless still safe to use reinterpret_cast<Noddy*>(obj) to get a pointer to the instance layout part.

2) There can be another struct declaring an instance layout of the same size as Noddy:

struct NeverSeenNoddy {
    PyObject_HEAD
    short word1;
    short word2;
};

In fact, at the C language level, the NeverSeenNoddy struct is compatible with the NoddyType type object: it fits into NoddyType. So the cast could be perfectly fine.
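Both failure modes of the size comparison can be reproduced without CPython. The following is a minimal sketch that uses stand-in types only: `FakeType`, `FakeHead`, `size_matches` and `demo_size_check` are hypothetical names, not part of the Python C API, and the only assumption is that a type object records `sizeof` of its instance struct the way `tp_basicsize` does. The extra member in `BabyNoddy` stands in for whatever additional storage a derived type carries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for the CPython structures; hypothetical, for illustration only.
struct FakeType {
    const char* name;
    std::size_t basicsize;   // plays the role of tp_basicsize
    const FakeType* base;    // plays the role of tp_base
};

struct FakeHead {            // plays the role of PyObject_HEAD
    std::size_t refcnt;
    const FakeType* type;
};

struct Noddy          { FakeHead head; std::int32_t number; };
struct BabyNoddy      { Noddy base; std::int32_t extra; };        // derived layout
struct NeverSeenNoddy { FakeHead head; std::int16_t word1, word2; };

const FakeType NoddyType          = {"Noddy", sizeof(Noddy), nullptr};
const FakeType BabyNoddyType      = {"BabyNoddy", sizeof(BabyNoddy), &NoddyType};
const FakeType NeverSeenNoddyType = {"NeverSeenNoddy", sizeof(NeverSeenNoddy), nullptr};

// The size-only test from the question.
bool size_matches(const FakeHead* obj) {
    return obj->type->basicsize == sizeof(Noddy);
}

// Exercises the three cases; each assert documents the observed outcome.
void demo_size_check() {
    Noddy n{{1, &NoddyType}, 0};
    BabyNoddy b{{{1, &BabyNoddyType}, 0}, 0};
    NeverSeenNoddy s{{1, &NeverSeenNoddyType}, 0, 0};
    assert(size_matches(&n.head));        // exact type: accepted
    assert(!size_matches(&b.base.head));  // derived type: rejected although the cast is safe
    assert(size_matches(&s.head));        // unrelated type: accepted (sizes coincide on common ABIs)
}
```

The size test therefore errs in both directions, which suggests that the check has to be based on the identity of the type object rather than on a number stored in it.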

So, my big question is this:

Is there any Python policy which could be used to determine if a PyObject is compatible with the Noddy instance layout?

Is there any way to check whether a PyObject* points to the object part embedded in a Noddy?

If there is no such policy, is any hack possible?

EDIT: There are a few questions that seem similar, but in my opinion they are different from the one I have asked. For example: Accessing the underlying struct of a PyObject

EDIT2: To understand why I marked Sven Marnach's response as the answer, see the comments below that answer.

Comments (2)

不甘平庸 2024-12-27 17:37:47

In Python, you can check whether obj is of type Noddy or a derived type with the test isinstance(obj, Noddy). The C-API test for whether some PyObject *obj is of type NoddyType or a derived type is basically the same; you use PyObject_IsInstance():

PyObject_IsInstance(obj, reinterpret_cast<PyObject*>(&NoddyType))

As for your second question, there is no way to achieve this, and if you think you need this, your design has severe shortcomings. It would be better to derive NeverSeenNoddyType from NoddyType in the first place -- then the above check will also recognize an object of the derived type as an instance of NoddyType.
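What such a check amounts to can be sketched as a walk along the chain of base types, followed by the cast only when the walk succeeds. This is a simplified model, not CPython code: `TypeModel`, `HeadModel`, `is_instance_of` and `as_noddy` are hypothetical names, and the model assumes single inheritance, whereas CPython's real check is more involved. In an actual extension the corresponding calls would be `PyObject_IsInstance()` as above or `PyObject_TypeCheck(obj, &NoddyType)`; note that `PyObject_IsInstance()` can be customised through `__instancecheck__` and returns -1 on error, so the stricter `PyObject_TypeCheck()` may be the better fit when the result guards a cast.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for PyTypeObject and PyObject_HEAD; hypothetical, for illustration only.
struct TypeModel {
    const char* name;
    const TypeModel* base;   // plays the role of tp_base
};

struct HeadModel {
    std::size_t refcnt;
    const TypeModel* type;   // plays the role of ob_type
};

struct Noddy { HeadModel head; int number; };

const TypeModel NoddyType     = {"noddy.Noddy", nullptr};
const TypeModel BabyNoddyType = {"BabyNoddy", &NoddyType};
const TypeModel OtherType     = {"Other", nullptr};

// True if the object's type is `wanted` or derives from it.
bool is_instance_of(const HeadModel* obj, const TypeModel* wanted) {
    for (const TypeModel* t = obj->type; t != nullptr; t = t->base) {
        if (t == wanted) return true;
    }
    return false;
}

// Checked replacement for the blind cast in foo(): returns nullptr on mismatch.
Noddy* as_noddy(HeadModel* obj) {
    if (obj == nullptr || !is_instance_of(obj, &NoddyType)) return nullptr;
    return reinterpret_cast<Noddy*>(obj);   // head is the first member of Noddy
}

// Usage: an instance of a derived type passes, an unrelated object does not.
void demo_checked_cast() {
    Noddy baby{{1, &BabyNoddyType}, 9};
    HeadModel other{1, &OtherType};
    assert(as_noddy(&baby.head) == &baby);
    assert(as_noddy(&other) == nullptr);
}
```

Returning a null pointer keeps the decision about error handling with the caller; a real extension would more likely set a `TypeError` at that point.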

下雨或天晴 2024-12-27 17:37:47

Because every object starts with PyObject_HEAD, it is always safe to access the fields defined by this header. One of the fields is ob_type (usually accessed using the Py_TYPE macro). If this points to NoddyType or any other type derived from NoddyType (which is what PyObject_IsInstance tells you), then you can assume the object's layout is that of struct Noddy.

In other words, an object is compatible with Noddy instance layout if its Py_TYPE points to NoddyType or any of its subclasses.

In the second question, the cast wouldn't be fine. The layouts of Noddy and NeverSeenNoddy are different, even though the size might be the same.

Assuming that NeverSeenNoddy is the layout of a NeverSeenNoddy_Type type, you should never cast to NeverSeenNoddy if PyObject_IsInstance(obj, &NeverSeenNoddy_Type) is false.

If you want to have two C-level types with common fields, you should derive both types from a common base that has only the common fields in its instance layout.

The subtypes should then include the base layout at the top of their layouts:

struct SubNoddy {
    // No PyObject_HEAD because it's already in Noddy
    Noddy noddy;
    int extra_field;
};

Then, if PyObject_IsInstance(obj, &SubNoddy_Type) returns true, you can cast to SubNoddy and access the extra_field field.
If PyObject_IsInstance(obj, &NoddyType) returns true, you can cast to Noddy and access the common fields.
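Why both casts are valid for a SubNoddy object can be checked at compile time: the base layout sits at offset 0 of the derived layout, and the header sits at offset 0 of the base layout. The sketch below uses a stand-in header (`HeadModel`, `read_number` and `read_extra` are hypothetical names, not Python C API) and assumes the caller has already performed the type check described above.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Stand-in for PyObject_HEAD; hypothetical, for illustration only.
struct HeadModel { std::size_t refcnt; const void* type; };

struct Noddy    { HeadModel head; int number; };
struct SubNoddy { Noddy noddy; int extra_field; };   // base layout comes first

// The guarantees the casts rely on, verified by the compiler.
static_assert(std::is_standard_layout<SubNoddy>::value, "layout must be standard");
static_assert(offsetof(Noddy, head) == 0, "header must start the base layout");
static_assert(offsetof(SubNoddy, noddy) == 0, "base layout must start the sub layout");

// Valid for any object whose type is Noddy or derived from it.
int read_number(HeadModel* obj) {
    return reinterpret_cast<Noddy*>(obj)->number;
}

// Valid only for objects whose type is SubNoddy or derived from it.
int read_extra(HeadModel* obj) {
    return reinterpret_cast<SubNoddy*>(obj)->extra_field;
}

// Usage: one SubNoddy object viewed through both layouts.
void demo_nested_layout() {
    SubNoddy s{{{1, nullptr}, 5}, 11};
    HeadModel* obj = &s.noddy.head;
    assert(read_number(obj) == 5);
    assert(read_extra(obj) == 11);
}
```

Declaring the base struct as the first member, rather than repeating its fields, keeps the two layouts in step if the base ever gains a member.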
