POD and inheritance in C++11: is the address of a struct == the address of its first member?

Posted 2024-12-27 01:42:20

(I've edited this question to avoid distractions. There is one core question which would need to be cleared up before any other question would make sense. Apologies to anybody whose answer now seems less relevant.)

Let's set up a specific example:

struct Base {
    int i;
};

There are no virtual methods and no inheritance; it is a very dumb and simple object. Hence it's Plain Old Data (POD) and it has a predictable layout. In particular:

Base b;
&b == reinterpret_cast<Base*>(&b.i);

This is according to Wikipedia (which itself claims to reference the C++03 standard):

A pointer to a POD-struct object, suitably converted using a reinterpret cast, points to its initial member and vice versa, implying that there is no padding at the beginning of a POD-struct.[8]

Now let's consider inheritance:

struct Derived : public Base {
};

Again, there are no virtual methods, no virtual inheritance, and no multiple inheritance. Therefore this is POD also.

Question: Does this fact (Derived is POD in C++11) allow us to say that:

Derived d;
&d == reinterpret_cast<Derived*>(&d.i); // true on g++-4.6

If this is true, then the following would be well-defined:

Base *b = reinterpret_cast<Base*>(malloc(sizeof(Derived)));
free(b); // It will be freeing the same address, so this is OK

I'm not asking about new and delete here - it's easier to consider malloc and free. I'm just curious about the regulations about the layout of derived objects in simple cases like this, and where the initial non-static member of the base class is in a predictable location.

Is a Derived object supposed to be equivalent to:

struct Derived { // no inheritance
    Base b; // it just contains it instead
};

with no padding beforehand?
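The two forms in the question can be compared directly. The following is a minimal sketch, not a statement of what every ABI does: the name `Contains` and the two helper functions are hypothetical, introduced only so that the inheritance form and the containment form can coexist in one file.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Base {
    int i;
};

// The inheritance form from the question.
struct Derived : public Base {
};

// The containment form from the question (renamed for this sketch).
struct Contains {
    Base b;
};

// Both forms satisfy the C++11 standard-layout rules.
static_assert(std::is_standard_layout<Derived>::value, "Derived is standard-layout");
static_assert(std::is_standard_layout<Contains>::value, "Contains is standard-layout");
static_assert(offsetof(Contains, b) == 0, "no padding before the first member");

// True when a Derived object and its inherited int share one address.
bool derived_starts_at_i() {
    Derived d;
    return static_cast<void*>(&d) == static_cast<void*>(&d.i);
}

// True when a Contains object and its nested int share one address.
bool contains_starts_at_i() {
    Contains c;
    return static_cast<void*>(&c) == static_cast<void*>(&c.b.i);
}
```

Both helpers compare addresses through `void*` so that no `reinterpret_cast` is needed for the check itself.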

陌上芳菲 2025-01-03 01:42:20

You don't care about POD-ness, you care about standard-layout. Here's the definition, from the standard section 9 [class]:

A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.
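Several of these rules can be checked at compile time with `std::is_standard_layout` from `<type_traits>`. The struct names below are hypothetical and exist only to exercise one rule each; this is a sketch of how one might probe the definition, not an exhaustive test of it.

```cpp
#include <type_traits>

struct Plain { int i; };

// All data lives in a single base class: still standard-layout.
struct EmptyDerived : Plain {};

// Non-trivial constructor and destructor: not POD, but still standard-layout.
struct WithCtor {
    WithCtor() : i(0) {}
    ~WithCtor() {}
    int i;
};

// Data members with different access control: not standard-layout.
struct MixedAccess {
    int a;
private:
    int b;
};

// A virtual function: not standard-layout.
struct HasVirtual {
    virtual void f() {}
    int i;
};

// Data members in both the base and the derived class: not standard-layout.
struct SplitData : Plain { int j; };

static_assert(std::is_standard_layout<Plain>::value, "plain struct");
static_assert(std::is_standard_layout<EmptyDerived>::value, "data in one base only");
static_assert(std::is_standard_layout<WithCtor>::value, "ctor/dtor do not matter");
static_assert(!std::is_pod<WithCtor>::value, "but WithCtor is not POD");
static_assert(!std::is_standard_layout<MixedAccess>::value, "mixed access control");
static_assert(!std::is_standard_layout<HasVirtual>::value, "virtual function");
static_assert(!std::is_standard_layout<SplitData>::value, "data in base and derived");
```

If any of these assumptions were wrong for a given compiler, the translation unit would fail to compile rather than misbehave at run time.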

And the property you want is then guaranteed (section 9.2 [class.mem]):

A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.

This is actually better than the old requirement, because the ability to reinterpret_cast isn't lost by adding non-trivial constructors and/or destructor.

Now let's move to your second question. The answer is not what you were hoping for.

Base *b = new Derived;
delete b;

is undefined behavior unless Base has a virtual destructor. See section 5.3.5 ([expr.delete]):

In the first alternative (delete object), if the static type of the object to be deleted is different from its dynamic type, the static type shall be a base class of the dynamic type of the object to be deleted and the static type shall have a virtual destructor or the behavior is undefined.
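For contrast, here is a minimal sketch of the well-defined variant, in which the base class does have a virtual destructor. The names `VBase`, `VDerived`, and the counter `derived_destroyed` are hypothetical and used only for illustration. Note the trade-off: a class with a virtual function is no longer standard-layout, so the `reinterpret_cast` guarantee quoted earlier does not apply to it.

```cpp
#include <cassert>

// Hypothetical counter so the effect of the delete is observable.
static int derived_destroyed = 0;

struct VBase {
    int i;
    virtual ~VBase() {}  // makes deletion through VBase* well-defined
};

struct VDerived : VBase {
    ~VDerived() override { ++derived_destroyed; }
};

// Deletes a VDerived through a VBase* and reports how many
// VDerived destructors ran as a result.
int delete_through_base() {
    derived_destroyed = 0;
    VBase* b = new VDerived;
    delete b;  // static type VBase, dynamic type VDerived: fine, destructor is virtual
    return derived_destroyed;
}
```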

Your earlier snippet using malloc and free is mostly correct. This will work:

Base *b = new (malloc(sizeof(Derived))) Derived;
free(b);

because the value of pointer b is the same as the address returned from placement new, which is in turn the same address returned from malloc.
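A slightly more defensive version of the same idea is sketched below. The helper name `malloc_roundtrip` is hypothetical; the sketch checks the `malloc` result, calls the (trivial) destructor explicitly, and frees the pointer that `malloc` originally returned rather than relying on the `Base*` holding the same value.

```cpp
#include <cstdlib>
#include <new>

struct Base {
    int i;
};

struct Derived : public Base {
};

// Allocates raw storage, constructs a Derived in it, and reports whether
// the Base* view of the object has the same address as the raw storage.
bool malloc_roundtrip() {
    void* raw = std::malloc(sizeof(Derived));
    if (raw == nullptr) {
        return false;  // allocation failed
    }

    Derived* d = new (raw) Derived;  // placement new: no allocation happens here
    Base* b = d;                     // implicit derived-to-base conversion
    b->i = 42;

    bool same_address = static_cast<void*>(b) == raw;

    d->~Derived();   // trivial here, but mirrors the general pattern
    std::free(raw);  // free exactly what malloc returned
    return same_address;
}
```

Keeping the original `void*` around costs nothing and removes any dependence on layout when the storage is released.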

何其悲哀 2025-01-03 01:42:20

Presumably your last bit of code is intended to say:

Base *b = new Derived;
delete b;  // delete b, not d.

In that case, the short answer is that it remains undefined behavior. The fact that the class or struct in question is POD, standard layout or trivially copyable doesn't really change anything.

Yes, you're passing the right address, and yes, you and I know that in this case the dtor is pretty much a nop -- nonetheless, the pointer you're passing to delete has a different static type than dynamic type, and the static type does not have a virtual dtor. The standard is quite clear that this gives undefined behavior.

From a practical viewpoint, you can probably get away with the UB if you really insist -- chances are pretty good that there won't be any harmful side effects from what you're doing, at least with most typical compilers. Beware, however, that even at best the code is extremely fragile so seemingly trivial changes could break everything -- and even switching to a compiler with really heavy type checking and such could do so as well.

As far as your argument goes, the situation's pretty simple: it basically means the committee probably could make this defined behavior if they wanted to. As far as I know, however, it's never been proposed, and even if it had it would probably be a very low priority item -- it doesn't really add much, enable new styles of programming, etc.

半衾梦 2025-01-03 01:42:20

This is meant as a supplement to Ben Voigt's answer, not a replacement.

You might think that this is all just a technicality. That the standard calling it 'undefined' is just a bit of semantic twaddle that has no real-world effects beyond allowing compiler writers to do silly things for no good reason. But this is not the case.

I could see desirable implementations in which:

Base *b = new Derived;
delete b;

resulted in behavior that was quite bizarre. This is because storing the size of your allocated chunk of memory when it is known statically by the compiler is kind of silly. For example:

struct Base {
};

struct Derived : Base {
   int an_int;
};

In this case, when delete is called on a Base *, the compiler has every reason (because of the rule you quoted at the beginning of your question) to believe that the size of the data pointed at is 1, not 4. If it, for example, implements a version of operator new that has one array in which 1-byte entities are all densely packed, and a different array in which 4-byte entities are all densely packed, it will end up assuming the Base * points somewhere in the 1-byte entity array when in fact it points somewhere in the 4-byte entity array, and it will make all kinds of interesting errors for this reason.

I really wish operator delete had been defined to also take a size, and the compiler passed in either the statically known size if operator delete was called on an object with a non-virtual destructor, or the known size of the actual object being pointed at if it were being called as a result of a virtual destructor. Though this would likely have other ill effects and maybe isn't such a good idea (like if there are cases in which operator delete is called without a destructor having been called). But it would make the problem painfully obvious.
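Part of this wish is already expressible: a class-specific `operator delete` may declare a second `std::size_t` parameter, and C++14 later added sized global deallocation functions. The sketch below uses hypothetical names (`SizedBase`, `SizedDerived`, `last_freed_size`) and a virtual destructor, so the delete is well-defined and the size passed in is that of the dynamic type. It does not make the non-virtual case from the question defined.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical global used to observe the size handed to operator delete.
static std::size_t last_freed_size = 0;

struct SizedBase {
    virtual ~SizedBase() {}

    // Class-specific deallocation function that also receives the size.
    static void operator delete(void* p, std::size_t size) {
        last_freed_size = size;
        ::operator delete(p);  // storage came from the global operator new
    }
};

struct SizedDerived : SizedBase {
    int payload[4];
};

// Deletes a SizedDerived through a SizedBase* and returns the size
// that the deallocation function was given.
std::size_t freed_size_via_base() {
    SizedBase* b = new SizedDerived;
    delete b;  // virtual destructor: deallocation is chosen for the dynamic type
    return last_freed_size;
}
```

With a non-virtual destructor the same `delete b` would be undefined behavior, which is exactly the situation the size-segregated allocator above would get wrong.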

蓝海似她心 2025-01-03 01:42:20

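There is a lot of discussion above about irrelevant issues. Yes, mainly for C compatibility, there are a number of guarantees you can rely on as long as you know what you are doing. All of this is, however, irrelevant to your main question. The main question is: is there any situation where an object can be deleted using a pointer type which doesn't match the dynamic type of the object and where the pointed-to type doesn't have a virtual destructor? The answer is: no, there is not.

The logic for this can be derived from what the run-time system is supposed to do: it gets a pointer to an object and is asked to delete it. For this to be defined, it would need to store information on how to call the derived class destructor, or on the amount of memory the object actually occupies. That would imply a possibly substantial cost in memory. For example, if the first member requires very strict alignment, e.g. an 8-byte boundary as is the case for double, adding a size would add an overhead of at least 8 bytes to the allocated memory. Even though this might not sound too bad, it may mean that only one object instead of two or four fits into a cache line, reducing performance substantially.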
