Is there any penalty/cost to virtual inheritance in C++ when calling a non-virtual base method?

Posted 2024-10-30 19:58:10

Does using virtual inheritance in C++ have a runtime penalty in compiled code, when we call a regular function member from its base class? Sample code:

class A {
    public:
        void foo(void) {}
};
class B : virtual public A {};
class C : virtual public A {};
class D : public B, public C {};

// ...

D bar;
bar.foo ();

浅语花开 2024-11-06 19:58:10

There may be, yes, if you call the member function via a pointer or reference and the compiler can't determine with absolute certainty what type of object that pointer or reference points or refers to. For example, consider:

void f(B* p) { p->foo(); }

void g()
{
    D bar;
    f(&bar);
}

Assuming the call to f is not inlined, the compiler needs to generate code to find the location of the A virtual base class subobject in order to call foo. Usually this lookup involves checking the vptr/vtable.

If the compiler knows the type of the object on which you are calling the function, though (as is the case in your example), there should be no overhead because the function call can be dispatched statically (at compile time). In your example, the dynamic type of bar is known to be D (it can't be anything else), so the offset of the virtual base class subobject A can be computed at compile time.
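A minimal sketch of the distinction above, assuming a mainstream ABI. The data members a, b, c, d and the helper functions are hypothetical additions for illustration and are not part of the question. The sketch measures how far the A subobject sits from a B subobject, once for a standalone B and once for the B inside a D. Since the two distances typically differ, a function like f(B*) cannot hard-code the offset.

```cpp
#include <cstddef>

struct A { int a = 0; void foo() {} };
struct B : virtual A { int b = 0; };
struct C : virtual A { int c = 0; };
struct D : B, C { int d = 0; };

// Distance in bytes from a B subobject to its virtual base A.
// The B* -> A* conversion has to work for any complete object, so the
// compiler generally emits a run-time lookup for it.
std::ptrdiff_t offset_of_A_from(B* p) {
    A* base = p;  // implicit conversion to the virtual base
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(p);
}

// B is the complete object here.
std::ptrdiff_t offset_in_standalone_B() {
    B b;
    return offset_of_A_from(&b);
}

// B is a base subobject of D here; the exact layout is up to the implementation.
std::ptrdiff_t offset_in_D() {
    D d;
    return offset_of_A_from(&d);
}
```

Comparing offset_in_standalone_B() with offset_in_D() on a given compiler shows whether the two layouts place A differently; the concrete values are implementation-specific.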

失而复得 2024-11-06 19:58:10

Yes, virtual inheritance has a run-time performance overhead. This is because, given an arbitrary pointer or reference to an object, the compiler cannot always locate its virtual base sub-object at compile time. In contrast, with single inheritance, each sub-object is located at a fixed offset within the complete object. Consider:

class A { ... };
class B : public A { ... };

The memory layout of B looks a little like this:

| B's stuff | A's stuff |

In this case, the compiler knows where A is. However, now consider the case of multiple virtual inheritance (MVI).

class A { ... };
class B : public virtual A { ... };
class C : public virtual A { ... };
class D : public C, public B { ... };

B's memory layout:

| B's stuff | A's stuff |

C's memory layout:

| C's stuff | A's stuff |

But wait! When D is instantiated, it doesn't look like that.

| D's stuff | B's stuff | C's stuff | A's stuff |

Now, if you have a B* that really points to a B, then A is right next to the B. But if it points to a D, then to obtain an A* you need to skip over the C sub-object. Since any given B* could point to a B or a D at run time, the pointer has to be adjusted dynamically. At a minimum, this means the compiler must emit code to find that offset by some means, as opposed to having the value baked in at compile time, which is what happens with single inheritance.
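As a rough illustration of the layout argument, the sketch below (with hypothetical int members and helper names that are not in the answer) checks two observable consequences: the B path and the C path both arrive at one shared A subobject, and the B and C subobjects themselves start at different addresses inside a D.

```cpp
struct A { int a = 0; };
struct B : virtual A { int b = 0; };
struct C : virtual A { int c = 0; };
struct D : C, B { int d = 0; };

// True when converting through B and through C reaches the same A subobject.
bool shares_single_A() {
    D obj;
    B* as_b = &obj;
    C* as_c = &obj;
    A* via_b = as_b;  // may involve a run-time offset lookup
    A* via_c = as_c;  // may involve a different run-time offset
    return via_b == via_c;
}

// True when the B and C subobjects live at different addresses inside D.
bool b_and_c_are_distinct() {
    D obj;
    B* as_b = &obj;
    C* as_c = &obj;
    return static_cast<void*>(as_b) != static_cast<void*>(as_c);
}
```

Because the B and C subobjects start at different places while sharing one A, at least one of the two conversions to A* needs an offset that depends on the complete object.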

执笏见 2024-11-06 19:58:10

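At least in a typical implementation, virtual inheritance carries a (small!) penalty for access to (at least some) data members. In particular, you normally end up with an extra level of indirection to access the data members of the object from which you have derived virtually. This comes about because (at least in the normal case) two or more separate derived classes have not just the same base class, but the same base class object. To accomplish this, both of the derived classes have pointers to the same offset into the most derived object, and access those data members via that pointer.

Although it is technically not due to virtual inheritance, it is probably worth noting that there is a separate (again, small) penalty for multiple inheritance in general. In a typical implementation of single inheritance, you have a vtable pointer at some fixed offset in the object (quite often the very beginning). In the case of multiple inheritance, you obviously can't have two vtable pointers at the same offset, so you end up with a number of vtable pointers, each at a separate offset in the object.

IOW, with single inheritance the vtable pointer is normally found at object_address itself, but with multiple inheritance it is found at object_address + offset.

Technically, the two are entirely separate, but of course nearly the only use for virtual inheritance is in conjunction with multiple inheritance, so it is semi-relevant anyway.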
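The multiple-inheritance point can be observed with a small sketch. The classes Left, Right and Both and the helper function are hypothetical names chosen for illustration; no virtual inheritance is involved here.

```cpp
struct Left  { virtual ~Left() = default;  int l = 0; };
struct Right { virtual ~Right() = default; int r = 0; };
struct Both : Left, Right { int x = 0; };

// True when the Left and Right base subobjects start at different addresses.
// In a typical implementation each of them carries its own vtable pointer,
// so converting Both* to one of the base pointers adds a fixed displacement.
bool bases_at_distinct_addresses() {
    Both obj;
    Left* l = &obj;
    Right* r = &obj;
    return static_cast<void*>(l) != static_cast<void*>(r);
}
```

The displacement in this case is a compile-time constant, which is the difference from the virtual-base case, where it may have to be looked up at run time.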

梦与时光遇 2024-11-06 19:58:10

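Concretely, in Microsoft Visual C++ there is an actual difference in pointer-to-member sizes.
See #pragma pointers_to_members. As you can see in that listing, the most general method is "virtual inheritance", which is distinct from multiple inheritance, which in turn is distinct from single inheritance.

This implies that more information is needed to resolve a pointer-to-member in the presence of virtual inheritance, and that it will have a performance impact, if only through the amount of data taken up in the CPU cache, though likely also in the length of the member lookup or the number of jumps needed.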
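One way to check this on a particular compiler is to compare the sizes of pointers to member functions for the three kinds of class. The sketch below uses hypothetical class names; the actual numbers depend on the compiler and target, and on some ABIs all three sizes are the same.

```cpp
#include <cstddef>

struct Single { void f() {} };
struct Other  { void g() {} };
struct Multi : Single, Other { void h() {} };
struct Virt  : virtual Single { void k() {} };

// Size of a pointer to member function for each inheritance kind.
constexpr std::size_t size_single = sizeof(void (Single::*)());
constexpr std::size_t size_multi  = sizeof(void (Multi::*)());
constexpr std::size_t size_virt   = sizeof(void (Virt::*)());
```

Printing the three constants with the compiler and flags of interest shows whether that implementation distinguishes the cases.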

李白 2024-11-06 19:58:10

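I think there is no runtime penalty for virtual inheritance as such. Don't confuse virtual inheritance with virtual functions. They are two different things.

Virtual inheritance ensures that there is only one A sub-object in instances of D. So I don't think it incurs a runtime penalty on its own.

However, there can be cases where this sub-object cannot be located at compile time, and in such cases there would be a runtime penalty for virtual inheritance. James has described one such case in his answer.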

枫以 2024-11-06 19:58:10

Your question is focused mostly on calling regular functions of the virtual base, not the (far) more interesting case of virtual functions of the virtual base class (class A in your example)-- but yes, there can be a cost. Of course everything is compiler dependent.

When the compiler compiled A::foo, it assumed that "this" points to the start of where the data members of A reside in memory. At that time, the compiler might not know that class A will be a virtual base of any other class. But it happily generates the code.

Now, when the compiler compiles B, there won't really be a change, because while A is a virtual base class, it is still single inheritance. In the typical case, the compiler will lay out class B by placing class A's data members immediately followed by class B's data members, so a B * can be cast to an A * without any change in value, and hence no adjustments need to be made. The compiler can call A::foo using the same "this" pointer (even though it is of type B *) and there is no harm.

The same holds for class C: it's still single inheritance, and the typical compiler will place A's data members immediately followed by C's data members, so a C * can be cast to an A * without any change in value. Thus, the compiler can simply call A::foo with the same "this" pointer (even though it is of type C *) and there is no harm.

However, the situation is totally different for class D. The layout of class D will typically be class A's data members, followed by class B's data members, followed by class C's data members, followed by class D's data members.

Using the typical layout, a D * is immediately convertible to an A *, so there is no penalty for A::foo: the compiler can call the same routine it generated for A::foo without any change to "this" and everything is fine.

However, the situation changes if the compiler needs to call a member function such as C::other_member_func, even if C::other_member_func is non-virtual. The reason is that when the compiler wrote the code for C::other_member_func, it assumed that the data layout referenced by the "this" pointer is A's data members immediately followed by C's data members. But that is not true for an instance of D. The compiler may need to rewrite and create a (non-virtual) D::other_member_func, just to take care of the class instance memory layout difference.

Note that this is a different but similar situation when using multiple inheritance, but in multiple inheritance without virtual bases, the compiler can take care of everything by simply adding a displacement or fixup to the "this" pointer to account for where a base class is "embedded" within an instance of a derived class. But with virtual bases, sometimes a function rewrite is needed. It all depends on what data members are accessed by the (even non-virtual) member function being called.

For example, if class C defined a non-virtual member function C::some_member_func, and the code for C::some_member_func happens to use member variables defined in both class A and class C, the compiler might need to write two routines:

1. C::some_member_func, for calls on an actual instance of C (and not D), as determined at compile time (because some_member_func isn't a virtual function).
2. C::some_member_func, for calls where the same member function is invoked on an actual instance of class D, as determined at compile time. (Technically this routine is D::some_member_func. Even though the definition of this member function is implicit and identical to the source code of C::some_member_func, the generated object code may be slightly different.)
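The observable behaviour of that scenario can be sketched as follows. The member values and helper functions are hypothetical. How a compiler makes both calls work is an implementation detail: under the Itanium C++ ABI used by GCC and Clang, a single copy of the function is typically emitted and it finds A through an offset fetched at run time, rather than through a second copy of the code.

```cpp
struct A { int a = 1; };
struct B : virtual A { int b = 2; };
struct C : virtual A {
    int c = 3;
    // Non-virtual member function that touches both A's and C's data.
    int some_member_func() const { return a + c; }
};
struct D : B, C { int d = 4; };

// C is the complete object.
int call_on_standalone_C() {
    C obj;
    return obj.some_member_func();
}

// C is a base subobject of D, so A may sit at a different distance from C.
int call_on_C_inside_D() {
    D obj;
    return obj.some_member_func();
}
```

Both helpers return the same sum even though the distance between the C part and the A part may differ between the two objects.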

七色彩虹 2024-11-06 19:58:10

Well, the many good answers above explain that looking up the exact position of the virtual base class in memory incurs a performance penalty, which raises a follow-up question: "Can this penalty be reduced?" Fortunately, there is a partial solution in the form of the (not yet mentioned) final keyword. In particular, calls from class D of the original example to the innermost base A can usually be made (almost) penalty-free, but in the general case only if you finalize D.

For why this is necessary, let's look at a multilevel class hierarchy:

class Base {};

class ExtA : public virtual Base {};
class ExtB : public virtual Base {};
class ExtC : public virtual Base {};

class App1 : public Base {};
class App2 : public ExtA {};
class App3 : public ExtB, public ExtC {};

class SuperApp : public App2, public App3 {};

Because our Application classes can use any of the Extension classes of our base class, none of those Extension classes can know at compile time where the Base subobject will be located within the object they are called with. Rather, they have to consult the virtual table at runtime to find out. This is because the various Ext and App classes can all be defined in different translation units.

But the same problem exists for the Application classes: because App2 and App3 inherit a virtualized Base via the Extension class(es), they don't know at compile time where that Base subobject is located within their own objects. So each method of App2 or App3 has to consult the virtual table to find the location of the Base subobject within its local object. This is because it is syntactically legal to combine those App classes further later on, as illustrated by the SuperApp class in the hierarchy above.

Also note that there is a further penalty if the Base class calls any virtual methods defined at the Extension or Application level. That's because the virtual method will be called with this pointing to a Base object, but it has to adjust this to the beginning of its own object by again consulting the virtual table. If an Extension or Application layer method (virtual or non-virtual) calls a virtual method defined on the Base class, the penalty is incurred twice: first for finding the Base subobject, and then again for finding the real object relative to the Base subobject.

However, if we know that a SuperApp combining several Apps won't be created, we can improve things a lot by declaring the App classes final:

class App1 final : public Base {};
class App2 final : public ExtA {};
class App3 final : public ExtB, public ExtC {};

// class SuperApp : public App2, public App3 {};   // illegal now!

Because final makes the layout immutable, methods of the Application classes no longer need to go through a virtual table to find the Base subobject. They just add the known constant offset to the this pointer when calling any Base method. Virtual callbacks at the Application layer can fix up the this pointer just as easily by subtracting a known constant offset (or not fix it up at all and reference the various fields from the middle of the object instead). Methods of the Base class don't incur any penalty themselves either, because inside that class everything works normally. So in this three-level scenario with finalized classes at the outermost level, only methods at the Extension level execute more slowly, and only if they need to refer to fields or methods of the Base class, or if they are called virtually from the Base.

The drawback of the final keyword is that it disallows all extensions. You can no longer derive an App2a from App2, even if it doesn't require any of those Extensions. And declaring a non-final App2Base and then deriving final App2a and App2b from it would again incur penalties for all the methods in App2Base that refer to the original Base. Unfortunately, the C++ Gods didn't give us a way to just unvirtualize a base class while leaving non-virtual extensions possible. They also didn't give us a way to declare a "master" Extension class whose layout stays fixed even if other Extensions with the same virtual Base class are added (in which case all the non-master Extensions would refer to the Base subobject within the master Extension).

The alternative to virtual inheritance like this is usually to add all the extension stuff to the Base class. Depending on the application, that might require a lot of extra and often unused fields, a lot of extra virtual method calls, and/or a lot of dynamic_casts, all of which come with a performance penalty too.

Also note that on modern CPUs, the penalty after a mispredicted virtual function call is much higher than the penalty after a mispredicted this pointer fixup. The former needs to throw away all results obtained on the wrong execution path and restart afresh on the right path. The latter still needs to repeat all opcodes depending directly or indirectly on this, but doesn't need to load and decode instructions again. BTW: speculative execution with unknown pointer fixups is one of the reasons why CPUs are vulnerable to Spectre/Meltdown-type data leaks.
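A minimal sketch of the final approach, assuming a C++14 or later compiler. The member value, get(), read() and read_through_final() are hypothetical additions to the hierarchy above. The static_asserts only confirm which classes are final; whether a compiler actually replaces the run-time lookup with a constant offset is an optimization it is allowed, not required, to perform.

```cpp
#include <type_traits>

struct Base {
    int value = 7;
    int get() const { return value; }
};

struct ExtA : virtual Base {};

struct App2 final : ExtA {
    // Inside a final class the complete-object layout is fixed, so the
    // compiler may reach Base through a constant offset here.
    int read() const { return get(); }
};

static_assert(std::is_final<App2>::value, "App2 cannot be derived from");
static_assert(!std::is_final<ExtA>::value, "ExtA remains extensible");

int read_through_final() {
    App2 app;
    return app.read();
}
```

Comparing the generated assembly for read() with and without final on App2 is one way to see whether a given compiler takes advantage of this.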

許願樹丅啲祈禱 2024-11-06 19:58:10

There has to be a cost to virtual-inheritance.

The proof is that virtually inherited classes occupy more than the sum of the parts.

Typical case:

struct A{double a;};

struct B1 : virtual A{double b1;};
struct B2 : virtual A{double b2;};

struct C : virtual B1, virtual B2{double c;}; // I think these virtuals are not strictly necessary

static_assert( sizeof(A) == sizeof(double) ); // as expected

static_assert( sizeof(B1) > sizeof(A) + sizeof(double) ); // the equality holds for non-virtual inheritance
static_assert( sizeof(B2) > sizeof(A) + sizeof(double) );  // the equality holds for non-virtual inheritance

static_assert( sizeof(C) > sizeof(A) + sizeof(double) + sizeof(double) + sizeof(double) );
static_assert( sizeof(C) > sizeof(A) + sizeof(double) + sizeof(double) + sizeof(double) + sizeof(double));

(https://godbolt.org/z/zTcfoY)

What is stored additionally? I don't exactly understand.
I think it is something like a virtual table but for accessing individual members.

原野 2024-11-06 19:58:10

There is a cost in additional memory. For example, GCC 7 on x86-64 gives the following results:

#include <iostream>

class A { int a; };
class B: public A { int b; };
class C: public A { int c; };
class D: public B, public C { int d; };
class BV: virtual public A { int b; };
class CV: virtual public A { int c; };
class DV: public BV, public CV { int d; };


int main()
{
    std::cout << sizeof(A) << std::endl;
    std::cout << sizeof(B) << std::endl;
    std::cout << sizeof(C) << std::endl;
    std::cout << sizeof(D) << std::endl;
    std::cout << sizeof(BV) << std::endl;
    std::cout << sizeof(CV) << std::endl;
    std::cout << sizeof(DV) << std::endl;
    return 0;
}

This prints out: