How can I force the compiler-generated copy constructor of a class to *not* be inlined by the compiler?
Alternate question title would be:
How to explicitly have the compiler generate code for the compiler-generated constructors in a specific translation unit?
The problem we face is that for one code path the resulting -- thoroughly measured -- performance is better (by about 5%) if the copy-ctor calls of one object are not inlined, that is if this constructor is implemented manually. (We noticed this because during code-cleanup the superfluous explicitly implemented copy ctor of this class (17 members) was removed.)
Edit: Note that we have checked the generated assembly code and have made sure that the inlining and code generation is happening as I describe for the two different code versions.
We face now the choice of just dropping the manual copy-ctor code back in (it does exactly the same as the compiler generated one) or finding any other means of not inlining the copy ctor of this class.
Is there any means (for Microsoft Visual C++) to explicitly instantiate the compiler generated class functions in a specific translation unit or will they always be inlined in each translation unit where they are used? (Comments for gcc or other compilers are also welcome to get a better picture of the situation.)
Since the first two answers show some misunderstanding: compiler-generated class functions are only generated by the compiler itself if they are neither declared nor defined by the user. Therefore no modifiers whatsoever can be applied to them, since these functions do not exist in the source code.
#include <string>
struct A {
    std::string member;
};
A has a default ctor, a copy ctor, a dtor and a copy assignment operator. None of these functions can be modified via a __declspec, because they do not exist in the source code.
struct B {
    std::string member;
    B(B const& rhs);
};
B now has a user-supplied copy ctor, and the user has to implement it. The compiler will not generate code for it.
Some more background for the doubters :-) ...
This code is compiled using MS Visual C++, but it is linked for an embedded(-like) (realtime) system. Performance was measured by taking timings on this system and I therefore think the guys who took the timings will have some decent numbers.
The test was performed by comparing two code versions where the only difference was the inline vs. the not-inline copy ctor of this one class. Timings with the inlined code were worse by about 5%.
Further checking has revealed that I was mistaken on one point: the compiler will generate separate functions for complex copy constructors. It does this at its own discretion, and the outcome also depends on the optimization settings. So in our specific circumstances the compiler is making the wrong choice. From the answers so far it does not appear that we can tell the compiler otherwise. :-(
So there is not much we can do. The implicit constructor has to be inline. Any other behavior in this regard would probably be an extension.
Having said that, it is likely that your manual copy constructor (which you removed during code cleanup) was doing the right thing. As an example, if one of the 17 members in your class is a pointer member, it is likely that the manual copy constructor took care of a deep copy (and hence took a performance hit).
So, unless you carefully review your manual copy constructor, don't even think of removing it and relying on the (potentially buggy, in your context) implicit copy constructor.
I highly doubt inlining has anything to do with it. If the compiler inlines the compiler-generated copy ctor, why wouldn't it also inline the explicitly defined one? (It is also unusual that the compiler's optimization heuristics fail so badly as to make inlined code 5% slower)
Before jumping to conclusions,
If that is the case, could you update your question with this information?
There is no way in C++ to indicate whether a compiler-generated function should or shouldn't be inlined. Not even vendor-specific extensions such as __declspec(noinline) will help you there, since you're explicitly handing over all responsibility for the function to the compiler. So the compiler chooses what to do with it, how to implement it and whether or not to inline it. You can't both say "please implement this function for me" and at the same time "please let me control how the function is implemented". If you want control over the function, you have to implement it. ;)
In C++0x, it may be possible (depending on how these vendor-specific extensions interact with functions declared as = default).
But again, I'm not convinced that inlining is the issue. Most likely, the two functions just result in different assembly code being generated.
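To make the "= default" remark concrete, here is a minimal sketch of what became possible once C++0x was published as C++11: the copy constructor is declared in the class but defaulted outside it. The compiler still writes the memberwise copy, yet the function is user-provided and has a single out-of-line definition rather than being implicitly inline. The names Plain and Sample are hypothetical, and whether a given optimizer (for example with link-time code generation enabled) still inlines the call is something you would have to check in the generated assembly.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the 17-member class from the question.
struct Plain {
    int a;
    double b;
    // No copy constructor declared: the implicit one is inline and trivial.
};

struct Sample {
    int a;
    double b;

    Sample() : a(0), b(0.0) {}
    Sample(const Sample&);   // declared only, no body inside the class
};

// In a real project this line would live in exactly one .cpp file.
// The compiler still generates the memberwise copy, but the definition
// is now out of line instead of implicitly inline.
Sample::Sample(const Sample&) = default;

// Defaulting after the first declaration makes the constructor
// user-provided, which also means it is no longer trivial.
static_assert(std::is_trivially_copy_constructible<Plain>::value,
              "implicit copy ctor of Plain is trivial");
static_assert(!std::is_trivially_copy_constructible<Sample>::value,
              "out-of-line defaulted copy ctor of Sample is user-provided");
```

Copying a Sample still copies every member, so no hand-written member list has to be maintained.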
__declspec(noinline).
The documentation says that it applies only to member functions, but in fact it works with free functions as well.
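As a rough illustration of the above, the sketch below wraps the attribute in a macro and applies it to a free function and to a user-written copy constructor. DEMO_NOINLINE, add_one and Counter are hypothetical names, and the GCC/Clang branch uses __attribute__((noinline)) as the assumed counterpart of the MSVC keyword. Note that this only works on functions that exist in the source code, which is exactly the limitation described in the question.

```cpp
// Hypothetical portability macro, not part of any library.
#if defined(_MSC_VER)
#  define DEMO_NOINLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#  define DEMO_NOINLINE __attribute__((noinline))
#else
#  define DEMO_NOINLINE   /* unknown compiler: no hint emitted */
#endif

// Free function marked as not to be inlined.
DEMO_NOINLINE int add_one(int x) { return x + 1; }

struct Counter {
    int value;

    explicit Counter(int v) : value(v) {}

    // User-written copy constructor, so the attribute has something
    // to attach to.
    DEMO_NOINLINE Counter(const Counter& rhs) : value(rhs.value) {}
};
```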
It's often best to isolate it to a few core types which you know are problematic. Example a:
Or you can take some of it for free. Example b:
You could use some sort of nested object. In this way the nested object's copy constructor can be left as the maintenance-free default, but you still have an explicitly created copy constructor that you can declare noinline.
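The nested-object idea might look roughly like the sketch below. RecordData, Record and DEMO_NOINLINE are hypothetical names, and three members stand in for the 17 from the question. The inner struct keeps its implicit copy constructor, while the outer class has a one-line explicit copy constructor that can carry the noinline attribute.

```cpp
#include <string>

// Hypothetical portability macro, not part of any library.
#if defined(_MSC_VER)
#  define DEMO_NOINLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#  define DEMO_NOINLINE __attribute__((noinline))
#else
#  define DEMO_NOINLINE
#endif

// All data members live here; copying stays compiler-generated,
// so adding a member needs no change to any constructor.
struct RecordData {
    std::string name;
    int id;
    double weight;
};

struct Record {
    RecordData data;

    Record() : data() {}   // value-initializes the members

    // The only hand-written copy code: delegate to the implicit
    // copy constructor of the nested object.
    DEMO_NOINLINE Record(const Record& rhs) : data(rhs.data) {}
};
```

The cost is that members are reached through data (for example r.data.id), and the optimizer may still inline the nested copy into Record's copy constructor, which is harmless here because that function itself stays out of line.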
If you're desperate, you could compile the class in question separately in a .lib and link to it. Changing it to a different translation unit will not stop VC++ from inlining it. Also, I have to question as to whether they're actually doing the same thing. Why did you implement a manual copy constructor if it does the same as the default copy constructor?
To add my own conclusion and to answer the exact question without going into details:
You cannot force the compiler, specifically VC++, to inline or not inline a compiler-generated ctor/dtor/etc. -- but the optimizer will choose, at its own discretion, whether it inlines the code for a compiler-generated function (ctor) or whether it generates a "real" function for this code. AFAIK there is no way to influence the optimizer's decision in this regard.