How can I force the compiler-generated copy constructor of a class not to be inlined by the compiler?

Posted on 2024-09-25 23:23:18

Alternate question title would be:
How to explicitly have the compiler generate code for the compiler-generated constructors in a specific translation unit?

The problem we face is that for one code path the resulting -- thoroughly measured -- performance is better (by about 5%) if the copy-ctor calls of one object are not inlined, that is if this constructor is implemented manually. (We noticed this because during code-cleanup the superfluous explicitly implemented copy ctor of this class (17 members) was removed.)

Edit: Note that we have checked the generated assembly code and have made sure that the inlining and code generation is happening as I describe for the two different code versions.

We face now the choice of just dropping the manual copy-ctor code back in (it does exactly the same as the compiler generated one) or finding any other means of not inlining the copy ctor of this class.

Is there any means (for Microsoft Visual C++) to explicitly instantiate the compiler generated class functions in a specific translation unit or will they always be inlined in each translation unit where they are used? (Comments for gcc or other compilers are also welcome to get a better picture of the situation.)

Since the first two answers show some misunderstanding: the compiler-generated class functions are only generated by the compiler itself if they are neither declared nor defined by the user. Therefore no modifiers whatsoever can be applied to them, since these functions do not exist in the source code.

struct A {
  std::string member;
};

A has a default ctor and a copy ctor, a dtor and a copy assignment operator. None of these functions can be modified via some __declspec because they do not exist in the code.

struct B {
  std::string member;
  B(B const& rhs);
};

B now has a user supplied copy ctor and the user has to implement it. The compiler will not generate code for it.
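As a rough sketch of what "the user has to implement it" means for B: the declaration stays in the header and the member-wise definition goes into a single source file, so other translation units only ever see a call. The file split shown in the comments, the extra `std::string` constructor and the `copy_and_read` helper are assumptions added for illustration; they are not part of the original code. Whole-program optimization (for example link-time code generation) may still inline across translation units.

```cpp
#include <cassert>
#include <string>

// --- would live in B.hpp: every user of B sees only the declaration ---
struct B {
    std::string member;
    explicit B(std::string const& m) : member(m) {}  // added for this sketch only
    B(B const& rhs);                                  // declared here, defined in B.cpp
};

// --- would live in B.cpp: the single place where the copy code is emitted ---
B::B(B const& rhs) : member(rhs.member) {}

// Hypothetical helper: makes a copy through the user-provided constructor.
std::string copy_and_read(B const& original) {
    B copy(original);
    return copy.member;
}
```

The cost of this approach is the one the question already names: the hand-written member list has to be kept in sync with the class whenever a member is added.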

Some more background for the doubters :-) ...

This code is compiled using MS Visual C++, but it is linked for an embedded(-like) (realtime) system. Performance was measured by taking timings on this system and I therefore think the guys who took the timings will have some decent numbers.

The test was performed by comparing two code versions where the only difference was the inline vs. the not-inline copy ctor of this one class. Timings with the inlined code were worse by about 5%.

Further checking has revealed that I was mistaken on one point: the compiler will generate separate functions for complex copy constructors. It does this at its own discretion, and it also depends on the optimization settings. So in our specific circumstances the compiler is making the wrong choice. From the answers so far it does not appear that we can tell the compiler otherwise. :-(

后eg是否自 2024-10-02 23:23:18

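$12.1/5: "An implicitly-declared default constructor is an inline public member of its class."

So there is nothing we can do. The implicit constructor has to be inline. Any other behavior in this regard would probably be an extension.

That said, consider your manual copy constructor: if, for example, one of the 17 members of your class is a pointer member, the manual copy constructor may be handling a deep copy (and ...).

So unless you have carefully reviewed your manual copy constructor, don't even think about removing it and relying on the implicit copy constructor, which may be buggy in your context.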
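To make the deep-copy concern concrete, here is a minimal sketch of how an implicit copy constructor and a hand-written one can differ when a member is a raw pointer. The types `ShallowHolder` and `DeepHolder` and the two helper functions are hypothetical and only illustrate the general point; they say nothing about the asker's actual class.

```cpp
#include <cassert>

// Implicit copy constructor: copies the pointer value, not the pointee.
struct ShallowHolder {
    int* value;
};

// Hand-written copy constructor: allocates its own copy of the pointee.
struct DeepHolder {
    int* value;
    explicit DeepHolder(int v) : value(new int(v)) {}
    DeepHolder(DeepHolder const& other) : value(new int(*other.value)) {}
    DeepHolder& operator=(DeepHolder const&) = delete;  // not needed in this sketch
    ~DeepHolder() { delete value; }
};

// After an implicit copy, both objects point at the same int.
bool shallow_copy_shares_pointee() {
    int n = 1;
    ShallowHolder a = { &n };
    ShallowHolder b(a);
    return a.value == b.value;
}

// After the hand-written copy, each object owns a distinct int with the same value.
bool deep_copy_owns_pointee() {
    DeepHolder a(1);
    DeepHolder b(a);
    return a.value != b.value && *a.value == *b.value;
}
```

If the removed constructor behaved like `DeepHolder`, the two code versions would differ in behavior and not just in inlining.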

寻找一个思念的角度 2024-10-02 23:23:18

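I very much doubt inlining has anything to do with it. If the compiler inlines the compiler-generated copy ctor, why wouldn't it also inline the explicitly defined one? (It is also unusual for the compiler's optimization heuristics to fail so badly that inlined code ends up 5% slower.)

Before jumping to conclusions:

Check the generated assembly to verify that the two versions actually do exactly the same thing (in the same order, using the same instructions and so on; otherwise that could be the source of your performance difference).
Check that the compiler-generated one is actually being inlined and the manually defined one is not.

If so, could you update your question with this information?

There is no way in C++ to indicate whether a compiler-generated function should be inlined. Not even vendor-specific extensions such as __declspec(noinline) will help you here, since you are explicitly handing over all responsibility for the function to the compiler. So the compiler chooses what to do with it, how to implement it and whether or not to inline it. You can't say "please implement this function for me" and at the same time "please let me control how the function is implemented". If you want control over the function, you have to implement it. ;)

In C++0x it may be possible (depending on how these vendor-specific extensions interact with functions declared as = default).

But again, I don't believe inlining is the issue. Most likely, the two functions simply result in different assembly code being generated.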
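One way the `= default` idea could look under C++11 rules is to declare the copy constructor in the class and default it on its out-of-line definition. A function defaulted this way is user-provided and not implicitly inline, yet the compiler still writes the member-wise body. The `Widget` type, the file split in the comments and the `duplicate` helper are hypothetical; whether a vendor attribute such as `__declspec(noinline)` can be combined with it is not verified here.

```cpp
#include <cassert>
#include <string>

// --- would live in Widget.hpp ---
struct Widget {
    std::string name;
    int id;
    Widget(std::string const& n, int i) : name(n), id(i) {}
    Widget(Widget const& other);  // declared, but not defaulted on this first declaration
};

// --- would live in Widget.cpp ---
// Defaulted at the definition: the compiler generates the member-wise copy,
// but the definition belongs to this translation unit only.
Widget::Widget(Widget const& other) = default;

// Hypothetical helper that exercises the copy constructor.
Widget duplicate(Widget const& w) {
    return Widget(w);
}
```

Unlike a hand-written member list, this form does not need to be updated when members are added to the class.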

维持三分热 2024-10-02 23:23:18

__declspec(noinline).

The documentation says it only applies to member functions, but in practice it works on free functions as well.
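A minimal sketch of applying such an attribute to a user-declared copy constructor and to a free function follows. `NOINLINE` is a hypothetical macro name introduced here so that the same source also builds with GCC-compatible compilers, which spell the attribute `__attribute__((noinline))`; `Point` and `sum` are illustrative names as well.

```cpp
#include <cassert>

// NOINLINE is a hypothetical macro, not something either compiler predefines.
#if defined(_MSC_VER)
#  define NOINLINE __declspec(noinline)
#elif defined(__GNUC__)
#  define NOINLINE __attribute__((noinline))
#else
#  define NOINLINE
#endif

struct Point {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
    // User-declared, so there is a declaration for the attribute to attach to.
    NOINLINE Point(Point const& other);
};

Point::Point(Point const& other) : x(other.x), y(other.y) {}

// The same attribute on a free function; taking p by value may invoke the copy ctor.
NOINLINE int sum(Point p) {
    return p.x + p.y;
}
```

As the question points out, this only works once the constructor is declared by the user; there is nothing to annotate on an implicitly generated one.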

深爱不及久伴 2024-10-02 23:23:18

It's often best to isolate it to a few core types which you know are problematic. Example a:

class t_std_string {
    std::string d_string;
public:
    /* ... */

    /* defined explicitly, and out of line -- you know what to do here */
    t_std_string();
    t_std_string(const std::string& other);
    t_std_string(const t_std_string& other);
    ~t_std_string();

    inline std::string& get() { return this->d_string; }
    inline const std::string& get() const { return this->d_string; }
    /* ... */
};

struct B {
    t_std_string member;
    /* 16 more */
    /* ... */
};

Or you can take some of it for free. Example b:

/* B.hpp */

struct B {
private:

    /* class types */
    struct t_data {
        std::string member;

        /* 16 more ... */
    public:
        /* declare + implement the ctor B needs */

        /* since it is otherwise inaccessible, it will only hurt build times to make default ctor/dtor implicit (or by implementing them in the header, of course), so define these explicitly in the cpp file */
        t_data();
        ~t_data();

        /* allow implicit copy ctor and assign -- this could hurt your build times, however. it depends on the complexity/visibility of the implementation of the data and the number of TUs in which this interface is visible. since only one object needs this... it's wasteful in large systems */
    };
private:

    /* class data */
    t_data d_data;
public:
    /* you'll often want the next 4 out of line
       -- it depends on how this is created/copied/destroyed in the wild
     */
    B();
    B(const B& other);
    ~B();
    B& operator=(const B&);
};

/* B.cpp */

/* assuming these have been implemented properly for t_data */
B::B() : d_data() {
}

B::B(const B& other) : d_data(other.d_data) {
}

B::~B() {
}

B& B::operator=(const B& other) {
    /* assuming the default behaviour is correct...*/
    this->d_data = other.d_data;
    return *this;
}
/* continue to B::t_data definitions */

时光磨忆 2024-10-02 23:23:18

You could use some sort of nested object. In this way the nested object's copy constructor can be left as the maintenance-free default, but you still have an explicitly created copy constructor that you can declare noinline.

class some_object_wrapper {
    original_object obj;
public:
    // copy the wrapped object itself, not the wrapper
    __declspec(noinline) some_object_wrapper(const some_object_wrapper& ref)
        : obj(ref.obj) {}
    // Other function accesses and such here
};

If you're desperate, you could compile the class in question separately in a .lib and link to it. Changing it to a different translation unit will not stop VC++ from inlining it. Also, I have to question as to whether they're actually doing the same thing. Why did you implement a manual copy constructor if it does the same as the default copy constructor?
