How to allow copy elision construction for C++ classes (not just POD C structs)
Consider the following code:
    #include <iostream>
    #include <type_traits>
    #include <utility>   // std::forward

    struct A
    {
        A() {}
        A(const A&) { std::cout << "Copy" << std::endl; }
        A(A&&) { std::cout << "Move" << std::endl; }
    };

    // Aggregate wrapper: no user-defined constructors, public member.
    template <class T>
    struct B
    {
        T x;
    };

    #define MAKE_B(x) B<decltype(x)>{ x }

    template <class T>
    B<T> make_b(T&& x)
    {
        return B<T> { std::forward<T>(x) };
    }

    int main()
    {
        std::cout << "Macro make b" << std::endl;
        auto b1 = MAKE_B( A() );
        std::cout << "Non-macro make b" << std::endl;
        auto b2 = make_b( A() );
    }
This outputs the following:
Macro make b
Non-macro make b
Move
Note that b1 is constructed without a move, but the construction of b2 requires a move.
I also need type deduction, as A in real-life usage may be a complex type that is difficult to write explicitly. I also need to be able to nest calls (i.e. make_c(make_b(A()))).
Is such a function possible?
Further thoughts:
N3290 Final C++0x draft page 284:
    This elision of copy/move operations, called copy elision, is permitted in the following circumstances:
    when a temporary class object that has not been bound to a reference (12.2) would be copied/moved to a class object with the same cv-unqualified type, the copy/move operation can be omitted by constructing the temporary object directly into the target of the omitted copy/move
Unfortunately, this seems to mean that we can't elide copies (and moves) of function parameters to function results (including constructors), as those temporaries are either bound to a reference (when passed by reference) or no longer temporaries (when passed by value). It seems the only way to elide all copies when creating a composite object is to create it as an aggregate. However, aggregates have certain restrictions, such as requiring that all members be public and that there be no user-defined constructors.
I don't think it makes sense for C++ to allow optimizations for POD C-structs aggregate construction but not allow the same optimizations for non-POD C++ class construction.
Is there any way to allow copy/move elision for non-aggregate construction?
My answer:
This construct allows copies to be elided for non-POD types. I got this idea from David Rodríguez's answer below. It requires C++11 lambdas. In the example below I've changed make_b to take two arguments to make things less trivial. There are no calls to any move or copy constructors.
    #include <iostream>
    #include <type_traits>

    struct A
    {
        A() {}
        A(const A&) { std::cout << "Copy" << std::endl; }
        A(A&&) { std::cout << "Move" << std::endl; }
    };

    template <class T>
    class B
    {
    public:
        // Each argument is a callable; the members are initialised
        // directly from the values the callables return.
        template <class LAMBDA1, class LAMBDA2>
        B(const LAMBDA1& f1, const LAMBDA2& f2) : x1(f1()), x2(f2())
        {
            std::cout
            << "I'm a non-trivial, therefore not a POD.\n"
            << "I also have private data members, so definitely not a POD!\n";
        }
    private:
        T x1;
        T x2;
    };

    // Wrap an expression in a lambda so it is evaluated only when called.
    #define DELAY(x) [&]{ return x; }
    #define MAKE_B(x1, x2) make_b(DELAY(x1), DELAY(x2))

    template <class LAMBDA1, class LAMBDA2>
    auto make_b(const LAMBDA1& f1, const LAMBDA2& f2) -> B<decltype(f1())>
    {
        return B<decltype(f1())>( f1, f2 );
    }

    int main()
    {
        auto b1 = MAKE_B( A(), A() );
    }
If anyone knows how to achieve this more neatly I'd be quite interested to see it.
Previous discussion:
This somewhat follows on from the answers to the following questions:
Can creation of composite objects from temporaries be optimised away?
Avoiding need for #define with expression templates
Eliminating unnecessary copies when building composite objects
Comments (4)
You cannot optimize out the copy/move of the A object from the parameter of make_b to the member of the created B object.
However, this is the whole point of move semantics: by providing a lightweight move operation for A you can avoid a potentially expensive copy. For example, if A were actually std::vector<int>, then copying the vector's contents can be avoided by using the move constructor, and only the housekeeping pointers will be transferred.
This isn't a big problem. All it needs is changing the structure of the code slightly.
Instead of:
You can always do:
If the constructor of B is like this, then it'll not take any copies:
The composite case will work too:
And it doesn't even need C++0x features to do this.
As Anthony has already mentioned, the standard forbids copy elision from the argument of a function to the return of the same function. The rationale that drives that decision is that copy elision (and move elision) is an optimization by which two objects in the program are merged into the same memory location, that is, the copy is elided by having both objects be one. The (partial) standard quote is below, followed by a set of circumstances under which copy elision is allowed, which do not include that particular case.
So what makes that particular case different? The difference is basically that there is a function call between the original and the copied objects, and the function call implies that there are extra constraints to consider, in particular the calling convention.
Given a function T foo( T ), and a user calling T x = foo( T(param) );, in the general case, with separate compilation, the compiler will create an object $tmp1 in the location where the calling convention requires the first argument to be. It will then call the function and initialize x from the return statement. Here is the first opportunity for copy elision: by carefully placing x at the location where the returned temporary is, x and the object returned from foo become a single object, and that copy is elided. So far so good. The problem is that the calling convention in general will not have the returned object and the parameter in the same location, and because of that, $tmp1 and x cannot be a single location in memory.
Without seeing the function definition, the compiler cannot possibly know that the only purpose of the argument to the function is to serve as the returned object, and as such it cannot elide that extra copy. It can be argued that if the function is inline then the compiler would have the missing information to understand that the temporary used to call the function, the returned value and x are a single object. The problem is that that particular copy can only be elided if the code is actually inlined (not merely marked as inline, but actually inlined). If a function call is required, then the copy cannot be elided. If the standard allowed that copy to be elided when the code is inlined, it would imply that the behavior of a program would differ due to the compiler and not user code. The inline keyword does not force inlining; it only means that multiple definitions of the same function do not represent a violation of the ODR.
Note that if the variable was created inside the function (as opposed to passed into it), as in:
    T foo() { T tmp; ...; return tmp; }
    T x = foo();
then both copies can be elided: there is no restriction on where tmp has to be created (it is not an input or output parameter of the function, so the compiler is able to relocate it anywhere, including the location of the returned object), and on the calling side x can, as in the previous example, be carefully placed at the location of that same returned object, which basically means that tmp, the returned object and x can be a single object.
As for your particular problem, if you resort to a macro, the code is inlined, there are no restrictions on the objects and the copy can be elided. But if you add a function, you cannot elide the copy from the argument to the return statement. So just avoid it. Instead of using a template that will move the object, create a template that will construct an object:
And that copy can be elided by the compiler.
Note that I have not dealt with move construction, as you seem concerned about the cost of even a move construction, even though I believe that you are barking up the wrong tree. Given a motivating real use case, I am quite sure that people here will come up with a couple of efficient ideas.
12.8/31
No, it doesn't. The compiler is allowed to elide the move; whether that happens is implementation-specific, depending on several factors. It is also allowed to move, but it cannot copy (moving must be used instead of copying in this situation).
It is true that you are not guaranteed that the move will be elided. If you must be guaranteed that no move will occur, then either use the macro or investigate your implementation's options to control this behavior, particularly function inlining.