What optimization does move semantics provide if we already have RVO?

Posted on 2024-10-18 01:43:12

As far as I understand, one of the purposes of adding move semantics is to optimize code by calling a special constructor when copying "temporary" objects. For example, in this answer we see that it can be used to optimize something like string a = x + y. Because x + y is an rvalue expression, instead of making a deep copy we can copy only the pointer to the string and the size of the string. But as we know, modern compilers support return value optimization (RVO), so even without move semantics our code will not call the copy constructor at all.

To prove it I wrote this code:

#include <iostream>

struct stuff
{
        int x;
        stuff(int x_):x(x_){}
        stuff(const stuff & g):x(g.x)
        {
                std::cout<<"copy"<<std::endl;
        }
};   
stuff operator+(const stuff& lhs,const stuff& rhs)
{
        stuff g(lhs.x+rhs.x);
        return g;
}
int main()
{
        stuff a(5),b(7);
        stuff c = a+b;
}

After building it with VC++ 2010 and with g++, both with optimizations enabled, and running it, I get empty output.

So what kind of optimization is move semantics, if my code is already this fast without it? Could you explain what I'm misunderstanding?

阳光下慵懒的猫 2024-10-25 01:43:12

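Move semantics should not be thought of as an optimization device, even if they can be used as one.

If you want copies of objects (either function parameters or return values), then RVO and copy elision will do the job whenever they can. Move semantics can help, but they are more powerful than that.

Move semantics are handy when you want to do something different depending on whether the passed object is a temporary (it then binds to an rvalue reference) or a "standard" object with a name (an lvalue). For instance, if you want to steal the resources of a temporary object, you want move semantics (example: you can steal the contents a std::unique_ptr points to).

Move semantics allow you to return non-copyable objects from functions, which was not possible in C++03. Non-copyable objects can also be put inside other objects, which are then automatically movable if the contained objects are.

Non-copyable objects are great, since they don't force you to implement an error-prone copy constructor. A lot of the time copy semantics do not really make sense, but move semantics do (think about it).

This also lets you use a movable std::vector<T> even if T is non-copyable. The std::unique_ptr class template is also a great tool when dealing with non-copyable objects (e.g. polymorphic objects).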
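As a rough illustration of the last few points, here is a minimal sketch assuming C++11. The Shape, Triangle, Square and Drawing types and the helper functions are made up for this example; only std::unique_ptr and std::vector come from the standard library.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical polymorphic hierarchy, used only for illustration.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};
struct Triangle : Shape { int sides() const override { return 3; } };
struct Square   : Shape { int sides() const override { return 4; } };

// std::unique_ptr cannot be copied, yet it can be returned by value:
// the return is a move (or is elided altogether).
std::unique_ptr<Shape> make_shape(int sides) {
    if (sides == 3) return std::unique_ptr<Shape>(new Triangle);
    return std::unique_ptr<Shape>(new Square);
}

// A class holding a non-copyable member cannot be copied either,
// but the compiler still generates a move constructor for it.
struct Drawing {
    std::vector<std::unique_ptr<Shape>> shapes;
};

Drawing make_drawing() {
    Drawing d;
    d.shapes.push_back(make_shape(3));  // the temporary unique_ptr is moved in
    d.shapes.push_back(make_shape(4));
    return d;                           // moved or elided, never copied
}

int total_sides(const Drawing& d) {
    int n = 0;
    for (const auto& s : d.shapes) {
        assert(s != nullptr);
        n += s->sides();
    }
    return n;
}
```

Calling make_drawing() and then writing Drawing e = std::move(d); transfers ownership of every shape without copying any of them, whereas Drawing e = d; would not compile.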

野生奥特曼 2024-10-25 01:43:12

After some digging I found this excellent example of optimization with rvalue references in Stroustrup's FAQ.

Yes, the swap function:

template<class T>
void swap(T& a, T& b)   // "perfect swap" (almost)
{
    T tmp = move(a);    // could invalidate a
    a = move(b);        // could invalidate b
    b = move(tmp);      // could invalidate tmp
}

This generates optimized code for any type, assuming it has a move constructor and a move assignment operator.
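To make the difference measurable, here is a small sketch, assuming C++11, that counts copies and moves for both styles of swap. Counted, swap_by_copy and swap_by_move are hypothetical names introduced only for this illustration; they are not from the FAQ.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how often it is copied and moved.
struct Counted {
    static int copies;
    static int moves;
    int value;

    explicit Counted(int v) : value(v) {}
    Counted(const Counted& o) : value(o.value) { ++copies; }
    Counted(Counted&& o) noexcept : value(o.value) { ++moves; }
    Counted& operator=(const Counted& o) { value = o.value; ++copies; return *this; }
    Counted& operator=(Counted&& o) noexcept { value = o.value; ++moves; return *this; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// C++03-style swap: one copy construction and two copy assignments.
template <class T>
void swap_by_copy(T& a, T& b) {
    T tmp = a;
    a = b;
    b = tmp;
}

// Move-based swap, same shape as the snippet above: three moves, no copies.
template <class T>
void swap_by_move(T& a, T& b) {
    T tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
}
```

For a type that owns heap memory, each of those three copies would be a deep copy, while each move would only transfer a pointer. RVO cannot help here, because nothing is being returned.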

Edit: RVO also can't optimize something like this (at least on my compiler):

stuff func(const stuff& st)
{
    if(st.x>0)
    {
        stuff ret(2*st.x);
        return ret;
    }
    else
    {
        stuff ret2(-2*st.x);
        return ret2;
    }
}

This function always calls the copy constructor (checked with VC++). If our class can be moved faster than it can be copied, then adding a move constructor gives us an optimization here.
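A sketch of that last point, assuming C++11: when a local variable is returned by name and the copy is not elided, the compiler first tries to treat it as an rvalue, so a move constructor is picked over the copy constructor. Tracked is a hypothetical counting type that stands in for stuff.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for "stuff" that counts copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    int x;

    explicit Tracked(int x_) : x(x_) {}
    Tracked(const Tracked& o) : x(o.x) { ++copies; }
    Tracked(Tracked&& o) noexcept : x(o.x) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Two different named locals are returned, which commonly defeats NRVO.
Tracked func(const Tracked& st) {
    if (st.x > 0) {
        Tracked ret(2 * st.x);
        return ret;    // treated as an rvalue: moved if not elided
    } else {
        Tracked ret2(-2 * st.x);
        return ret2;   // same here
    }
}
```

Whether the move itself is elided depends on the compiler and its settings, so the move count may be 0 or 1, but the copy count stays at 0 either way.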

新雨望断虹 2024-10-25 01:43:12

Imagine your stuff was a class with heap-allocated memory, like a string, and that it had a notion of capacity. Give it an operator+= that grows the capacity geometrically. In C++03 this might look like:

#include <iostream>
#include <algorithm>

struct stuff
{
    int size;
    int cap;

    stuff(int size_):size(size_)
    {
        cap = size;
        if (cap > 0)
            std::cout <<"allocating " << cap <<std::endl;
    }
    stuff(const stuff & g):size(g.size), cap(g.cap)
    {
        if (cap > 0)
            std::cout <<"allocating " << cap <<std::endl;
    }
    ~stuff()
    {
        if (cap > 0)
            std::cout << "deallocating " << cap << '\n';
    }

    stuff& operator+=(const stuff& y)
    {
        if (cap < size+y.size)
        {
            if (cap > 0)
                std::cout << "deallocating " << cap << '\n';
            cap = std::max(2*cap, size+y.size);
            std::cout <<"allocating " << cap <<std::endl;
        }
        size += y.size;
        return *this;
    }
};

stuff operator+(const stuff& lhs,const stuff& rhs)
{
    stuff g(lhs.size + rhs.size);
    return g;
}

Also imagine you want to add more than two stuff objects at a time:

int main()
{
    stuff a(11),b(9),c(7),d(5);
    std::cout << "start addition\n\n";
    stuff e = a+b+c+d;
    std::cout << "\nend addition\n";
}

For me this prints out:

allocating 11
allocating 9
allocating 7
allocating 5
start addition

allocating 20
allocating 27
allocating 32
deallocating 27
deallocating 20

end addition
deallocating 32
deallocating 5
deallocating 7
deallocating 9
deallocating 11

I count 3 allocations and 2 deallocations to compute:

stuff e = a+b+c+d;

Now add move semantics:

    stuff(stuff&& g):size(g.size), cap(g.cap)
    {
        g.cap = 0;
        g.size = 0;
    }

...

stuff operator+(stuff&& lhs,const stuff& rhs)
{
        return std::move(lhs += rhs);
}

Running again I get:

allocating 11
allocating 9
allocating 7
allocating 5
start addition

allocating 20
deallocating 20
allocating 40

end addition
deallocating 40
deallocating 5
deallocating 7
deallocating 9
deallocating 11

I'm now down to 2 allocations and 1 deallocation. That translates to faster code.

凌乱心跳 2024-10-25 01:43:12

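There are many places, some of which are mentioned in other answers.

One big one is that when a std::vector is resized, it moves move-aware objects from the old memory location to the new one, rather than copying them and destroying the originals.

Additionally, rvalue references allow the concept of movable types. This is a semantic difference, not just an optimization. unique_ptr wasn't possible in C++03, which is why we had the abomination of auto_ptr.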
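The vector case can be sketched as follows, assuming C++11. Item and fill are hypothetical names for this illustration. One detail worth stating: for a copyable type, std::vector only moves elements during reallocation when the move constructor is declared noexcept, otherwise it falls back to copying to keep its exception guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts copies and moves.
struct Item {
    static int copies;
    static int moves;
    int id;

    explicit Item(int i) : id(i) {}
    Item(const Item& o) : id(o.id) { ++copies; }
    // noexcept matters: without it, reallocation would copy instead of move.
    Item(Item&& o) noexcept : id(o.id) { ++moves; }
};
int Item::copies = 0;
int Item::moves = 0;

// Fills a vector without reserving, which forces several reallocations.
std::vector<Item> fill(int n) {
    std::vector<Item> v;
    for (int i = 0; i < n; ++i)
        v.emplace_back(i);  // built in place; growth relocates existing elements
    return v;               // the vector itself is moved or elided
}
```

After fill(100) the copy counter is still 0 and the move counter is positive; how many moves occur depends on the library's growth policy.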

百变从容 2024-10-25 01:43:12

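Just because this particular case is already covered by an existing optimization does not mean there are no other cases where rvalue references are helpful.

Move construction allows optimization even when the temporary is returned from a function that cannot be inlined (perhaps it is a virtual call, or a call through a function pointer).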

懒的傷心 2024-10-25 01:43:12

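Your posted example only takes const lvalue references, so move semantics explicitly cannot be applied to it: there is not a single rvalue reference in there. How could move semantics make your code faster when you implemented a type without rvalue references?

In addition, your code is already covered by RVO and NRVO. Move semantics apply to far, far more situations than those two do.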
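To show what an rvalue reference overload changes, here is a minimal sketch assuming C++11. Stuff and describe are hypothetical names for this illustration. Overload resolution picks the const lvalue reference version for named objects and the rvalue reference version for temporaries and for the result of std::move.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the "stuff" type from the question.
struct Stuff {
    int x;
    explicit Stuff(int x_) : x(x_) {}
};

// Chosen for lvalues (named objects): the argument must be left intact.
std::string describe(const Stuff&) { return "lvalue"; }

// Chosen for rvalues (temporaries, std::move results): resources may be stolen.
std::string describe(Stuff&&) { return "rvalue"; }
```

With only the first overload present, as in the question's code, every argument takes the "lvalue" path, so there is nowhere for a move to happen.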

温柔少女心 2024-10-25 01:43:12

This line calls the first constructor.

stuff a(5),b(7);

The plus operator is called with ordinary lvalue references.

stuff c = a + b;

Inside the operator overload, no copy constructor is called.
Again, only the first constructor is called.

stuff g(lhs.x+rhs.x);

The initialization is done with RVO, so no copy is needed. No copy from the returned object to 'c' is needed either.

stuff c = a+b;

Since c is never passed to std::cout, the compiler notices that its value is never used. The whole program is then stripped out, resulting in an empty program.

落墨 2024-10-25 01:43:12

Here is another good example I can think of. Imagine that you're implementing a matrix library and you write an algorithm that takes two matrices and outputs another one:

Matrix MyAlgorithm(Matrix U, Matrix V)
{
    Transform(U); //doesn't matter what this actually does, but it modifies U
    Transform(V);
    return U*V;
}

Note that you can't pass U and V by const reference, because the algorithm modifies them. You could in theory pass them by non-const reference, but that would look gross and would leave U and V in some intermediate state (since you call Transform(U)), which may not make any sense to the caller, or may make no mathematical sense at all, since it is just one of the algorithm's internal transformations. The code looks much cleaner if you pass them by value and use move semantics whenever you are not going to use U and V after calling the function:

Matrix u, v;
...
Matrix w = MyAlgorithm(u, v); //slow, but will preserve u and v
Matrix w = MyAlgorithm(move(u), move(v)); //fast, but will nullify u and v
Matrix w = MyAlgorithm(u, move(v)); //and you can even do this if you need one but not the other
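The cost difference between those calls can be sketched like this, assuming C++11. This Matrix is a hypothetical stand-in that only counts copies and moves, Transform is a placeholder, and the element-wise product replaces a real matrix product purely for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a real matrix type; it counts copies and moves.
struct Matrix {
    static int copies;
    static int moves;
    std::vector<double> data;

    explicit Matrix(std::size_t n) : data(n, 1.0) {}
    Matrix(const Matrix& o) : data(o.data) { ++copies; }                    // deep copy
    Matrix(Matrix&& o) noexcept : data(std::move(o.data)) { ++moves; }      // steals the buffer
};
int Matrix::copies = 0;
int Matrix::moves = 0;

// Placeholder for the in-place transformation mentioned in the answer.
void Transform(Matrix& m) {
    for (double& d : m.data) d *= 2.0;
}

// Arguments are taken by value, so the caller chooses between copy and move.
Matrix MyAlgorithm(Matrix U, Matrix V) {
    Transform(U);
    Transform(V);
    Matrix result(U.data.size());
    for (std::size_t i = 0; i < result.data.size(); ++i)
        result.data[i] = U.data[i] * V.data[i];  // element-wise, for brevity
    return result;
}
```

Calling MyAlgorithm(u, v) costs two deep copies and leaves u and v untouched. Calling MyAlgorithm(std::move(u), std::move(v)) costs no copies at all, at the price of leaving u and v emptied.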
