Undefined Behavior of C++0x Closures: II

Posted on 2024-10-30 21:15:18

I find the use of C++0x closures perplexing. My initial report, and the subsequent one, have generated more confusion than explanation. Below I show the troublesome examples, and I hope to find out why there is undefined behavior in the code. All of the code snippets compile with gcc 4.6.0 without any warning.

Program No. 1: It Works

#include <iostream>
int main(){
    auto accumulator = [](int x) {
        return [=](int y) -> int { 
            return x+y;
        }; 
    };
    auto ac=accumulator(1);
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

The output meets the expectations:

2 2 2

2 2 2

2 2 2

Program No. 2: Closure, Works Fine

#include <iostream>
int main(){
    auto accumulator = [](int x) {
        return [&](int y) -> int { 
            return x+=y;
        }; 
    };
    auto ac=accumulator(1);
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

The output is:

4 3 2

7 6 5

10 9 8

Program No. 3: Program No. 1 with std::function, Works Fine

#include <iostream>
#include <functional>     // std::function

int main(){

    typedef std::function<int(int)> fint2int_type;
    typedef std::function<fint2int_type(int)> parent_lambda_type;

    parent_lambda_type accumulator = [](int x) -> fint2int_type{
        return [=](int y) -> int { 
            return x+y;
        }; 
    };

    fint2int_type ac=accumulator(1);

    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}   

The output is:

2 2 2

2 2 2

2 2 2

Program No. 4: Program No. 2 with std::function, Undefined Behavior

#include <iostream>
#include <functional>     // std::function

int main(){

    typedef std::function<int(int)> fint2int_type;
    typedef std::function<fint2int_type(int)> parent_lambda_type;

    parent_lambda_type accumulator = [](int x) -> fint2int_type{
        return [&](int y) -> int { 
            return x+=y;
        }; 
    };

    fint2int_type ac=accumulator(1);

    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

The first run of the program gives:

4 3 2

4 3 2

12364812 12364811 12364810

The second run of the same program:

4 3 2

4 3 2

1666060 1666059 1666058

The third one:

4 3 2

4 3 2

2182156 2182155 2182154

How does my use of std::function break the code? Why do Programs No. 1 - 3 work well, and why is Program No. 4 correct for the first three calls of ac(1)(!)? Why does Program No. 4 get stuck on the next three calls, as if the variable x had been captured by value, not by reference? And the last three calls of ac(1) are totally unpredictable, as if any reference to x had been lost.

Comments (3)

两仪 2024-11-06 21:15:18

"I hope to find out why there is an undefined behavior in the code"

Every time I deal with complex and intricate lambdas, I find it easier to first translate them into function-object form. Lambdas are just syntactic sugar for function objects, and each lambda maps one-to-one to a corresponding function object. This article explains really well how to do the translation:
http://blogs.msdn.com/b/vcblog/archive/2008/10/28/lambdas-auto-and-static-assert-c-0x-features-in-vc10-part-1.aspx

So, for example, your Program No. 2:

#include <iostream>
int main(){
    auto accumulator = [](int x) {
        return [&](int y) -> int { 
            return x+=y;
        }; 
    };
    auto ac=accumulator(1);
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

would be translated by the compiler into approximately this:

#include <iostream>

struct InnerAccumulator
{
    int& x;
    InnerAccumulator(int& x):x(x)
    {
    }
    int operator()(int y) const
    {
        return x+=y;
    }
};

struct Accumulator
{
    InnerAccumulator operator()(int x) const
    {
        return InnerAccumulator(x); // constructor
    }
};


int main()
{
    Accumulator accumulator;
    InnerAccumulator ac = accumulator(1);
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

And now the problem becomes quite obvious:

InnerAccumulator operator()(int x) const
{
   return InnerAccumulator(x); // constructor
}

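Here the constructor of InnerAccumulator takes a reference to x, a local variable (the by-value parameter of operator()) that dies as soon as you exit the operator() scope. The returned object therefore holds a dangling reference, and every later call of ac(1) reads and writes storage that no longer belongs to x. So yes, you just get plain good old undefined behavior, as you suspected.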
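
For comparison, here is a minimal sketch of what a by-value variant of the same function objects could look like. The names OwningInnerAccumulator and OwningAccumulator are hypothetical, chosen only for this illustration; the point is that the inner object stores its own copy of x instead of a reference to a dead parameter.

```cpp
// Hypothetical by-value counterpart of InnerAccumulator (illustration only):
// it owns its own copy of x, so nothing dangles once
// OwningAccumulator::operator() has returned.
struct OwningInnerAccumulator
{
    int x; // stored by value, not by reference

    explicit OwningInnerAccumulator(int x) : x(x)
    {
    }

    // Non-const: each call modifies the stored copy.
    int operator()(int y)
    {
        return x += y;
    }
};

struct OwningAccumulator
{
    OwningInnerAccumulator operator()(int x) const
    {
        return OwningInnerAccumulator(x); // copies x into the returned object
    }
};
```

With this version, OwningAccumulator()(1) yields an object whose successive calls with argument 1 return 2, 3 and 4, no matter what else happens on the stack in between.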

猫瑾少女 2024-11-06 21:15:18

Let's try something entirely innocent-looking:

#include <iostream>
int main(){
    auto accumulator = [](int x) {
        return [&](int y) -> int { 
            return x+=y;
        }; 
    };
    auto ac=accumulator(1);

    //// Surely this should be a no-op? 
    accumulator(666);
    //// There are no side effects and we throw the result away!

    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl; 
}

Tada:

669 668 667 
672 671 670 
675 674 673 

Of course, this is not guaranteed behaviour either. Indeed, with optimizations enabled, gcc will eliminate the accumulator(666) call figuring it's dead code, and we again get the original results. And it is entirely within its rights to do so; in a conforming program, removing the call would indeed not affect the semantics. But in the realm of undefined behaviour, anything may happen.


EDIT

auto ac=accumulator(1);

std::cout << pow(2,2) << std::endl; // requires #include <cmath>

std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl; 

Without optimizations enabled, I get the following:

4
1074790403 1074790402 1074790401 
1074790406 1074790405 1074790404 
1074790409 1074790408 1074790407 

With optimizations enabled,

4
4 3 2 
7 6 5 
10 9 8

Again, C++ does not and cannot provide true lexical closures where the lifetime of local variables would get extended beyond their original scope. That would entail bringing garbage collection and heap-based locals to the language.

This is all rather academic, though, as capturing x by copy makes the program well-defined and work as expected:

auto accumulator = [](int x) {
    return [x](int y) mutable -> int { 
        return x += y;
    }; 
};
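
Applied to the std::function version from Program No. 4, the same by-copy capture might look like the sketch below. The helper name make_accumulator is an assumption made for this example and does not appear in the question; the typedef fint2int_type is taken from it.

```cpp
#include <functional>

typedef std::function<int(int)> fint2int_type;

// Hypothetical helper (not part of the original post): returns an
// accumulator whose running total is stored inside the closure itself.
inline fint2int_type make_accumulator(int x)
{
    // [x] copies x into the closure object; `mutable` allows operator()
    // to modify that copy. No reference to the parameter escapes.
    return [x](int y) mutable -> int {
        return x += y;
    };
}
```

A caller would write fint2int_type ac = make_accumulator(1); and then call ac(1) in separate statements, getting 2, 3 and 4. Calling it several times inside a single chained << expression, as the question does, leaves the order of the calls unspecified before C++17, which is presumably why the question's output reads "4 3 2" rather than "2 3 4".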

你列表最软的妹 2024-11-06 21:15:18

Well, references become dangling when the referent goes away. You have a very fragile design if object A has a reference to some part of object B, unless object A in some way can guarantee the lifetime of object B (for instance, when A holds a shared_ptr to B anyway, or both are in the same scope).

References in lambdas are no magical exception. If you plan to return a lambda that evaluates x+=y through a reference, you'd better make sure that x lives long enough. Here x is the parameter int x, initialized as part of the call accumulator(1). The lifetime of a function parameter ends when the function returns.
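
As a rough sketch of the shared_ptr idea mentioned above: if the closure itself owns the state through a std::shared_ptr, the state lives as long as any copy of the closure does. The helper name make_shared_accumulator is hypothetical and used only for this illustration.

```cpp
#include <functional>
#include <memory>

// Hypothetical helper (illustration only): the running total lives on the
// heap and is kept alive by the shared_ptr captured in the closure.
inline std::function<int(int)> make_shared_accumulator(int x)
{
    auto total = std::make_shared<int>(x);

    // The shared_ptr is captured by copy. The lambda does not need
    // `mutable`, because it modifies the pointed-to int, not the pointer.
    return [total](int y) -> int {
        return *total += y;
    };
}
```

Unlike the by-copy capture of x, copies of this std::function share one total, which is closer to what capturing by reference was meant to achieve, at the cost of a heap allocation.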
