Do mainstream compilers convert pass-by-reference of basic types into pass-by-copy?

Posted 2024-12-13 03:06:54

Passing an object by reference is an easier, faster and safer way to pass an address to it.
But for most compilers, it's all the same: references are really pointers.

Now what about basic types like int? Passing an address to an int and using it inside a function would be slower than passing it by copy, because the pointer needs to be dereferenced before use.

How do modern compilers handle this?

int foo(const int & i)
{
   cout << i; // Do whatever read-only with i.
   return i;
}

May I trust them to compile this into this?

int foo(const int i)
{
   cout << i;
   return i;
}

By the way, in some cases it could even be faster to pass both i and &i, then use i for reading and *ptr_i for writing.

int foo(const int i, int * ptr_i)
{
   cout << i;    // no dereference needed, therefore faster (?)
   // many more read-only operations with i.
   *ptr_i = 123;
   return i;
}

4 Answers

笔芯 2024-12-20 03:06:55

gcc does not appear to do this optimization with -O3 (gcc version 4.7.2). Using Gabriel's code, note how ok2 loads a dereferenced address before indexing into vars while ok1 does not.

ok1:


    .cfi_startproc
    subq    $40, %rsp
    .cfi_def_cfa_offset 48
    movslq  %edi, %rdi
    fildl   vars(,%rdi,4)
    fld %st(0)
    fsqrt
    fucomi  %st(0), %st
    jp  .L7
    fstp    %st(1)


ok2:


    .cfi_startproc
    subq    $40, %rsp
    .cfi_def_cfa_offset 48
    movslq  (%rdi), %rax
    fildl   vars(,%rax,4)
    fld %st(0)
    fsqrt
    fucomi  %st(0), %st
    jp  .L12
    fstp    %st(1)

下壹個目標 2024-12-20 03:06:54

May I trust them to compile this into this?
Yes, you can. (The "yes" here needs qualification; please read the Edit section, which clarifies it.)

int foo(const int & i)

tells the compiler that i is a reference to a constant integer.
The compiler may perform optimizations, but only those permitted by the as-if rule. So you can be assured that, for your program, the behavior of the above will be as good as (and the const qualifier will be respected):

int foo(const int i)

As-If Rule:

The C++ standard allows a compiler to perform any optimization, as long as the resulting executable exhibits the same observable behaviour as if all the requirements of the standard have been fulfilled.

For standardese fans, C++03 §1.9 "Program execution" says:

conforming implementations are required to emulate (only) the observable behavior of the abstract machine.

And the footnote says:

This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program. For instance, an actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no side effects affecting the observable behavior of the program are produced.

EDIT:
Since there is some confusion about the answer, let me clarify:
Optimizations cannot be forced on the compiler, so how the compiler handles this is up to the compiler. The important thing is that the observable behavior of the program will not change.

时光瘦了 2024-12-20 03:06:54

It shouldn't compile it into that because it might not be correct. Consider:

int foo(const int &i, int *p)
{
   *p = 42;
   cout << i; // prints 42
   return 0;
}

int main()
{
   int x = 5;
   foo(x, &x);
   return 0;
}

versus

int foo(const int i, int *p)
{
   *p = 42;
   cout << i; // prints 5
   return 0;
}

int main()
{
   int x = 5;
   foo(x, &x);
   return 0;
}

How does the compiler know that this won't happen? It would have to somehow prove that the variable cannot be modified through another path, e.g. (1) through a pointer someone else holds, (2) because it is a global variable, (3) from another thread. Given the unsafe nature of C, with pointer arithmetic and all, even guaranteeing that the function won't be able to get a pointer to the variable might be impossible.

风渺 2024-12-20 03:06:54

Visual Studio 2010 (Express) does, at least in the simple cases I've tested. Anyone want to test gcc?

I've tested the following:

1. Passing only i:

int vars[] = {1,2,3,12,3,23,1,213,231,1,21,12,213,21321,213,123213,213123};

int ok1(const int i){
    return sqrtl(vars[i]);
}

int ok2(const int & i){
    return sqrtl(vars[i]);
}

void main() {
    int i;
    std::cin >> i;
    //i = ok1(i);
    i = ok2(i);
    std::cout << i;
}

The ASM:

i = ok1(i);
000D1014  mov         ecx,dword ptr [i]  
000D1017  fild        dword ptr vars (0D3018h)[ecx*4]  
000D101E  call        _CIsqrt (0D1830h)  
000D1023  call        _ftol2_sse (0D1840h) 

i = ok2(i);
013A1014  mov         ecx,dword ptr [i]  
013A1017  fild        dword ptr vars (13A3018h)[ecx*4]  
013A101E  call        _CIsqrt (13A1830h)  
013A1023  call        _ftol2_sse (13A1840h)

Well, the ASMs are identical, no doubt the optimization was performed.

2. Passing i and &i:

Let's consider @newacct's answer here.

int vars[] = {1,2,3,12,3,23,1,213,231,1,21,12,213,21321,213,123213,213123};

int ok1(const int i, int * pi) {
    *pi = 2;
    return sqrtl(vars[i]);
}

int ok2(const int & i, int * pi) {
    *pi = 2;
    return sqrtl(vars[i]);
}

void main() {
    int i;
    int * pi = &i;
    std::cin >> i;
    i = ok1(i, pi);
    //i = ok2(i, pi);
    std::cout << i;
}

The ASM:

i = ok1(i, pi);
00891014  mov         ecx,dword ptr [i]
00891017  fild        dword ptr vars (893018h)[ecx*4] // access vars[i] 
0089101E  call        _CIsqrt (891830h)  
00891023  call        _ftol2_sse (891840h)  

i = ok2(i, pi);
011B1014  fild        dword ptr [vars+8 (11B3020h)]   // access vars[2]
011B101A  call        _CIsqrt (11B1830h)  
011B101F  call        _ftol2_sse (11B1840h) 

In ok1 I can't see it writing 2 into *pi. Probably it understands that that memory location will be overwritten by the result of the function anyway, so the write is useless.

With ok2, the compiler is as smart-ass as I expected. It understands that i and pi point to the same place, so it uses a hardcoded 2 directly.

Notes:

  • I've compiled twice for both tests, once uncommenting only ok1 and once uncommenting only ok2. Compiling both at the same time leads to more complex optimizations between the two functions, which end up all inlined and mixed together.
  • I've added a lookup into the array vars because simple calls to sqrtl were simplified into basic ADD- and MUL-like operations without an actual call.
  • Compiled in Release.
  • Yielded the expected results, of course.