How do C compilers implement functions that return large structures?

Posted 08-19 07:52 | 267 words | 17 views | 0 comments



The return value of a function is usually stored on the stack or in a register. But for a large structure, it has to be on the stack. How much copying has to happen in a real compiler for this code? Or is it optimized away?

For example:

struct Data {
    unsigned values[256];
};

Data createData()
{
    Data data;
    // initialize data values...
    return data;
}

(Assuming the function cannot be inlined.)


绻影浮沉 2024-08-26 07:52:39


None; no copies are done.

The address of the caller's Data return value is actually passed as a hidden argument to the function, and the createData function simply writes into the caller's stack frame.

Constructing the local variable directly in that caller-provided slot, instead of in a separate temporary, is known as the named return value optimisation (NRVO). See also the C++ FAQ on this topic, which says:

"Commercial-grade C++ compilers implement return-by-value in a way that lets them eliminate the overhead, at least in simple cases."

"When yourCode() calls rbv(), the compiler secretly passes a pointer to the location where rbv() is supposed to construct the 'returned' object."

You can demonstrate that this has been done by adding a destructor with a printf to your struct. The destructor should only be called once if this return-by-value optimisation is in operation, otherwise twice.
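
A minimal sketch of that experiment, written in C++ because plain C has no destructors. The counter `destroyed` and the helper `destructorCallsForOneCall` are names invented for this illustration; they do not appear in the original question.

```cpp
#include <cassert>
#include <cstdio>

// Same layout as the question's struct, plus an instrumented destructor.
struct Data {
    unsigned values[256] = {};
    static int destroyed;  // hypothetical counter: destructor calls so far
    ~Data() {
        ++destroyed;
        std::printf("~Data() call #%d\n", destroyed);
    }
};
int Data::destroyed = 0;

Data createData() {
    Data data;
    data.values[5] = 6;
    return data;  // candidate for the named return value optimisation
}

// Counts how many Data objects are destroyed by one call plus its result.
int destructorCallsForOneCall() {
    const int before = Data::destroyed;
    {
        Data d = createData();
        assert(d.values[5] == 6);  // the value arrives whether or not a copy was made
    }  // d is destroyed here
    return Data::destroyed - before;
}
```

Calling `destructorCallsForOneCall()` from `main` should give 1 when the copy is elided and 2 when it is not. GCC and Clang accept `-fno-elide-constructors`, which can be used to compare the two behaviours.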

You can also check the assembly to see that this happens:

Data createData()
{
    Data data;
    // initialize data values...
    data.values[5] = 6;
    return data;
}

Here's the assembly:

        pushl   %ebp
        movl    %esp, %ebp
        subl    $1032, %esp
        movl    8(%ebp), %eax
        movl    $6, 20(%eax)
        ret     $4

Curiously, it still reserves enough stack space for the local data item (subl $1032, %esp), but note that it takes the first argument on the stack, 8(%ebp), as the base address of the object and then stores 6 into element 5 of it (offset 20 = 5 × 4 bytes). Since we didn't declare any parameters for createData, this is puzzling until you realise that this is the secret hidden pointer to the caller's Data object. The final ret $4 also pops that hidden pointer off the stack.
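
Read back into source form, the listing behaves roughly like the hand-written function below. This is only a sketch of the idea: `createDataLowered`, its `result` parameter and `callLowered` are hypothetical names, and a real compiler performs this rewriting internally rather than in the source.

```cpp
#include <cassert>

struct Data {
    unsigned values[256];
};

// What the programmer writes: return by value.
Data createData() {
    Data data{};
    data.values[5] = 6;
    return data;
}

// Roughly what the 32-bit x86 listing does: the caller passes the address
// of its own Data object (the hidden argument at 8(%ebp)), the callee
// stores through it (movl $6, 20(%eax)) and hands the same address back.
Data* createDataLowered(Data* result) {
    result->values[5] = 6;  // byte offset 20 when unsigned is 4 bytes wide
    return result;
}

// Caller side after lowering: the object lives in the caller's frame.
unsigned callLowered() {
    Data d{};
    Data* p = createDataLowered(&d);
    assert(p == &d);  // no second object was created
    return p->values[5];
}
```

Both forms produce the same value; only the second makes the hidden pointer visible.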

中二柚 2024-08-26 07:52:39



Many examples have been given, but basically:

This question does not have a definite answer; it depends on the compiler.

C does not specify how large structs are returned from a function.

Here are some tests for one particular compiler, gcc 4.1.2 on x86 RHEL 5.4.

gcc, trivial case: no copying

[00:05:21 1 ~] $ gcc -O2 -S -c t.c
[00:05:23 1 ~] $ cat t.s
        .file   "t.c"
        .p2align 4,,15
.globl createData
        .type   createData, @function
createData:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        movl    $1, 24(%eax)
        popl    %ebp
        ret     $4
        .size   createData, .-createData
        .ident  "GCC: (GNU) 4.1.2 20080704 (Red Hat 4.1.2-46)"
        .section        .note.GNU-stack,"",@progbits

gcc, more realistic case: allocate on the stack, memcpy to the caller

#include <stdlib.h>

struct Data {
    unsigned values[256];
};

struct Data createData()
{
    struct Data data;
    int i;
    for (i = 0; i < 256; i++)
        data.values[i] = rand();
    return data;
}

[00:06:08 1 ~] $ gcc -O2 -S -c t.c
[00:06:10 1 ~] $ cat t.s
        .file   "t.c"
        .p2align 4,,15
.globl createData
        .type   createData, @function
createData:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %edi
        pushl   %esi
        pushl   %ebx
        movl    $1, %ebx
        subl    $1036, %esp
        movl    8(%ebp), %edi
        leal    -1036(%ebp), %esi
        .p2align 4,,7
.L2:
        call    rand
        movl    %eax, -4(%esi,%ebx,4)
        addl    $1, %ebx
        cmpl    $257, %ebx
        jne     .L2
        movl    %esi, 4(%esp)
        movl    %edi, (%esp)
        movl    $1024, 8(%esp)
        call    memcpy
        addl    $1036, %esp
        movl    %edi, %eax
        popl    %ebx
        popl    %esi
        popl    %edi
        popl    %ebp
        ret     $4
        .size   createData, .-createData
        .ident  "GCC: (GNU) 4.1.2 20080704 (Red Hat 4.1.2-46)"
        .section        .note.GNU-stack,"",@progbits

gcc 4.4.2 has improved a lot and does not copy in the non-trivial case above.


In addition, VS2008 (compiling the above as C) reserves struct Data on the stack of createData() and, in Debug mode, uses a rep movsd loop to copy it back to the caller. In Release mode it moves the return value of rand() (%eax) directly into the caller's struct.

失退 2024-08-26 07:52:39




"But for a large structure, it has to be on the stack."

Indeed so! A large structure declared as a local variable is allocated on the stack. Glad to have that cleared up.

As for avoiding copying, as others have noted:

  • Most calling conventions deal with "function returning struct" by passing an additional parameter that points to the location in the caller's stack frame where the struct should be placed. This is definitely a matter for the calling convention and not the language.

  • With this calling convention, it becomes possible for even a relatively simple compiler to notice when a code path is definitely going to return a struct, and for it to fix assignments to that struct's members so that they go directly into the caller's frame and don't have to be copied. The key is for the compiler to notice that all terminating code paths through the function return the same struct variable. If that's the case, the compiler can safely use the space in the caller's frame, eliminating the need for a copy at the point of return.
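
The condition in the second point can be probed with a small experiment. The sketch below is an illustration only: `lastLocalAddress`, `sameVariable`, `differentVariables` and `builtInPlace` are hypothetical names, and whether a given compiler actually reuses the caller's space is something to observe, not something the code guarantees.

```cpp
#include <cassert>
#include <cstdint>

struct Data {
    unsigned values[256];
};

// Hypothetical probe: remembers where the callee's local variable lived.
static std::uintptr_t lastLocalAddress = 0;

// Every return path returns the same variable, so the compiler may place
// d directly in the space supplied by the caller.
Data sameVariable(bool flag) {
    Data d{};
    lastLocalAddress = reinterpret_cast<std::uintptr_t>(&d);
    if (flag) {
        d.values[0] = 1;
        return d;
    }
    d.values[0] = 2;
    return d;
}

// The return paths name different variables, so at most one of them could
// share the caller's space and a copy is usually needed.
Data differentVariables(bool flag) {
    Data a{};
    Data b{};
    a.values[0] = 1;
    b.values[0] = 2;
    lastLocalAddress = reinterpret_cast<std::uintptr_t>(flag ? &a : &b);
    if (flag) {
        return a;
    }
    return b;
}

// True when the callee's local and the caller's object shared one address.
bool builtInPlace(Data (*make)(bool), bool flag) {
    Data result = make(flag);
    assert(result.values[0] == (flag ? 1u : 2u));
    return reinterpret_cast<std::uintptr_t>(&result) == lastLocalAddress;
}
```

Printing `builtInPlace(sameVariable, true)` and `builtInPlace(differentVariables, true)` from `main` at different optimisation levels shows which of the two shapes the compiler managed to build in place.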

不羁少年 2024-08-26 07:52:39

typedef struct {
    unsigned value[256];
} Data;

Data createData(void) {
    Data r;
    return r;
}

Data d = createData();

msvc (6, 8, 9) and gcc mingw (3.4.5, 4.4.0) will generate code like the following pseudocode:

void createData(Data* r) {
    /* fills *r in place; there is no separate local copy */
}

Data d;
createData(&d);

情释 2024-08-26 07:52:39

gcc on Linux will issue a memcpy() to copy the struct back onto the caller's stack. If the function has internal linkage, though, more optimizations become available.
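
As a sketch of what internal linkage means here: marking the function `static` tells the compiler that no other translation unit can call it, so it is free to inline it or to depart from the standard calling convention. The function names below are hypothetical, and whether a particular gcc version really drops the copy is an assumption to verify in the output of `gcc -O2 -S`.

```cpp
#include <cassert>

struct Data {
    unsigned values[256];
};

// External linkage: other files may call this, so the platform's
// hidden-pointer convention has to be honoured.
Data createDataExternal() {
    Data data;
    for (unsigned i = 0; i < 256; ++i)
        data.values[i] = i;
    return data;
}

// Internal linkage: every caller is visible in this translation unit,
// which gives the optimiser more freedom.
static Data createDataInternal() {
    Data data;
    for (unsigned i = 0; i < 256; ++i)
        data.values[i] = i;
    return data;
}

// The only caller of the internal version.
unsigned useInternal() {
    Data d = createDataInternal();
    assert(d.values[0] == 0u);
    return d.values[255];
}
```

Comparing the assembly emitted for `createDataExternal` and for `useInternal` is one way to see whether the extra freedom was used.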
