How does argument passing work?
I want to know how passing arguments to functions in C works. Where are the values stored, and how are they retrieved? How does variadic argument passing work? Also, since it's related: what about return values?
I have a basic understanding of CPU registers and assembler, but not enough to thoroughly understand the ASM that GCC spits back at me. Some simple annotated examples would be much appreciated.
Consider a small program, foo.c, in which main calls a function foo with two arguments. Compiling it with
gcc foo.c -S
gives the assembly output. Basically, on 32-bit x86 the caller (in this case main) first allocates 8 bytes on the stack to accommodate the two arguments, then stores the two arguments on the stack at the corresponding offsets (4 and 0), and then issues the call instruction, which transfers control to the foo routine. The foo routine reads its arguments from the corresponding offsets on the stack, restores the stack to its state on entry, and puts its return value in the eax register so that it is available to the caller.
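As a rough illustration of the sequence described above, here is a minimal sketch of what such a foo.c might contain. The function bodies and the argument values 3 and 5 are assumptions, since the original listing is not reproduced here. The assembly in the comments is a hand-written approximation of unoptimized 32-bit x86 output (AT&T syntax), not verbatim compiler output. The function call_foo is a hypothetical stand-in for main.

```c
#include <assert.h>

/* Hypothetical callee: adds its two arguments.
 *
 * Approximate unoptimized i386 code (illustrative only):
 *   foo:
 *     pushl %ebp            # save the caller's frame pointer
 *     movl  %esp, %ebp      # set up foo's own frame
 *     movl  8(%ebp), %edx   # a: first argument, just above the return address
 *     movl  12(%ebp), %eax  # b: second argument
 *     addl  %edx, %eax      # the result is left in eax
 *     popl  %ebp            # restore the caller's frame pointer
 *     ret                   # pop the return address and jump to it
 */
int foo(int a, int b)
{
    return a + b;
}

/* Hypothetical caller (plays the role of main in the text).
 *
 * Approximate unoptimized i386 code (illustrative only):
 *     subl  $8, %esp        # reserve 8 bytes for two int arguments
 *     movl  $5, 4(%esp)     # second argument at offset 4
 *     movl  $3, (%esp)      # first argument at offset 0
 *     call  foo             # push the return address, jump to foo
 *                           # on return, eax holds foo's result
 */
int call_foo(void)
{
    int result = foo(3, 5);
    assert(result == 8);     /* 3 + 5 */
    return result;
}
```

Where a 32-bit toolchain is installed, compiling with gcc -m32 -O0 -S should show code of this general shape. On x86-64 the same source will typically pass a and b in registers instead.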
That is platform specific and part of the "ABI". In fact, some compilers even allow you to choose between different conventions.
Microsoft's Visual Studio, for example, offers the __fastcall calling convention, which passes the first arguments in registers. Other platforms or calling conventions use the stack exclusively.
Variadic arguments work in a very similar way: they are passed via registers or the stack. In the case of registers, they are usually assigned in ascending order, based on type. If you have something like (int a, int b, float c, int d), a PowerPC ABI might put a in r3, b in r4, d in r5, and c in f1 (the first floating-point argument register). Return values, again, work the same way.
Unfortunately, I don't have many examples. Most of my assembly is in PowerPC, and all you see in the assembly is the code going straight for r3, r4, r5, and placing the return value in r3 as well.
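At the C level, a variadic callee retrieves its arguments through the <stdarg.h> macros, which hide whether the ABI placed them in registers or on the stack. The following is a minimal sketch; sum_ints, its count parameter, and sum_ints_demo are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdarg.h>

/* Sums `count` int arguments passed after the fixed parameter.
 * The callee cannot discover the number or types of the variadic
 * arguments on its own; here the caller states the count explicitly. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);           /* start after the last named parameter */
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);  /* fetch the next argument as an int */
    }
    va_end(args);                    /* required cleanup before returning */

    return total;
}

/* Small demonstration of calls with different argument counts. */
int sum_ints_demo(void)
{
    assert(sum_ints(0) == 0);
    assert(sum_ints(3, 10, 20, 30) == 60);
    return sum_ints(2, 4, 5);
}
```

The explicit count is one possible design; printf-style functions instead infer the number and types of the arguments from a format string.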
Your questions are more than anybody could reasonably try to answer in an SO post, not to mention that the answers are implementation-defined as well.
However, if you're interested in the x86 answer, might I suggest you watch the Stanford CS107 lecture series titled Programming Paradigms, where all the answers to the questions you posed are explained in great detail (and quite eloquently) in the first 6-8 lectures.
It depends on your compiler, the target architecture and OS you're compiling for, and whether your compiler supports non-standard extensions that change the calling convention. But there are some commonalities.
The C calling convention is usually established by the vendor of the operating system, because they need to decide what convention the system libraries use.
More recent CPUs (such as ARM or PowerPC) tend to have their calling conventions defined by the CPU vendor and compatible across different operating systems. x86 is an exception to this: different systems use different calling conventions. There used to be a lot more calling conventions for the 16-bit 8086 and 32-bit 80386 than there are for x86_64 (although even that is not down to one). 32-bit x86 Windows programs sometimes use multiple calling conventions within the same program.
Some observations:
- Windows uses a different convention (STDCALL, originally FAR PASCAL) than the “C” calling convention for the same platform, and also supports FORTRAN and FASTCALL conventions. All four come in NEAR and FAR variants on 16-bit OSes. Nearly all Windows programs therefore use at least two different conventions in the same program.
- Tail-recursive calls have exactly the same stack frame on each invocation. A tail-recursive call is typically equivalent to a loop: update the few registers that changed, then jump back to the entry point. Such calls do not need to create a new stack frame or have their own return address: the compiler can simply update the caller's stack frame and use its return address as the tail call's. That is, tail recursion easily optimizes into a loop.
- A convention that passes some arguments in registers was available as FASTCALL on MS-DOS and Windows.
- Under the “C” convention on a stack-based architecture, arguments are pushed from right to left. For a call such as printf("%d\n", x); the compiler will push x, then the format string, and the call instruction then pushes the return address onto the stack. This guarantees that the first argument is at a known offset from the stack pointer, and <stdarg.h> (historically <varargs.h>) has the information it needs to work.
- The alternative, in which the called function removes its own arguments from the stack, was the PASCAL convention on MS-DOS, and survives as the STDCALL convention on Windows. It cannot support variadic functions. (https://en.wikibooks.org/wiki/X86_Disassembly/Calling_Conventions)
- Compilers may also let you omit the frame pointer (-fomit-frame-pointer).
You can get cross-compilers to emit code using different calling conventions, and compare them, with switches such as -S -target (on clang).
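The equivalence between a tail-recursive call and a loop can be sketched in C. The function names below are hypothetical, and whether a compiler actually performs this transformation depends on the compiler and the optimization level; the C standard does not guarantee it.

```c
#include <assert.h>

/* Tail-recursive form: the recursive call is the last action, so nothing
 * in the current frame is needed after it returns. */
unsigned long sum_to_recursive(unsigned long n, unsigned long acc)
{
    if (n == 0) {
        return acc;
    }
    return sum_to_recursive(n - 1, acc + n);  /* tail call */
}

/* Loop form: what an optimizer may turn the function above into.
 * The "arguments" are updated in place and control jumps back to the top. */
unsigned long sum_to_loop(unsigned long n, unsigned long acc)
{
    while (n != 0) {
        acc += n;   /* new value of acc for the next "call" */
        n -= 1;     /* new value of n for the next "call" */
    }
    return acc;
}

/* Checks that both forms agree for a small input; returns 1 on success. */
int tail_call_demo(void)
{
    assert(sum_to_recursive(100, 0) == sum_to_loop(100, 0));
    return 1;
}
```

Comparing the assembly for both functions at a higher optimization level (for example with -O2 -S) is one way to see whether a given compiler made the transformation.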
Basically, C passes arguments by pushing them on the stack (this is the traditional convention on 32-bit x86; other ABIs pass the first arguments in registers). For pointer types, the pointer itself is pushed on the stack.
One thing about C is that the caller restores the stack rather than the function being called. This way, the number of arguments can vary, and the called function doesn't need to know ahead of time how many arguments will be passed.
On x86, return values are returned in the AX register or its variants (EAX, RAX).
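To make these points concrete, here is a minimal sketch contrasting a pointer argument, a scalar return value, and a struct returned by value. All names are hypothetical. The comments describe what common x86 ABIs typically do; the C language itself does not specify any of it.

```c
#include <assert.h>

/* A pointer argument: the address is what gets passed, so the callee
 * can modify the caller's object through it. */
void increment(int *value)
{
    *value += 1;
}

/* A scalar result: on x86 this typically comes back in EAX
 * (the low 32 bits of RAX on x86-64). */
int square(int x)
{
    return x * x;
}

/* A struct too large for the return registers. */
struct Samples {
    int values[8];
};

/* On common ABIs the caller typically reserves space for the result and
 * passes its address as a hidden extra argument; the callee fills it in. */
struct Samples make_samples(int start)
{
    struct Samples s;
    for (int i = 0; i < 8; i++) {
        s.values[i] = start + i;
    }
    return s;
}

/* Exercises all three functions and returns square(42). */
int return_demo(void)
{
    int n = 41;
    struct Samples s = make_samples(10);

    increment(&n);
    assert(n == 42);
    assert(s.values[7] == 17);
    return square(n);
}
```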