Parameter passing in Visual Studio and GCC
Parameter passing in Visual Studio: note how __m128 types are passed. Does it mean that no more than four __m128 arguments should be passed by value?
void good_function(__m128, __m128, __m128, __m128, __m128&);
void bad_function(__m128, __m128, __m128, __m128, __m128);
Does the same rule apply to GCC?
Thank you!
EDIT: Could the fifth argument of bad_function be misaligned? I read somewhere that only 3 arguments are passed in registers (I guess that is Win32, not x64).
Comments (2)
Argument passing is defined by the calling convention of the system ABI (Application Binary Interface). Which ABI is in use depends on the target (OS + hardware platform) you're compiling for. See, for example, the AMD64 ABI, which *nix uses (roughly; there are a few minor variants, I believe).
The link you provided is to an article on parameter passing in the Microsoft x64 calling convention. It notes that __m128 values are always passed by pointer, not by value, and that the first four pointer arguments go in RCX, RDX, R8 and R9.
So given any function with only __m128 arguments, up to four will be passed as pointers in registers, and any others will be passed as pointers on the stack. As Jason notes in another answer, these pointers will point to values that are themselves likely on the stack.
__m128 and __m128& (and also __m128*) are likely equivalent in cost in the Microsoft x64 calling convention: they all pass by pointer.
Reading through the AMD64 ABI, it looks as if the first 8 XMM registers (%xmm0 through %xmm7) are 128 bits wide and will each take an __m128 by value. So on systems that use the AMD64 ABI (e.g. gcc on Linux), the first eight __m128 args will end up in registers.
In this case, it might make sense to pass the 9th through 14th __m128 args as references/pointers, since they could make use of the six integer argument registers. This would avoid copying them to the stack.
I'm not sure which convention gcc on Windows (e.g. mingw) uses. Presumably it must use the Microsoft x64 convention if it is interacting with other libraries.
If you're curious, though, I'd highly recommend performing a few experiments and looking at the disassembly. gcc's -S option is great for this! If you're in Visual Studio, you can use the disassembly window in the debugger.
It's always good to be peeking under the hood a bit, even if you don't fully understand what's going on. You'll start seeing patterns and can ask questions or research what you see.
The parameter passing description you linked to is actually an x86_64 Windows ABI (application binary interface) description, showing how values are passed to a function at the assembly level (i.e., how the compiler translates a C function call to assembly). Under it, the first four arguments are passed as pointers to the __m128 values, using registers on an x86_64 platform. Since the x86_64 platform has more registers to work with than its 32-bit counterpart, this kind of parameter passing is done to speed up function calls: accessing arguments stored in registers is faster than accessing argument values stored in memory on the stack, as you would normally see with cdecl-style function calls on a 32-bit x86 platform. If you go beyond 4 arguments, the rest of the pointers to the __m128 values are stored on the stack. So there are no "good functions" or "bad functions" based simply on the number of arguments. In your example, both of your functions are fine; it's just that in the second one the remaining arguments must use stack space, because the registers available for passing values have been used up.
That being said, the pointers to the __m128 values will most likely point to addresses allocated on the stack if they are automatic variables. So either way, you'll most likely be using stack space, either to store the variables being pointed to, or to pass extra arguments to functions when working with larger-than-64-bit values and other aggregate types like classes, unions, arrays, etc.
As for GCC, since what you've referenced is actually a platform-dependent ABI implementation (in this case x86_64 Windows), what you'll see on other platforms will differ somewhat, although most x86_64 OSes also use a series of registers to pass the first few arguments of a function call. So GCC will use the same rules as Visual Studio on x86_64 Windows, but on other platforms it will generate different assembly based on the x86_64 ABIs of those platforms.