How do vararg functions find out the number of arguments in machine code?

Posted 2024-10-21 21:16:37

How can variadic functions like printf find out the number of arguments they got?

The amount of arguments obviously isn't passed as a (hidden) parameter (see a call to printf in asm example here).

What's the trick?

Comments (4)

各自安好 2024-10-28 21:16:37

The trick is that you tell them some other way. For printf you have to supply a format string, which even contains type information (though that information might be incorrect). How this information is supplied is mainly a contract with the caller, and it is often error-prone.

As for calling conventions: with the traditional C convention (cdecl), the arguments are pushed onto the stack from right to left, and the return address is pushed last. The calling routine cleans up the stack. So there is no technical need for the called routine to know the number of parameters.

EDIT: In C++0x (since standardized as C++11) there is a safe (even type-safe!) way to call variadic functions: variadic templates.

好久不见√ 2024-10-28 21:16:37

Implicitly, from the format string. Note that stdarg.h doesn't contain any macros to retrieve the total "variable" number of arguments passed. This is also one of the reasons the C calling convention requires the caller to clean the stack, even though this increases code size.
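To illustrate how a callee can recover the argument list purely from the format string, here is a minimal sketch. The function mini_format is a hypothetical helper, not part of libc; it assumes only %d, %s and %% are supported, and it returns how many variadic arguments it consumed.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a libc function): formats into buf, supporting
   only %d, %s and %%. Returns the number of variadic arguments consumed.
   Nothing tells it how many arguments were really passed; it trusts fmt. */
int mini_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    size_t used = 0;
    int consumed = 0;

    if (size == 0)
        return 0;
    buf[0] = '\0';

    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0' && used + 1 < size; p++) {
        if (*p != '%') {              /* ordinary character: copy it */
            buf[used++] = *p;
            buf[used] = '\0';
            continue;
        }
        p++;                          /* look at the conversion character */
        if (*p == '\0')
            break;

        int n;
        switch (*p) {
        case 'd':                     /* the format says: next argument is an int */
            n = snprintf(buf + used, size - used, "%d", va_arg(ap, int));
            consumed++;
            break;
        case 's':                     /* the format says: next argument is a string */
            n = snprintf(buf + used, size - used, "%s", va_arg(ap, const char *));
            consumed++;
            break;
        default:                      /* "%%" or unknown: copy the character literally */
            n = snprintf(buf + used, size - used, "%c", *p);
            break;
        }
        if (n < 0)
            break;
        /* snprintf returns the untruncated length; clamp to what fits */
        used += (size_t)n < size - used ? (size_t)n : size - used - 1;
    }
    va_end(ap);
    return consumed;
}
```

Calling mini_format(buf, sizeof buf, "%s has %d items", "cart", 3) consumes two arguments. If the format asked for more arguments than were passed, va_arg would read whatever happens to be there, which is undefined behavior.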

鸩远一方 2024-10-28 21:16:37

This is why the C calling convention pushes arguments in reverse order (right to left). For example, if you call:

printf("%s %s", foo, bar);

The stack ends up like:

  ...
+-------------------+
| bar               |
+-------------------+
| foo               |
+-------------------+
| "%s %s"           |
+-------------------+
| return address    |
+-------------------+
| old frame pointer | <- frame pointer
+-------------------+
  ...

Arguments are accessed indirectly using their offsets from the frame pointer (the frame pointer can be omitted by smart compilers that know how to calculate things from the stack pointer). In this scheme the first argument is always at a well-known address, and the function accesses as many arguments as its first argument tells it to.

Try the following:

printf("%x %x %x %x %x %x\n");

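On a stack-based convention such as 32-bit x86 cdecl, this will dump part of the stack. (Formally it is undefined behavior, and on x86-64 the first values come from registers rather than the stack.)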
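The point of that experiment is that the callee's only notion of "how many arguments" comes from the format string. A rough sketch of that inference, using a hypothetical helper count_conversions and assuming a simplified grammar (every % not followed by another % requests one argument; flags, widths and * are not parsed):

```c
#include <stddef.h>

/* Hypothetical helper: counts the arguments a printf-style format string
   asks for. Simplified assumption: each '%' followed by anything other
   than '%' requests exactly one argument. */
size_t count_conversions(const char *fmt)
{
    size_t count = 0;

    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;                 /* character after the '%' */
        if (*p == '\0')
            break;           /* stray '%' at the end of the string */
        if (*p != '%')
            count++;         /* a real conversion, not the "%%" escape */
    }
    return count;
}
```

For the format string in the experiment above, this helper reports six requested arguments even though the call passes none, which is exactly why printf ends up reading unrelated data.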

孤者何惧 2024-10-28 21:16:37

  • The AMD64 System V ABI (Linux, Mac OS X) does pass the number of vector (SSE/AVX) varargs in al (the low byte of RAX), unlike any standard IA-32 calling convention. See also: Why is %eax zeroed before a call to printf?

    But only up to 8 (the max number of registers to use). And IIRC, the ABI allows al to be greater than the actual number of XMM/YMM/ZMM args but it must not be less. So it does not in general always tell you the number of FP args; you can't tell how many more than 8, and al is allowed to overcount.

    It is only useful as a performance hint, to skip saving unneeded vector registers to the "Register Save Area" mentioned in "3.5.7 Variable Argument Lists". For example, GCC generates code that tests al != 0 and then either dumps XMM0..7 to the stack or dumps nothing. (Or, if the function uses va_arg with __m256 anywhere, YMM0..7.)

  • On the C level, there are also other techniques besides parsing the format string as mentioned by others. You could also:

    • pass a sentinel (void *)0 to indicate the last argument like execl does.

      You will want to use the sentinel function attribute to help GCC enforce that at compile time: C warning Missing sentinel in function call

    • pass the number of varargs as an extra integer argument

    • use the format function attribute to help GCC enforce format strings of known types like printf or strftime
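The count and sentinel techniques from the list above might look roughly like this. The function names sum_counted and total_length and the SENTINEL_CHECK macro are made up for illustration; the sentinel attribute is the GCC extension mentioned above and is compiled out for other compilers.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* GCC and Clang can warn when a call omits the terminating NULL. */
#if defined(__GNUC__)
#define SENTINEL_CHECK __attribute__((sentinel))
#else
#define SENTINEL_CHECK
#endif

/* Technique 1 (hypothetical example): the caller passes the count explicitly. */
int sum_counted(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Technique 2 (hypothetical example): a NULL pointer marks the end of the
   list, the same contract execl uses. */
SENTINEL_CHECK
size_t total_length(const char *first, ...)
{
    va_list ap;
    size_t total = 0;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *))
        total += strlen(s);
    va_end(ap);
    return total;
}
```

A call such as sum_counted(3, 10, 20, 12) or total_length("ab", "cde", (char *)NULL) works as expected. In both cases the callee still relies entirely on the caller keeping its side of the contract: a wrong count or a missing sentinel leads to undefined behavior.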

Related: How are variable arguments implemented in gcc?
