Does C have a standard ABI?
From a discussion somewhere else:
C++ has no standard ABI (Application Binary Interface)
But neither does C, right?
On any given platform it pretty much does. It wouldn't be useful as the lingua franca for inter-language communication if it lacked one.
What's your take on this?
Comments (9)
C defines no ABI. In fact, it bends over backwards to avoid defining one. People who, like me, have spent most of their programming lives writing C on 16/32/64-bit architectures with 8-bit bytes, two's complement arithmetic and flat address spaces will usually be quite surprised on reading the convoluted language of the current C standard.
For example, read the material about pointers. The standard doesn't say anything so simple as "a pointer is an address", for that would be making an assumption about the ABI. In particular, it allows for pointers being in different address spaces and having varying widths.
An ABI is a mapping from the execution model of the language to a particular machine/operating system/compiler combination. It makes no sense to define one in the language specification because that runs the risk of excluding C implementations on some architectures.
C has no standard ABI in principle, but in practice, this rarely matters: You do what your OS-vendor does.
Take the calling conventions on x86 Windows, for example: The Windows API uses the so-called 'standard' calling convention (stdcall). Thus, any compiler that wants to interface with the OS needs to implement it. However, stdcall doesn't support all C90 language features (e.g., calling functions without prototypes, variadic functions). As Microsoft provided a C compiler, a second calling convention was necessary, called the 'C' calling convention (cdecl). Most C compilers on Windows use this as their default calling convention and are thus interoperable.
In principle, the same could have happened with C++, but as the C++ ABI (including the calling convention) is necessarily far more elaborate, compiler vendors did not agree on a single ABI. They can still interoperate by falling back to extern "C".
The ABI for C is platform specific: it covers issues such as register allocation and calling conventions, which are obviously specific to a particular processor. Here are some examples:
x86 has had many calling conventions, with extensions under Windows to declare which one is used. Platform ABIs for embedded Linux have also changed over time, leading to incompatible user space. The history of the ARM Linux port shows the problems that arose in the transition to a newer ABI.
Quoting ... Linux System Programming page 4.
An ABI, even for C, has parts that are quite platform independent, parts that depend on the processor (which registers must be saved, which are used for passing parameters, ...) and parts that depend on the OS. The OS-dependent parts involve more or less the same factors as the processor-dependent ones, since some choices are not imposed by the architecture but are the result of trade-offs. In addition, some OSes have a language-independent notion of exception, so a compiler for any language has to generate the right thing to handle those. Thread handling may also impose constraints on the ABI: if a register points to TLS, you can't use it for anything else.
In theory, every compiler may have its own ABI. But usually, for a given processor/OS pair, the ABI is fixed by the OS vendor, which often also provides a C compiler and common libraries that use that ABI, and competitors prefer to be compatible. (I wouldn't be surprised if there are exceptions for some OSes on which C isn't a major programming language.)
But the OS vendor may switch ABIs for one reason or another. For one, new versions of processors may have features that you want to use in the ABI; for instance, some have asked for a 32-bit ABI for x86_64 that allows use of all the registers. During the migration phase, which may last a very long time, you may have to handle two ABIs.
neither does C, right?
Right
On any given platform it pretty much does. It wouldn't be useful as the lingua franca for inter-language communication if it lacked one.
"Pretty much" might refer to architecture-specific defaults chosen by C compiler vendors and then adopted by other languages. So if Keil's ARM C compiler uses left-to-right, little-endian parameter ordering, passes arguments on the stack, and returns values in some predetermined register, then extern "C" code from other compilers will assume compatibility with that scheme.
While such an agreement may be considered part of an ABI, it is far from being a complete standard ABI by itself, unlike a managed execution context such as the JVM or a browser sandbox.
Prior to the C89 Standard, C compilers for many platforms used essentially the same ABI, save for variations in data sizes. For machines whose stack grows downward, code which calls a function would push the arguments on the stack in order from right to left and then call the function (pushing the return address in the process). A called function would leave its arguments on the stack, and the caller would at its leisure adjust the stack pointer to remove them [or, on some architectures, might adjust the stacked values in place]. While <stdarg.h> made it unnecessary for most programs to rely upon that convention, it remained in use for many years because it was simple and worked pretty well. While there was no "official" document establishing that as a cross-platform "standard", most compilers targeting machines with downward-growing stacks worked that way, leading to a greater level of consistency than exists today.
There's no standard ABI because C has always been about maximum runtime performance, and the ABI with the highest performance depends on the underlying hardware. As a result, an ABI may pass function call arguments and return values only on the stack, or prefer registers, as suits the hardware.
For example, even amd64 (a.k.a. x86-64) has two calling conventions: Microsoft x64 and the System V AMD64 ABI. The former passes the first four arguments in registers and the rest on the stack. The latter passes the first six integer or pointer arguments in registers and the rest on the stack. I have no idea why Microsoft created an incompatible calling convention for amd64 hardware. For all I know, the Microsoft variant has slightly worse performance and was created later.
For more information, see https://en.wikipedia.org/wiki/X86_calling_conventions
C does not have a standard ABI. This is easily illustrated by all the calling conventions (cdecl, fastcall and stdcall) that are used out there. Each is a different ABI.
EDIT: Although the C standard does not define an ABI, all platforms that I know have a standard predictable ABI or set of ABIs that the platform adheres to. These are often well documented.