Can a C compiler generate a 64-bit executable in which pointers are 32 bits?
Most programs fit comfortably in a <4GB address space but need features that are only available on the x64 architecture.
Are there compilers/platforms where I can use the x64 registers and x64-specific instructions while keeping 32-bit pointers to save memory?
Is it possible to do that transparently on legacy code? Which switch enables it?
OR
What code changes are necessary to get 64-bit features while keeping 32-bit pointers?
Comments (10)
A simple way to circumvent this works if you have only a few types of structures that you point to. Then you can allocate big arrays for your data and do the indexing with uint32_t.
So a "pointer" in such a model is just an index into a global array. Addressing through such an index should usually be efficient enough with a decent compiler, and it saves you some space. You would lose other things that you might care about, dynamic allocation for instance.
Another way to achieve something similar is to encode a pointer as the difference to its actual location. If you can ensure that this difference always fits into 32 bits, you gain as well.
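
The index-as-pointer idea from this answer could look roughly like the sketch below. It is only an illustration: the names node_ref, node_pool, node_new, node_get and list_sum, the fixed POOL_CAPACITY, and the use of UINT32_MAX as a null marker are all assumptions made for the example, not something the answer specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_CAPACITY 1024u      /* hypothetical fixed capacity */
#define NODE_NIL UINT32_MAX      /* plays the role of NULL */

typedef uint32_t node_ref;       /* 4-byte "pointer": an index into node_pool */

struct node {
    int32_t  value;
    node_ref next;               /* an index, not a real pointer */
};

static struct node node_pool[POOL_CAPACITY];
static uint32_t    node_count = 0;

/* Bump-allocate one node; returns NODE_NIL when the pool is full. */
static node_ref node_new(int32_t value, node_ref next)
{
    if (node_count >= POOL_CAPACITY)
        return NODE_NIL;
    node_ref r = node_count++;
    node_pool[r].value = value;
    node_pool[r].next  = next;
    return r;
}

/* Turn an index back into a real pointer when one is needed. */
static struct node *node_get(node_ref r)
{
    assert(r < node_count);
    return &node_pool[r];
}

/* Walk a list by following indices instead of pointers. */
static int64_t list_sum(node_ref head)
{
    int64_t sum = 0;
    for (node_ref r = head; r != NODE_NIL; r = node_get(r)->next)
        sum += node_get(r)->value;
    return sum;
}
```

On a typical 64-bit target, a node with a real next pointer would usually occupy 16 bytes including padding, while this layout needs 8. The price is the one the answer mentions: the pool is sized up front, and nodes are never returned to a general-purpose allocator.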
It's worth noting that there is an ABI in development for Linux, X32, that lets you build an x86_64 binary that uses 32-bit indices and addresses.
It is relatively new, but interesting nonetheless.
http://en.wikipedia.org/wiki/X32_ABI
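
To see which data model a given build actually uses, a small check along these lines may help. It is a sketch that assumes GCC or Clang, which predefine __x86_64__ and __ILP32__; the helper names pointer_bits and abi_name are made up for the example. GCC documents an -mx32 option for selecting this ABI, but building and running such a binary also requires a kernel and C library with x32 support, which many distributions do not ship.

```c
#include <limits.h>
#include <stddef.h>

/* Width of a data pointer in bits for the ABI this file was compiled for. */
static unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Name of the data model, based on macros that GCC and Clang predefine. */
static const char *abi_name(void)
{
#if defined(__x86_64__) && defined(__ILP32__)
    return "x32 (x86-64 instructions, 32-bit pointers)";
#elif defined(__x86_64__)
    return "x86-64 (64-bit pointers)";
#elif defined(__i386__)
    return "i386 (32-bit pointers)";
#else
    return "other";
#endif
}
```

Printing both values from main in a build made with -mx32 and in a default 64-bit build is a quick way to confirm that the switch took effect.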
Technically, it is possible for a compiler to do so. AFAIK, in practice it isn't done. It has been proposed for gcc (even with a patch here: http://gcc.gnu.org/ml/gcc/2007-10/msg00156.html) but never integrated (at least, it was not documented the last time I checked). My understanding is that it also needs support from the kernel and the standard library to work (i.e. the kernel would need to set things up in a way that is not currently possible, and the existing 32-bit or 64-bit ABI could not be used to communicate with the kernel).
What exactly are the "64-bit features" you need? Isn't that a little vague?
I found this while searching for an answer myself:
http://www.codeproject.com/KB/cpp/smallptr.aspx
Also have a look at the discussion at the bottom...
I never had any need to think about this, but it is interesting to realize that one can be concerned with how much space pointers take up...
It depends on the platform. On Mac OS X, the first 4 GB of a 64-bit process's address space is reserved and unmapped, presumably as a safety feature so that no 32-bit value is ever mistaken for a pointer. If you try, there may be a way to defeat this. I worked around it once by writing a C++ "pointer" class which adds 0x100000000 to the stored value. (This was significantly faster than indexing into an array, which also requires finding the array base address and multiplying before the addition.)
On the ISA level, you can certainly choose to load and zero-extend a 32-bit value and then use it as a 64-bit pointer. It's a good feature for a platform to have.
No change should be necessary to a program unless you wish to use 64-bit and 32-bit pointers simultaneously. In that case you are back to the bad old days of having near and far pointers.
Also, you will certainly break ABI compatibility with APIs that take pointers to pointers.
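
The "stored 32-bit value plus a base" trick can be sketched in plain C as follows. This is not the C++ class from the answer: instead of the constant 0x100000000 it uses the start of a static arena as the base, so that the example runs on any platform. The names ptr32, arena, ptr32_encode and ptr32_decode, and the arena size, are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE (1u << 16)            /* hypothetical arena size */

static unsigned char arena[ARENA_SIZE];  /* every compressed pointer targets this */

typedef struct { uint32_t off; } ptr32;  /* 4-byte stand-in for a pointer */

/* Base added on decode; the answer used the constant 0x100000000 instead. */
static uintptr_t ptr32_base(void)
{
    return (uintptr_t)arena;
}

/* Compress: keep only the distance from the base, which must fit in 32 bits. */
static ptr32 ptr32_encode(const void *p)
{
    uintptr_t delta = (uintptr_t)p - ptr32_base();
    assert(delta < ARENA_SIZE);          /* must lie inside the arena */
    ptr32 r = { (uint32_t)delta };
    return r;
}

/* Decompress: zero-extend the 32-bit offset, then add the base. */
static void *ptr32_decode(ptr32 r)
{
    return (void *)(ptr32_base() + (uintptr_t)r.off);
}
```

Decoding costs one addition per dereference, which fits the answer's remark that this was cheaper than indexing into an array. Any API that expects a real pointer, or a pointer to a pointer, still needs the decoded value.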
I think this would be similar to the MIPS n32 ABI: 64-bit registers with 32-bit pointers.
In the n32 ABI, all registers are 64-bit (so it requires a MIPS64 processor). But addresses and pointers are only 32-bit when stored in memory, which decreases the memory footprint. When a 32-bit value (such as a pointer) is loaded into a register, it is sign-extended to 64 bits. When the processor uses the pointer/address for a load or store, all 64 bits are used (the processor is not aware of the n32-ness of the software). If your OS supports n32 programs (maybe the OS also follows the n32 model, or it may be a proper 64-bit OS with added n32 support), it can place all memory used by the n32 application at suitable virtual addresses (e.g. the lower 2GB and the higher 2GB). The only glitch with this model is that when registers are saved on the stack (function calls etc.), all 64 bits are stored; there is no 32-bit data model in the n32 ABI.
Probably such an ABI could be implemented for x86-64 as well.
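
The effect of sign extension on a 32-bit address can be shown with a little arithmetic in C. This is only a sketch of the widening step described above, not MIPS code; the function names sign_extend32 and zero_extend32 are chosen for the example.

```c
#include <stdint.h>

/* n32-style widening: replicate bit 31 into the upper 32 bits. */
static uint64_t sign_extend32(uint32_t v)
{
    /* Flip bit 31, widen, then subtract 2^31: a portable way to sign-extend. */
    int64_t wide = (int64_t)(v ^ UINT32_C(0x80000000)) - INT64_C(0x80000000);
    return (uint64_t)wide;
}

/* Alternative widening: the upper 32 bits are always zero. */
static uint64_t zero_extend32(uint32_t v)
{
    return (uint64_t)v;
}
```

With sign extension, the 32-bit values 0x00000000 to 0x7FFFFFFF land in the lowest 2GB of the 64-bit space and 0x80000000 to 0xFFFFFFFF land in the highest 2GB (for example, 0x80000000 becomes 0xFFFFFFFF80000000), which is where the "lower 2GB and higher 2GB" placement comes from. Zero extension would instead keep everything in the lowest 4GB.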
On x86, no. On other processors, such as PowerPC, it is quite common: 64-bit registers and instructions are available in 32-bit mode, whereas with x86 it tends to be "all or nothing".
I'm afraid that if you are concerned about the size of pointers, you might have bigger problems to deal with. If the number of pointers is going to be in the millions or billions, you will probably run into limitations within the Windows OS before you actually run out of physical or virtual memory.
Mark Russinovich has written a great article relating to this, named Pushing the Limits of Windows: Virtual Memory.
Linux now has fairly comprehensive support for the X32 ABI, which does exactly what the asker is asking for; in fact, it is partially supported as a configuration under the Gentoo operating system. I think this question needs to be reviewed in light of recent developments.
The second part of your question is easily answered. It is very possible to perform 64-bit operations in 32-bit code, and in fact many C implementations support this. The C type often used for this is long long (but check with your compiler and architecture).
As far as I know, it is not possible to have 32-bit pointers in 64-bit native code.
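
As a small illustration of 64-bit arithmetic that also compiles for a 32-bit target, consider the sketch below. The function names mul_32x32 and add_ll are made up for the example. On a 32-bit x86 build (for instance with GCC's -m32 option) the compiler typically splits such operations into several 32-bit instructions or a runtime helper call, so the code works but does not use 64-bit registers.

```c
#include <stdint.h>

/* Full 64-bit product of two 32-bit values. */
static uint64_t mul_32x32(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;   /* widen first so the multiply is done in 64 bits */
}

/* 64-bit addition using long long, which is at least 64 bits wide in C99. */
static long long add_ll(long long a, long long b)
{
    return a + b;
}
```

Without the cast in mul_32x32, the multiplication would be carried out in 32 bits and the upper half of the product would be lost before the result is widened.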