What is aligned memory allocation?
I also want to know whether glibc malloc() does this.
Comments (4)
Suppose that you have a structure with a 4-byte integer member b. Without alignment, its members would be packed back to back in memory (assuming a 32-bit architecture), so b could straddle a word boundary.

The problem is that on some CPU architectures, the instruction to load a 4-byte integer from memory only works on word boundaries. So your program would have to fetch each half of b with separate instructions.

But if padding bytes are inserted so that b starts on a word boundary, access to b becomes straightforward. (The disadvantage is that more memory is required, because of the padding bytes.)

Different data types have different alignment requirements. It's common for char to be 1-byte aligned, short to be 2-byte aligned, and 4-byte types (int, float, and pointers on 32-bit systems) to be 4-byte aligned.

malloc is required by the C standard to return a pointer that's properly aligned for any data type. glibc malloc on x86-64 returns 16-byte-aligned pointers.
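The structure this answer refers to is not shown on this page, so here is a minimal sketch with a hypothetical struct sample (the names sample, sample_packed_size, sample_padding and print_sample_layout are assumptions, not taken from the answer). It compares the size the members would need if packed back to back with the size the compiler actually chose, and prints where each member landed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structure, used only for illustration. */
struct sample {
    short a;   /* commonly 2 bytes, 2-byte aligned */
    int   b;   /* commonly 4 bytes, 4-byte aligned */
    char  c;   /* 1 byte */
    char  d;   /* 1 byte */
};

/* Bytes the members would occupy if packed with no padding at all. */
size_t sample_packed_size(void)
{
    return sizeof(short) + sizeof(int) + 2 * sizeof(char);
}

/* Padding bytes the compiler inserted to satisfy alignment. */
size_t sample_padding(void)
{
    return sizeof(struct sample) - sample_packed_size();
}

/* Print the layout this particular compiler and ABI chose. */
void print_sample_layout(void)
{
    /* The compiler always places b on a boundary suitable for int. */
    assert(offsetof(struct sample, b) % _Alignof(int) == 0);

    printf("offset of a: %zu\n", offsetof(struct sample, a));
    printf("offset of b: %zu\n", offsetof(struct sample, b));
    printf("offset of c: %zu\n", offsetof(struct sample, c));
    printf("offset of d: %zu\n", offsetof(struct sample, d));
    printf("sizeof(struct sample): %zu (packed: %zu, padding: %zu)\n",
           sizeof(struct sample), sample_packed_size(), sample_padding());
}
```

On common ABIs with a 2-byte short and a 4-byte int, b typically ends up at offset 4 and the struct takes 12 bytes instead of the 8 it would need when packed, but the exact numbers depend on the compiler and target.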
Alignment requirements specify which address offsets can be assigned to which types. This is completely implementation-dependent, but is generally based on word size. For instance, some 32-bit architectures require all int variables to start on a multiple of four. On some architectures, alignment requirements are absolute. On others (e.g. x86), flouting them only comes with a performance penalty.

malloc is required to return an address suitable for any alignment requirement. In other words, the returned address can be assigned to a pointer of any type. This requirement comes from C99 §7.20.3 (Memory management functions).
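One way to observe this guarantee is to check the address malloc returns against the strictest fundamental alignment of the platform. The sketch below assumes a C11 compiler (for max_align_t and _Alignof); the helper names is_aligned and malloc_meets_max_alignment are hypothetical. It only tests one allocation, so it illustrates the rule rather than proving it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p is a multiple of alignment, 0 otherwise.
   Assumption: alignment is a nonzero power of two. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Allocates size bytes with malloc() and reports whether the result
   satisfies _Alignof(max_align_t), the strictest fundamental alignment.
   Returns 0 if the allocation itself fails. */
int malloc_meets_max_alignment(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return 0;

    int ok = is_aligned(p, _Alignof(max_align_t));
    free(p);
    return ok;
}
```

Calling malloc_meets_max_alignment(64) should report success on a conforming implementation. Very small requests are left out on purpose, since some allocators align those less strictly.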
If you have particular memory alignment needs (for particular hardware or libraries), you can check out non-portable memory allocators such as _aligned_malloc() and memalign(). These can easily be abstracted behind a "portable" interface, but are unfortunately non-standard.
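A rough sketch of such a "portable" interface might look like the following. The wrapper names portable_aligned_alloc and portable_aligned_free are hypothetical. Note that on non-Windows systems this sketch calls posix_memalign() rather than the memalign() named above, which is a substitution on my part; on Windows it uses _aligned_malloc() and _aligned_free(). It assumes the alignment is a power of two and at least sizeof(void *), which is what posix_memalign() requires.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* asks for the posix_memalign() declaration */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h>               /* _aligned_malloc(), _aligned_free() */
#endif

/* Returns size bytes aligned to alignment, or NULL on failure.
   Assumption: alignment is a power of two and >= sizeof(void *). */
void *portable_aligned_alloc(size_t alignment, size_t size)
{
    void *p = NULL;

    assert(alignment >= sizeof(void *));
    assert((alignment & (alignment - 1)) == 0);

#ifdef _WIN32
    p = _aligned_malloc(size, alignment);   /* note the argument order */
#else
    if (posix_memalign(&p, alignment, size) != 0)
        p = NULL;                            /* normalize the failure case */
#endif

    assert(p == NULL || ((uintptr_t)p & (alignment - 1)) == 0);
    return p;
}

/* Releases memory obtained from portable_aligned_alloc(). */
void portable_aligned_free(void *p)
{
#ifdef _WIN32
    _aligned_free(p);   /* memory from _aligned_malloc() must not go to free() */
#else
    free(p);            /* posix_memalign() memory is released with free() */
#endif
}
```

Pairing the allocation with its own free function is the point of the wrapper: callers never need to know which underlying allocator was used.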
The malloc() documentation states an alignment guarantee that holds for most everything you do in C/C++. However, as pointed out by others, many special cases exist and require a specific alignment. For example, Intel processors support a 256-bit type, __m256, which is most certainly not taken into account by malloc().

Similarly, if you want to allocate a memory buffer for data that is to be paged (similar to addresses returned by mmap(), etc.), then you need a possibly very large alignment, which would waste a lot of memory if malloc() were to return buffers always aligned to such boundaries.

Under Linux or other Unix systems, I suggest you use the posix_memalign() function. This is the most current function that one wants to use for such needs.

As a side note, you could still use malloc(), only in that case you need to allocate size + alignment - 1 bytes and do your own alignment on the returned pointer: (ptr + alignment - 1) & -alignment (not tested, all casts missing; this assumes alignment is a power of two). Also, the aligned pointer is not the one you'll use to call free(). In other words, you have to store the pointer that malloc() returned to be able to call free() properly. As mentioned above, this means you lose up to alignment - 1 bytes per such malloc(). In contrast, the posix_memalign() function should not lose more than sizeof(void*) * 4 - 1 bytes, although since your size is likely a multiple of alignment, you would only lose sizeof(void*) * 2 ... unless you only allocate such buffers, in which case you lose a full alignment bytes each time.
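The malloc()-based side note can be sketched as below, with the casts filled in. This is only one possible arrangement, not the answer's own code: the helper names manual_aligned_malloc and manual_aligned_free are hypothetical, and storing the original pointer in the bytes just below the aligned block is my assumption about where to keep it. That choice costs an extra sizeof(void *) bytes on top of the alignment - 1 mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns size bytes aligned to alignment, or NULL on failure.
   Assumption: alignment is a nonzero power of two. */
void *manual_aligned_malloc(size_t alignment, size_t size)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* Worst-case shift plus room to remember malloc()'s pointer. */
    size_t extra = alignment - 1 + sizeof(void *);
    if (size > SIZE_MAX - extra)
        return NULL;

    unsigned char *raw = malloc(size + extra);
    if (raw == NULL)
        return NULL;

    /* Round the first usable address up to the next multiple of alignment:
       (ptr + alignment - 1) & -alignment, written with an explicit mask. */
    uintptr_t start = (uintptr_t)(raw + sizeof(void *));
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    unsigned char *user = raw + sizeof(void *) + (aligned - start);

    /* Save the pointer malloc() returned right below the aligned block.
       memcpy avoids assuming that this slot is itself aligned. */
    memcpy(user - sizeof(void *), &raw, sizeof raw);
    return user;
}

/* Frees a block obtained from manual_aligned_malloc(). */
void manual_aligned_free(void *ptr)
{
    if (ptr == NULL)
        return;

    void *raw;
    memcpy(&raw, (unsigned char *)ptr - sizeof raw, sizeof raw);
    free(raw);   /* free() must receive the original malloc() pointer */
}
```

A block from manual_aligned_malloc(32, 100) would be suitable for 32-byte-aligned data such as __m256 values, and must be released with manual_aligned_free() rather than free().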