What is aligned memory allocation?

Posted 2024-09-28 06:19:10

I also want to know whether glibc malloc() does this.

Comments (4)

ヅ她的身影、若隐若现 2024-10-05 06:19:10

Suppose you have this structure:

struct S {
    short a;
    int b;
    char c, d;
};

Without alignment, it would be laid out in memory like this (assuming a 32-bit architecture):

 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
|       |       |  words

The problem is that on some CPU architectures, the instruction to load a 4-byte integer from memory only works on word boundaries. So your program would have to fetch each half of b with separate instructions.

But if the memory was laid out as:

 0 1 2 3 4 5 6 7 8 9 A B
|a|a| | |b|b|b|b|c|d| | |
|       |       |       |

Then access to b becomes straightforward. (The disadvantage is that more memory is required, because of the padding bytes.)

Different data types have different alignment requirements. It's common for char to be 1-byte aligned, short to be 2-byte aligned, and 4-byte types (int, float, and pointers on 32-bit systems) to be 4-byte aligned.

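malloc is required by the C standard to return a pointer that's properly aligned for any data type (since C11, for any type with a fundamental alignment requirement, which excludes over-aligned types).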

glibc malloc on x86-64 returns 16-byte-aligned pointers.

梦中楼上月下 2024-10-05 06:19:10

Alignment requirements specify which address offsets can be assigned to which types. This is completely implementation-dependent, but is generally based on word size. For instance, some 32-bit architectures require every int variable to start at an address that is a multiple of four. On some architectures, alignment requirements are absolute. On others (e.g. x86), flouting them only comes with a performance penalty.

malloc is required to return an address suitable for any alignment requirement. In other words, the returned address can be assigned to a pointer of any type. From C99 §7.20.3 (Memory management functions):

The pointer returned if the allocation
succeeds is suitably aligned so that
it may be assigned to a pointer to any
type of object and then used to access
such an object or an array of such
objects in the space allocated (until
the space is explicitly deallocated).

罪歌 2024-10-05 06:19:10

If you have particular memory alignment needs (for particular hardware or libraries), you can check out non-portable memory allocators such as _aligned_malloc() and memalign(). These can easily be abstracted behind a "portable" interface, but are unfortunately non-standard.

柒夜笙歌凉 2024-10-05 06:19:10

The malloc() documentation says:

[...] the allocated memory that is suitably aligned for any kind of variable.

That holds for nearly everything you do in C/C++. However, as others have pointed out, there are special cases that require a specific, larger alignment. For example, Intel processors support a 256-bit type, __m256, which needs 32-byte alignment; malloc() is not required to provide that, and glibc's 16-byte alignment on x86-64 does not.

Similarly, if you need a page-aligned buffer (like the addresses returned by mmap()), the required alignment can be very large, and malloc() would waste a lot of memory if it always returned buffers aligned to such boundaries.

Under Linux or other Unix systems, I suggest you use the posix_memalign() function:

int posix_memalign(void **memptr, size_t alignment, size_t size);

This is the POSIX function intended for such needs, and the memory it returns is released with free(). (C11 also added aligned_alloc() to the standard library.)


As a side note, you can still use malloc(): allocate size + alignment - 1 bytes and align the returned pointer yourself with (ptr + alignment - 1) & -alignment (untested, all casts omitted; alignment must be a power of two). The aligned pointer is not the one you pass to free(), so you have to keep the pointer that malloc() returned in order to call free() properly. As mentioned above, this means you lose up to alignment - 1 bytes per such malloc(). In contrast, the posix_memalign() function should not lose more than sizeof(void*) * 4 - 1 bytes, although since your size is likely a multiple of alignment, you would only lose sizeof(void*) * 2... unless you only allocate such buffers, in which case you lose a full alignment's worth of bytes each time.
