How does malloc understand alignment?

Posted on 2024-12-24 21:50:34

The following is excerpted from here:

pw = (widget *)malloc(sizeof(widget));

allocates raw storage. Indeed, the malloc call allocates storage
that's big enough and suitably aligned to hold an object of type
widget.

Also see Fast Pimpl from Herb Sutter, where he says:

Alignment. Any memory that's allocated
dynamically via new or malloc is guaranteed to be properly aligned for
objects of any type, but buffers that are not allocated dynamically
have no such guarantee.

I am curious about this: how does malloc know the alignment of a custom type?


千寻… 2024-12-31 21:50:34

Alignment requirements are recursive: The alignment of any struct is simply the largest alignment of any of its members, and this is understood recursively.

For example, assuming that each fundamental type's alignment equals its size (which is not always true), the struct X { int a; char b; double c; } has the alignment of double, and it is padded so that its size is a multiple of that alignment (e.g. 4 (int), 1 (char), 3 (padding), 8 (double)). The struct Y { int a; X x; float f; } has the alignment of X, which is the largest and equal to the alignment of double, and Y is laid out accordingly: 4 (int), 4 (padding), 16 (X), 4 (float), 4 (padding).

(All numbers are just examples and could differ on your machine.)

Therefore, by breaking it down to the fundamental types, we only need to know a handful of fundamental alignments, and among those there is a well-known largest. Since C11 the type max_align_t has an alignment which is that largest alignment.

All malloc() needs to do is to pick an address that's a multiple of that value.

无尽的现实 2024-12-31 21:50:34

I think the most relevant part of the Herb Sutter quote is the part I've marked in bold:

Alignment. Any memory that's allocated dynamically via new or malloc is guaranteed to be properly aligned for objects of any type, but buffers that are not allocated dynamically have no such guarantee.

It doesn't have to know what type you have in mind, because it's aligning for any type. On any given system, there's a maximum alignment size that's ever necessary or meaningful; for example, a system with four-byte words will likely have a maximum of four-byte alignment.

This is also made clear by the malloc(3) man-page, which says in part:

The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable.

锦爱 2024-12-31 21:50:34

The only information that malloc() can use is the size of the request passed to it. In general, it might do something like round up the passed size to the nearest greater (or equal) power of two, and align the memory based on that value. There would likely also be an upper bound on the alignment value, such as 8 bytes.

The above is a hypothetical discussion, and the actual implementation depends on the machine architecture and runtime library that you're using. Maybe your malloc() always returns blocks aligned on 8 bytes and it never has to do anything different.

撩发小公举 2024-12-31 21:50:34

1) Align to the least common multiple of all alignments. For example, if ints require 4-byte alignment but pointers require 8-byte alignment, then allocate everything with 8-byte alignment. This causes everything to be aligned.

2) Use the size argument to determine the correct alignment. For small sizes you can infer the type: malloc(1), for instance, can only be for a char (assuming no other type has size 1). C++ new has the benefit of being type safe and so can always make alignment decisions this way.

帅气称霸 2024-12-31 21:50:34

Before C++11, alignment was handled fairly simply: wherever the exact value was unknown, the largest alignment was used, and malloc/calloc still work this way. This means a malloc allocation is correctly aligned for any type.

Wrong alignment may result in undefined behavior according to the standard, but I have seen x86 compilers being generous and only punishing it with lower performance.

Note that you can also tweak alignment via compiler options or directives (for example, #pragma pack in Visual Studio).

But when it comes to placement new, C++11 brings us the new keywords alignof and alignas. Here is some code that shows the effect when the compiler's maximum alignment is greater than 1. The first placement new below is automatically fine, but the second is not.

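#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <new>

// Round addr up to the next multiple of align (align must be a power of two).
static std::uintptr_t align_up(std::uintptr_t addr, std::uintptr_t align)
{
        return (addr + align - 1) & ~(align - 1);
}

int main()
{
        struct A { char c; };
        struct B { int i; char c; };

        unsigned char * buffer = static_cast<unsigned char *>(std::malloc(1000000));
        if (!buffer)
            return 1;
        // uintptr_t can hold a pointer value; long is too narrow on some 64-bit platforms
        std::uintptr_t mp = reinterpret_cast<std::uintptr_t>(buffer);

        // First placement new: the start of a malloc block suits any type
        std::cout << "alignment of A: " << std::hex << alignof(A) << std::endl;
        std::cout << "placement address before alignment: " << std::hex << mp << std::endl;
        mp = align_up(mp, alignof(A));
        std::cout << "placement address after alignment : " << std::hex << mp << std::endl;
        A * a = new (reinterpret_cast<void *>(mp)) A;
        mp += sizeof(A);

        // Second placement new: mp now sits right after A, so it may be misaligned for B
        std::cout << "alignment of B: " << std::hex << alignof(B) << std::endl;
        std::cout << "placement address before alignment: " << std::hex << mp << std::endl;
        mp = align_up(mp, alignof(B));
        std::cout << "placement address after alignment : " << std::hex << mp << std::endl;
        B * b = new (reinterpret_cast<void *>(mp)) B;
        mp += sizeof(B);

        (void)a; (void)b;       // the objects are only constructed, never used
        std::free(buffer);
}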

I guess performance of this code can be improved with some bitwise operations.

EDIT: Replaced expensive modulo computation with bitwise operations. Still hoping that somebody finds something even faster.

空‖城人不在 2024-12-31 21:50:34

malloc has no knowledge of what it is allocating for because its parameter is just total size.
It just aligns to an alignment that is safe for any object.

⊕婉儿 2024-12-31 21:50:34

You can find out the number of alignment bits of your malloc() implementation with this small C program:

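#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    uintptr_t find = 0;   /* bitwise OR of every address malloc() returned */

    /* The blocks are deliberately never freed so that each call yields a distinct address. */
    for (unsigned i = 1000000; i--; ) {
        size_t size = (size_t)(rand() & 127);
        if (size)
            find |= (uintptr_t)malloc(size);
    }
    if (find == 0)
        return 1;         /* every allocation failed; the loop below would never end */

    /* Count the low bits that are zero in every address: log2 of the observed alignment. */
    int bits = 0;
    for (; !(find & 1); find >>= 1)
        ++bits;
    printf("%d\n", bits);
    return 0;
}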