Alignment restrictions for malloc()/free()

Posted 2024-07-04 18:19:11

Older K&R (2nd ed.) and other C-language texts I have read that discuss the implementation of a dynamic memory allocator in the style of malloc() and free() usually also mention, in passing, something about data type alignment restrictions. Apparently certain computer hardware architectures (CPU, registers, and memory access) restrict how you can store and address certain value types. For example, there may be a requirement that a 4 byte (long) integer must be stored beginning at addresses that are multiples of four.

What restrictions, if any, do major platforms (Intel & AMD, SPARC, Alpha) impose for memory allocation and memory access, or can I safely ignore aligning memory allocations on specific address boundaries?
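
For a K&R-style allocator, the usual way to respect such restrictions is to round every block size and offset up to a multiple of the strictest alignment the platform needs. Below is a minimal sketch of that rounding, assuming the alignment is a power of two. The names align_up and is_aligned are made up for illustration and do not come from K&R or any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round n up to the next multiple of align.
 * Assumption: align is a nonzero power of two (1, 2, 4, 8, 16, ...). */
size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Return nonzero if p sits on an align-byte boundary
 * (same power-of-two assumption as above). */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

An allocator built this way would call something like align_up(request, 8) (or whatever the platform's strictest alignment is) before carving a block out of its free list, so every returned pointer is usable for any type.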

Comments (5)

辞旧 2024-07-11 18:19:11

As Greg mentioned, alignment is still important today (perhaps more so in some ways), and compilers usually take care of it based on the target architecture. In managed environments, the JIT compiler can optimize the alignment based on the runtime architecture.

You may see pragma directives (in C/C++) that change the alignment. These should only be used when a very specific alignment is required.

// For example, this changes the pack to 2 byte alignment.
#pragma pack(2)
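
To make the effect of the pragma visible, here is a small sketch that compares a naturally laid-out struct with the same struct under 2-byte packing. It assumes a compiler that supports the push/pop form of #pragma pack (GCC, Clang and MSVC do) and a 4-byte int. The struct and function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Default layout: value is placed on its natural boundary. */
struct natural { char tag; int value; };

/* Same members, but member alignment is capped at 2 bytes. */
#pragma pack(push, 2)
struct packed2 { char tag; int value; };
#pragma pack(pop)   /* restore the previous packing for later code */

size_t natural_size(void)         { return sizeof(struct natural); }
size_t packed2_size(void)         { return sizeof(struct packed2); }
size_t packed2_value_offset(void) { return offsetof(struct packed2, value); }
```

Under those assumptions the packed struct places value at offset 2 instead of 4 and is smaller overall, at the cost of value no longer being 4-byte aligned. Wrapping the pragma in push/pop keeps the change from leaking into unrelated declarations.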
黑寡妇 2024-07-11 18:19:11

Note that even on IA-32 and AMD64, some of the SSE instructions/intrinsics require aligned data. These instructions will raise an exception if the data is unaligned, so at least you won't have to debug "wrong data" bugs. There are equivalent unaligned instructions as well but, as Denton says, they are slower.

If you're using VC++, then besides the #pragma pack directives, you also have the __declspec(align) directive for precise alignment. The VC++ documentation also mentions an _aligned_malloc function for specific alignment requirements.

As a rule of thumb, unless you are moving data across compilers/languages or are using the SSE instructions, you can probably ignore alignment issues.
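
For code that does need SSE-friendly buffers, a portable alternative to the VC++-specific function is C11's aligned_alloc. The sketch below assumes a C11 library that provides it (MSVC's C runtime does not, so _aligned_malloc and _aligned_free would be used there instead). alloc_floats_16 is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for count floats on a 16-byte boundary.
 * Returns NULL on failure; release the buffer with free(). */
float *alloc_floats_16(size_t count)
{
    size_t bytes = count * sizeof(float);

    /* C11 asks for the size to be a multiple of the alignment,
     * so round the byte count up to the next multiple of 16. */
    size_t rounded = (bytes + 15u) & ~(size_t)15u;
    if (rounded == 0)
        rounded = 16;

    return aligned_alloc(16, rounded);
}
```

A buffer obtained this way can be handed to the aligned SSE load/store intrinsics, whereas a pointer into the middle of an arbitrary array cannot be assumed to qualify.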

只等公子 2024-07-11 18:19:11

Alignment is still quite important today. Some processors (the 68k family jumps to mind) would throw an exception if you tried to access a word value on an odd boundary. Today, most processors will run two memory cycles to fetch an unaligned word, but this will definitely be slower than an aligned fetch. Some other processors won't even throw an exception, but will fetch an incorrect value from memory!

If for no other reason than performance, it is wise to try to follow your processor's alignment preferences. Usually, your compiler will take care of all the details, but if you're doing anything where you lay out the memory structure yourself, then it's worth considering.
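
When you do lay out memory yourself (for example, parsing a byte buffer), a common way to stay safe on every processor is to copy through memcpy rather than cast a possibly misaligned pointer. This is a minimal sketch of that idiom; load_u32 and store_u32 are illustrative names, and the value is read in the host's byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be 4-byte aligned.
 * memcpy has no alignment requirement on its arguments, so this is
 * well defined even where a direct *(uint32_t *)src would trap. */
uint32_t load_u32(const unsigned char *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Write a 32-bit value to a possibly unaligned address. */
void store_u32(unsigned char *dst, uint32_t value)
{
    memcpy(dst, &value, sizeof value);
}
```

Compilers typically turn these small fixed-size copies into whatever load/store sequence the target allows, so on hardware that tolerates unaligned access the cost is usually the same as a plain load.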

初见 2024-07-11 18:19:11

Sparc, MIPS, Alpha, and most other "classical RISC" architectures only allow aligned accesses to memory, even today. An unaligned access will cause an exception, but some operating systems will handle the exception by copying from the desired address in software using smaller loads and stores. The application code won't know there was a problem, except that the performance will be very bad.

MIPS has special instructions (lwl and lwr) which can be used to access 32-bit quantities from unaligned addresses. Whenever the compiler can tell that the address is likely unaligned, it will use this two-instruction sequence instead of a normal lw instruction.

x86 can handle unaligned memory accesses in hardware without an exception, but there is still a performance hit of up to 3X compared to aligned accesses.

Ulrich Drepper wrote a comprehensive paper on this and other memory-related topics, What Every Programmer Should Know About Memory. It is a very long writeup, but filled with chewy goodness.

不一样的天空 2024-07-11 18:19:11

You still need to be aware of alignment issues when laying out a class or struct in C(++). In these cases the compiler will do the right thing for you, but the overall size of the struct/class may be more wasteful than necessary.

For example:

struct
{ 
    char A;
    int B;
    char C;
    int D;
};

would have a size of 4 * 4 = 16 bytes (assuming Windows on x86), whereas

struct
{ 
    char A;
    char C;
    int B;
    int D;
};

would have a size of 4 * 3 = 12 bytes.

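This is because the compiler enforces 4-byte alignment for ints but only 1-byte alignment for chars. In the first layout each char is followed by 3 bytes of padding so that the next int starts on a 4-byte boundary; in the second, the two chars share a single 4-byte slot with only 2 bytes of padding.

In general, group member variables of the same size (type) together to minimize wasted space.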
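
The two layouts can be checked directly with sizeof and offsetof. The sketch below gives the anonymous structs hypothetical names (interleaved and grouped) and assumes a 4-byte int with 4-byte alignment, as on x86 and x86-64 with the common compilers. Other ABIs may give different numbers.

```c
#include <assert.h>
#include <stddef.h>

/* char, int, char, int: each char drags 3 bytes of padding behind it. */
struct interleaved { char A; int B; char C; int D; };

/* Both chars first: they share one 4-byte slot before the ints. */
struct grouped { char A; char C; int B; int D; };

size_t interleaved_size(void) { return sizeof(struct interleaved); }
size_t grouped_size(void)     { return sizeof(struct grouped); }

/* Offset of the first int in the grouped layout. */
size_t grouped_b_offset(void) { return offsetof(struct grouped, B); }
```

Under the stated assumptions these report 16 bytes, 12 bytes, and an offset of 4 respectively, matching the arithmetic above.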
