Struct memory layout in C

Posted 2024-08-30 19:21:24 · 257 words · 11 views · 0 comments

I have a C# background. I am very much a newbie to a low-level language like C.

In C#, a struct's memory is laid out by the compiler by default. The compiler can reorder data fields or implicitly insert padding between fields. So, I had to specify a special attribute to override this behavior and get an exact layout.

AFAIK, C does not reorder or align memory layout of a struct by default. However, I heard there's a little exception that is very hard to find.

What is C's memory layout behavior? What gets reordered or aligned, and what doesn't?

Comments (3)

固执像三岁 2024-09-06 19:21:24

It's implementation-specific, but in practice the rules (in the absence of #pragma pack or the like) are:

  • Struct members are stored in the order they are declared. (This is required by the C99 standard, as another answer here notes.)
  • If necessary, padding is added between struct members so that each member is correctly aligned.
  • Each primitive type T typically requires an alignment of sizeof(T) bytes, although the exact value is up to the implementation.

So, given the following struct:

struct ST
{
   char ch1;
   short s;
   char ch2;
   long long ll;
   int i;
};
  • ch1 is at offset 0
  • a padding byte is inserted to align...
  • s at offset 2
  • ch2 is at offset 4, immediately after s
  • 3 padding bytes are inserted to align...
  • ll at offset 8
  • i is at offset 16, right after ll
  • 4 padding bytes are added at the end so that the overall struct is a multiple of 8 bytes. I checked this on a 64-bit system: 32-bit systems may allow structs to have 4-byte alignment.

So sizeof(ST) is 24.
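The offsets listed above can be checked rather than taken on trust. Here is a minimal sketch using the standard `offsetof` macro; the function name `print_st_layout` is made up for illustration, and the numbers it prints depend on the compiler and ABI used.

```c
#include <stddef.h> /* offsetof */
#include <stdio.h>  /* printf */

struct ST {
    char ch1;
    short s;
    char ch2;
    long long ll;
    int i;
};

/* Prints the offset of every member and the total size, as chosen by
 * whichever compiler and ABI this is built with. */
void print_st_layout(void)
{
    printf("ch1 @ %zu\n", offsetof(struct ST, ch1));
    printf("s   @ %zu\n", offsetof(struct ST, s));
    printf("ch2 @ %zu\n", offsetof(struct ST, ch2));
    printf("ll  @ %zu\n", offsetof(struct ST, ll));
    printf("i   @ %zu\n", offsetof(struct ST, i));
    printf("sizeof(struct ST) = %zu\n", sizeof(struct ST));
}
```

Calling `print_st_layout()` from `main` on a typical 64-bit target should report the offsets described above (0, 2, 4, 8, 16) and a size of 24. Other targets may legitimately print something else.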

It can be reduced to 16 bytes by rearranging the members to avoid padding:

struct ST
{
   long long ll; // @ 0
   int i;        // @ 8
   short s;      // @ 12
   char ch1;     // @ 14
   char ch2;     // @ 15
};
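To see the effect of the reordering on a particular compiler, both layouts can be declared side by side. This is only a sketch: the names `ST_original`, `ST_reordered` and `print_sizes` are invented here so the two versions can coexist in one file.

```c
#include <stdio.h>

/* Declaration order from the first example: padding is likely after ch1,
 * after ch2, and at the end. */
struct ST_original {
    char ch1;
    short s;
    char ch2;
    long long ll;
    int i;
};

/* Same members, largest first, so each one tends to land on an offset
 * that is already suitably aligned. */
struct ST_reordered {
    long long ll;
    int i;
    short s;
    char ch1;
    char ch2;
};

/* Prints both sizes so the saving can be observed on the current target. */
void print_sizes(void)
{
    printf("original:  %zu bytes\n", sizeof(struct ST_original));
    printf("reordered: %zu bytes\n", sizeof(struct ST_reordered));
}
```

The compiler never performs this reordering itself, so the saving only appears when the declaration is changed by hand.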
饮惑 2024-09-06 19:21:24

In C, the compiler is allowed to dictate some alignment for every primitive type. Typically the alignment is the size of the type. But it's entirely implementation-specific.
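Since the alignment is implementation-specific, one way to find out what a given compiler chose is to ask it. A small sketch, assuming a C11 compiler (for `_Alignof`); the `SHOW` macro and `print_alignments` are illustrative names, not part of any library.

```c
#include <stdio.h>

/* Prints the size and alignment of one type. _Alignof is a C11 operator. */
#define SHOW(T) printf("%-10s size=%zu align=%zu\n", #T, sizeof(T), _Alignof(T))

/* Lists a few primitive types so size and alignment can be compared. */
void print_alignments(void)
{
    SHOW(char);
    SHOW(short);
    SHOW(int);
    SHOW(long);
    SHOW(long long);
    SHOW(double);
    SHOW(void *);
}
```

On many targets the two columns match, but that is a convention of the ABI rather than something the language promises.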

Padding bytes are introduced so every object is properly aligned. Reordering is not allowed.

Possibly every remotely modern compiler implements #pragma pack which allows control over padding and leaves it to the programmer to comply with the ABI. (It is strictly nonstandard, though.)
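As a rough illustration of what #pragma pack does, the sketch below declares the same two members with and without packing. It assumes a compiler that accepts the pragma (GCC, Clang and MSVC do); the struct and function names are invented for the example.

```c
#include <stddef.h> /* offsetof */
#include <stdio.h>  /* printf */

/* Default layout: the compiler may pad after tag so that value is aligned. */
struct natural {
    char tag;
    int value;
};

#pragma pack(push, 1) /* nonstandard: cap member alignment at 1 byte */
struct packed {
    char tag;
    int value; /* now directly after tag, possibly misaligned */
};
#pragma pack(pop)     /* restore the previous setting */

/* Prints where value ends up and how big each struct is. */
void print_pack_effect(void)
{
    printf("natural: value @ %zu, size %zu\n",
           offsetof(struct natural, value), sizeof(struct natural));
    printf("packed:  value @ %zu, size %zu\n",
           offsetof(struct packed, value), sizeof(struct packed));
}
```

The packed version saves space but may place `value` at an address the CPU handles slowly or not at all, which is why matching the ABI becomes the programmer's job.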

From C99 §6.7.2.1:

12 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.

13 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
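The guarantee about the initial member in paragraph 13 is what makes a common C idiom work: embedding a "header" struct as the first member and converting pointers between the two. A minimal sketch, with hypothetical type and function names:

```c
#include <stddef.h> /* offsetof */
#include <stdio.h>  /* printf */

struct header {
    int kind;
};

struct message {
    struct header hdr; /* initial member, so it sits at offset 0 */
    double payload;
};

/* Valid because a pointer to a struct, suitably converted, points to its
 * initial member. */
struct header *message_header(struct message *m)
{
    return (struct header *)m;
}

/* Shows that both pointers refer to the same address. */
void demo_initial_member(void)
{
    struct message m = { { 7 }, 1.5 };
    printf("same address: %d\n", (void *)&m == (void *)&m.hdr);
    printf("kind via cast: %d\n", message_header(&m)->kind);
}
```

The same trick does not apply to any member other than the first, because padding may precede it.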

半边脸i 2024-09-06 19:21:24

You can start by reading the Wikipedia article on data structure alignment to get a better understanding of data alignment.

From the Wikipedia article:

Data alignment means putting the data at a memory offset equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory. To align the data, it may be necessary to insert some meaningless bytes between the end of the last data structure and the start of the next, which is data structure padding.
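The padding arithmetic behind that description can be sketched in a few lines. This is a simplified model, not how any particular compiler is implemented: it assumes each member's alignment equals its size, and the names `ALIGN_UP` and `simulate_size` are invented for the example.

```c
#include <stddef.h> /* size_t */
#include <stdio.h>  /* printf */

/* Rounds offset up to the next multiple of align (align must be nonzero). */
#define ALIGN_UP(offset, align) ((((offset) + (align) - 1) / (align)) * (align))

/* Walks a list of member sizes, assuming each member's alignment equals
 * its size (a common convention, not a guarantee), prints each offset,
 * and returns the total size including trailing padding. */
size_t simulate_size(const size_t *sizes, size_t count)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t k = 0; k < count; k++) {
        offset = ALIGN_UP(offset, sizes[k]); /* padding before the member */
        printf("member %zu: offset %zu\n", k, offset);
        offset += sizes[k];
        if (sizes[k] > max_align)
            max_align = sizes[k];
    }
    return ALIGN_UP(offset, max_align);      /* padding at the end */
}
```

Passing the sizes {1, 2, 1, 8, 4} gives offsets 0, 2, 4, 8, 16 and a total of 24 under this model.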

From 6.54.8 Structure-Packing Pragmas of the GCC documentation:

For compatibility with Microsoft Windows compilers, GCC supports a set of #pragma directives which change the maximum alignment of members of structures (other than zero-width bitfields), unions, and classes subsequently defined. The n value below always is required to be a small power of two and specifies the new alignment in bytes.

  1. #pragma pack(n) simply sets the new alignment.
  2. #pragma pack() sets the alignment to the one that was in effect when compilation started (see also command line option -fpack-struct[=n] see Code Gen Options).
  3. #pragma pack(push[,n]) pushes the current alignment setting on an internal stack and then optionally sets the new alignment.
  4. #pragma pack(pop) restores the alignment setting to the one saved at the top of the internal stack (and removes that stack entry). Note that #pragma pack([n]) does not influence this internal stack; thus it is possible to have #pragma pack(push) followed by multiple #pragma pack(n) instances and finalized by a single #pragma pack(pop).
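The push/pop behavior in item 4 can be tried out with a sketch like the following. It assumes a compiler that implements these pragmas as GCC documents them; the struct names and `print_pack_stack` are illustrative only.

```c
#include <stdio.h>

struct before { char c; int n; }; /* default alignment */

#pragma pack(push) /* save the current setting on the internal stack */
#pragma pack(2)    /* changes the alignment, does not touch the stack */
struct two { char c; int n; };
#pragma pack(1)    /* a second change, still only one saved entry */
struct one { char c; int n; };
#pragma pack(pop)  /* a single pop restores the saved setting */

struct after { char c; int n; };  /* default alignment again */

/* Prints all four sizes so the effect of each pragma is visible. */
void print_pack_stack(void)
{
    printf("before: %zu\n", sizeof(struct before));
    printf("two:    %zu\n", sizeof(struct two));
    printf("one:    %zu\n", sizeof(struct one));
    printf("after:  %zu\n", sizeof(struct after));
}
```

If the pragmas behave as described, `before` and `after` should come out the same size, with `two` and `one` smaller.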

Some targets, e.g. i386 and powerpc, support the ms_struct #pragma which lays out a structure as the documented __attribute__ ((ms_struct)).

  1. #pragma ms_struct on turns on the layout for structures declared.
  2. #pragma ms_struct off turns off the layout for structures declared.
  3. #pragma ms_struct reset goes back to the default layout.