Memory alignment in C structs
I'm working on a 32-bit machine, so I suppose that the memory alignment should be 4 bytes. Say I have this struct:
    typedef struct {
        unsigned short v1;
        unsigned short v2;
        unsigned short v3;
    } myStruct;
The plain added size is 6 bytes, and I suppose that the aligned size should be 8, but sizeof(myStruct) returns 6.
However, if I write:
    typedef struct {
        unsigned short v1;
        unsigned short v2;
        unsigned short v3;
        int i;
    } myStruct;
the plain added size is 10 bytes, the aligned size should be 12, and this time sizeof(myStruct) == 12.
Can somebody explain the difference?
At least on most machines, a type is only ever aligned to a boundary as large as the type itself. [Edit: you can't really demand any more alignment than that, because you have to be able to create arrays, and you can't insert padding into an array.] On your implementation, short is apparently 2 bytes and int 4 bytes.
That means your first struct is aligned to a 2-byte boundary. Since all the members are 2 bytes apiece, no padding is inserted between them.
The second contains a 4-byte item, which gets aligned to a 4-byte boundary. Since it's preceded by 6 bytes, 2 bytes of padding are inserted between v3 and i, giving 6 bytes of data in the shorts, 2 bytes of padding, and 4 more bytes of data in the int, for a total of 12.
Forget about having different members: even if you write two structs whose members are exactly the same and differ only in the order in which they're declared, the size of each struct can be (and often is) different.
For example, see the code at Ideone: http://ideone.com/HGGVl
Compiled with gcc-4.3.4, the sizes are different even though both structs have the same members!
The bottom line is that the Standard doesn't say how padding should be done, so compilers are free to make any decision, and you cannot assume that all compilers make the same one.
By default, values are aligned according to their size. So a 2-byte value like a short is aligned on a 2-byte boundary, and a 4-byte value like an int is aligned on a 4-byte boundary.
In your example, 2 bytes of padding are added before i to ensure that i falls on a 4-byte boundary.
(The entire structure is aligned on a boundary at least as big as the biggest value in the structure, so your structure will be aligned to a 4-byte boundary.)
The actual rules vary according to the platform; the Wikipedia page on data structure alignment has more details.
Compilers typically let you control the packing via (for example) #pragma pack directives.
Assuming:
Then I personally would use the following (your compiler may differ):
Firstly, while the specifics of padding are left up to the compiler, the OS also imposes some rules as to alignment requirements. This answer assumes that you are using gcc, though the OS may vary.
To determine the space occupied by a given struct and its elements, you can follow these rules:
First, assume that the struct always starts at an address that is properly aligned for all data types.
Then, for every entry in the struct: [...] the raw size of the element given by sizeof(element). [...] Notably, this means that the alignment requirement for a char[20] array is the same as the requirement for a plain char.
Finally, the alignment requirement of the struct as a whole is the maximum of the alignment requirements of each of its elements.
gcc will insert padding after a given element to ensure that the next one (or the struct, if we are talking about the last element) is correctly aligned. It will never rearrange the order of the elements in the struct, even if that would save memory.
Now the alignment requirements themselves are also a bit odd.
[...] (addresses ending in 0x0, 0x4, 0x8 or 0xC). Note that this applies to types larger than 4 bytes as well (such as double and long double).
[...] double can only be placed at an address ending in 0x0 or 0x8. The only exception to this is long double, which is still 4-byte aligned even though it is actually 12 bytes long.
[...] long double is an exception and must be 16-byte aligned.
Each data type needs to be aligned on a memory boundary of its own size. So a short needs to be aligned on a 2-byte boundary, and an int needs to be on a 4-byte boundary. Similarly, a long long would need to be on an 8-byte boundary.
The reason for the second sizeof(myStruct) being 12 is the padding that gets inserted between v3 and i to align i at a 32-bit boundary. There are two bytes of it.
Wikipedia explains the padding and alignment reasonably clearly.
In your first struct, since every item is of size short, the whole struct can be aligned on short boundaries, so it doesn't need any padding at the end.
In the second struct, the int (presumably 32 bits) needs to be word aligned, so the compiler inserts padding between v3 and i to align i.
The standard doesn't say much about the layout of structs with complete types; it's up to the compiler. It decided that it needs the int to start on a boundary to access it, but since it has to do sub-boundary memory addressing for the shorts, there is no need to pad them.
Sounds like it's being aligned to boundaries based on the size of each variable, so that the address is a multiple of the size being accessed (so shorts are aligned to 2, ints to 4, etc.). If you moved one of the shorts after the int, no padding would be needed before the int, though sizeof(mystruct) would typically still be 12 rather than 10, because trailing padding keeps the size a multiple of the struct's 4-byte alignment. Of course, this all depends on the compiler being used and on the settings it is using in turn.