What is VC++ doing when packing bitfields?
To clarify my question, let's start off with an example program:
#include <stdio.h>

#pragma pack(push,1)
struct cc {
    unsigned int a : 3;
    unsigned int b : 16;
    unsigned int c : 1;
    unsigned int d : 1;
    unsigned int e : 1;
    unsigned int f : 1;
    unsigned int g : 1;
    unsigned int h : 1;
    unsigned int i : 6;
    unsigned int j : 6;
    unsigned int k : 4;
    unsigned int l : 15;
};
#pragma pack(pop)

struct cc c;

int main(int argc, char **argv)
{
    /* sizeof yields a size_t, so cast it to match the format specifier */
    printf("%u\n", (unsigned)sizeof(c));
    return 0;
}
The output is "8", meaning that the 56 bits (7 bytes) I want to pack are being packed into 8 bytes, seemingly wasting a whole byte. Curious about how the compiler was laying these bits out in memory, I tried writing specific values to &c, e.g.:
int main(int argc, char **argv)
{
    /* explicit cast: &c is a struct cc *, not an unsigned long long * */
    unsigned long long int *pint = (unsigned long long int *)&c;
    *pint = 0xFFFFFFFF;
    printf("c.a = %d", c.a);
    ...
    printf("c.l = %d", c.l);
}
Predictably, on x86_64 using Visual Studio 2010, the following happens:
*pint = 0x00000000 000000FF :
c[0].a = 7
c[0].b = 1
c[0].c = 1
c[0].d = 1
c[0].e = 1
c[0].f = 1
c[0].g = 0
c[0].h = 0
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0
*pint = 0x00000000 0000FF00 :
c[0].a = 0
c[0].b = 0
c[0].c = 0
c[0].d = 0
c[0].e = 0
c[0].f = 0
c[0].g = 1
c[0].h = 127
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0
*pint = 0x00000000 00FF0000 :
c[0].a = 0
c[0].b = 0
c[0].c = 0
c[0].d = 0
c[0].e = 0
c[0].f = 0
c[0].g = 0
c[0].h = 32640
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0
etc.
Forget portability for a moment and assume you care about one CPU, one compiler, and one runtime environment. Why can't VC++ pack this structure into 7 bytes? Is it a word-length thing? The MSDN documentation on #pragma pack says "the alignment of a member will be on a boundary that is either a multiple of n [1 in my case] or a multiple of the size of the member, whichever is smaller." Can anyone give me some idea of why I get a sizeof of 8 and not 7?
Comments (5)
MSVC++ always allocates at least one unit of memory corresponding to the type you used for your bit-field. You used unsigned int, meaning that one unsigned int is allocated initially, and another unsigned int is allocated when the first one is exhausted. There's no way to force MSVC++ to trim the unused portion of the second unsigned int.
Basically, MSVC++ interprets your unsigned int as a way to express the alignment requirements for the entire structure.
Use smaller types for your bit-fields (unsigned short and unsigned char) and regroup the bit-fields so that they fill each allocated unit entirely. That way you should be able to pack things as tightly as possible.

Bitfields are stored in the type that you define. Since you are using unsigned int, and the fields won't fit in a single unsigned int, the compiler must use a second integer and store the last 24 bits in that last integer.

Well, you are using unsigned int, which happens to be 32 bits in this case. The next unsigned int boundary (to fit in the bitfield) is at 64 bits => 8 bytes.
pst is right. The members are aligned on 1-byte boundaries (or smaller, since they are bit-fields). The overall structure has size 8, and is aligned on an 8-byte boundary. This complies with both the standard and the pack option. The docs never say there will be no padding at the end.

To give another interesting illustration of what's going on, consider the case where you want to pack a structure that crosses a type boundary. E.g.
As far as I know, this structure can't be packed into 6 bytes using MSVC. However, we can get the desired packing effect by breaking up the first two fields:
This can indeed be packed into 6 bytes. However, accessing the original cost field is extremely awkward and ugly. One method is to cast a state_packed pointer to a specialized dummy struct:
If anyone knows a more elegant way of doing this, I would love to know!