什么是 VC++打包位域时做什么?

发布于 2024-09-27 08:19:27 字数 1723 浏览 4 评论 0原文

为了澄清我的问题,让我们从一个示例程序开始:

#include <stdio.h>

#pragma pack(push,1)
struct cc {
    unsigned int a   :  3;  
    unsigned int b   : 16;
    unsigned int c   :  1;
    unsigned int d   :  1;
    unsigned int e   :  1;
    unsigned int f   :  1;
    unsigned int g   :  1;
    unsigned int h   :  1;
    unsigned int i   :  6;  
    unsigned int j   :  6;  
    unsigned int k   :  4;  
    unsigned int l   : 15;
};
#pragma pack(pop)

struct cc c;

int main(int argc, char **argv)

{   printf("%d\n",sizeof(c));
}

输出是“8”,这意味着我要打包的56位(7字节)被打包成8字节,看起来浪费了整个字节。出于对编译器如何在内存中放置这些位的好奇,我尝试将特定值写入 &c,例如:

int main(int argc, char **argv)

{
unsigned long long int* pint = &c;
*pint = 0xFFFFFFFF;
printf("c.a = %d", c.a);
...
printf("c.l = %d", c.l);
}

可以预见的是,在 x86_64 上使用 Visual Studio 2010 年,会发生以下情况:

*pint = 0x00000000 000000FF :

c[0].a = 7
c[0].b = 1
c[0].c = 1
c[0].d = 1
c[0].e = 1
c[0].f = 1
c[0].g = 0
c[0].h = 0
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0

*pint = 0x00000000 0000FF00 :

c[0].a = 0
c[0].b = 0
c[0].c = 0
c[0].d = 0
c[0].e = 0
c[0].f = 0
c[0].g = 1
c[0].h = 127
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0


*pint = 0x00000000 00FF0000 :

c[0].a = 0
c[0].b = 0
c[0].c = 0
c[0].d = 0
c[0].e = 0
c[0].f = 0
c[0].g = 0
c[0].h = 32640
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0

等等。

暂时忘记可移植性,假设您关心一个 CPU、一个编译器和一个运行时环境。为什么VC++不能把这个结构打包成7个字节呢?是字长的问题吗? 上的 MSDN 文档 #pragma pack 表示“成员的对齐方式将位于 n 的倍数 [在我的例子中为 1] 或成员大小的倍数(以较小者为准)的边界上。”谁能告诉我为什么我得到的是 8 号而不是 7 号?

To clarify my question, let's start off with an example program:

#include <stdio.h>

#pragma pack(push,1)
struct cc {
    unsigned int a   :  3;  
    unsigned int b   : 16;
    unsigned int c   :  1;
    unsigned int d   :  1;
    unsigned int e   :  1;
    unsigned int f   :  1;
    unsigned int g   :  1;
    unsigned int h   :  1;
    unsigned int i   :  6;  
    unsigned int j   :  6;  
    unsigned int k   :  4;  
    unsigned int l   : 15;
};
#pragma pack(pop)

struct cc c;

int main(int argc, char **argv)

{   printf("%d\n",sizeof(c));
}

The output is "8", meaning that the 56 bits (7 bytes) I want to pack are being packed into 8 bytes, seemingly wasting a whole byte. Curious about how the compiler was laying these bits out in memory, I tried writing specific values to &c, e.g.:

int main(int argc, char **argv)

{
unsigned long long int* pint = &c;
*pint = 0xFFFFFFFF;
printf("c.a = %d", c.a);
...
printf("c.l = %d", c.l);
}

Predictably, on x86_64 using Visual Studio 2010, the following happens:

*pint = 0x00000000 000000FF :

c[0].a = 7
c[0].b = 1
c[0].c = 1
c[0].d = 1
c[0].e = 1
c[0].f = 1
c[0].g = 0
c[0].h = 0
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0

*pint = 0x00000000 0000FF00 :

c[0].a = 0
c[0].b = 0
c[0].c = 0
c[0].d = 0
c[0].e = 0
c[0].f = 0
c[0].g = 1
c[0].h = 127
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0


*pint = 0x00000000 00FF0000 :

c[0].a = 0
c[0].b = 0
c[0].c = 0
c[0].d = 0
c[0].e = 0
c[0].f = 0
c[0].g = 0
c[0].h = 32640
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0

etc.

Forget portability for a moment and assume you care about one CPU, one compiler, and one runtime environment. Why can't VC++ pack this structure into 7 bytes? Is it a word-length thing? The MSDN docs on #pragma pack says "the alignment of a member will be on a boundary that is either a multiple of n [1 in my case] or a multiple of the size of the member, whichever is smaller." Can anyone give me some idea of why I get a sizeof 8 and not 7?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

千仐 2024-10-04 08:19:27

MSVC++ 始终至少分配一个与您用于位字段的类型相对应的内存单元。您使用了unsigned int,这意味着最初会分配一个unsigned int,当第一个unsigned int耗尽时会分配另一个unsigned int。无法强制 MSVC++ 修剪第二个 unsigned int 的未使用部分。

基本上,MSVC++ 将您的 unsigned int 解释为表达整个结构的对齐要求的一种方式。

对位字段使用较小的类型(unsigned shortunsigned char)并重新组合位字段,以便它们完全填充分配的单元 - 这样您应该能够将东西包装得尽可能紧密。

MSVC++ always allocates at least a unit of memory that corresponds to the type you used for your bit-field. You used unsigned int, meaning that a unsigned int is allocated initially, and another unsigned int is allocated when the first one is exhausted. There's no way to force MSVC++ to trim the unused portion of the second unsigned int.

Basically, MSVC++ interprets your unsigned int as a way to express the alignment requirements for the entire structure.

Use smaller types for your bit-fields (unsigned short and unsigned char) and regroup the bit-fields so that they fill the allocated unit entirely - that way you should be able to pack things as tightly as possible.

梦行七里 2024-10-04 08:19:27

位域以您定义的类型存储。由于您使用的是 unsigned int,并且它不适合单个 unsigned int,因此编译器必须使用第二个整数并将最后 24 位存储在最后一个整数中。

Bitfields are stored in the type that you define. Since you are using unsigned int, and it won't fit in a single unsigned int then the compiler must use a second integer and store the last 24 bits in that last integer.

梦魇绽荼蘼 2024-10-04 08:19:27

好吧,您使用的是 unsigned int,在本例中恰好是 32 位。 unsigned int 的下一个边界(适合位域)是 64 位 => 8 字节。

Well you are using unsigned int which happens to be 32 Bit in this case. The next boundary (to fit in the bitfield) for unsigned int is 64 Bit => 8 Bytes.

七禾 2024-10-04 08:19:27

pst 是正确的。 成员在 1 字节边界上对齐(或更小,因为它是位字段)。整个结构的大小为 8,并在 8 字节边界上对齐。这符合标准和 pack 选项。文档从未说过最后不会有填充。

pst is right. The members are aligned on 1-byte boundaries, (or smaller, since it's a bitfield). The overall structure has size 8, and is aligned on an 8-byte boundary. This complies with both the standard and the pack option. The docs never say there will be no padding at the end.

梦冥 2024-10-04 08:19:27

为了给出另一个有趣的说明,请考虑您想要打包跨越类型边界的结构的情况。例如

struct state {
    unsigned int cost     : 24; 
    unsigned int back     : 21; 
    unsigned int a        :  1; 
    unsigned int b        :  1; 
    unsigned int c        :  1;
};

,据我所知,这个结构不能使用 MSVC 打包成 6 个字节。不过,我们可以通过拆开前两个字段来得到想要的打包效果:

struct state_packed {
    unsigned short cost_1   : 16; 
    unsigned char  cost_2   :  8;
    unsigned short back_1   : 16; 
    unsigned char  back_2   :  5;
    unsigned char  a        :  1; 
    unsigned char  b        :  1; 
    unsigned char  c        :  1; 
};

这样确实可以打包成6个字节。然而,访问原始成本字段是极其尴尬和丑陋的。一种方法是将 state_packed 指针转换为专门的虚拟结构:

struct state_cost {
    unsigned int cost     : 24;
    unsigned int junk     :  8; 
};

state_packed    sc;
state_packed *p_sc = ≻

sc.a = 1;
(*(struct state_cost *)p_sc).cost = 12345;
sc.b = 1;

如果有人知道更优雅的方法,我很想知道!

To give another interesting illustrates what's going on, consider the case where you want to pack a structure that crosses a type boundary. E.g.

struct state {
    unsigned int cost     : 24; 
    unsigned int back     : 21; 
    unsigned int a        :  1; 
    unsigned int b        :  1; 
    unsigned int c        :  1;
};

This structure can't be packed into 6 bytes using MSVC as far as I know. However, we can get the desired packing effect by breaking up the first two fields:

struct state_packed {
    unsigned short cost_1   : 16; 
    unsigned char  cost_2   :  8;
    unsigned short back_1   : 16; 
    unsigned char  back_2   :  5;
    unsigned char  a        :  1; 
    unsigned char  b        :  1; 
    unsigned char  c        :  1; 
};

This can indeed be packed into 6 bytes. However, accessing the original cost field is extremely awkward and ugly. One method is to cast a state_packed pointer to a specialized dummy struct:

struct state_cost {
    unsigned int cost     : 24;
    unsigned int junk     :  8; 
};

state_packed    sc;
state_packed *p_sc = ≻

sc.a = 1;
(*(struct state_cost *)p_sc).cost = 12345;
sc.b = 1;

If anyone knows a more elegant way of doing this, I would love to know!

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文