What is VC++ doing when packing bitfields?

Posted 2024-09-27 08:19:27 · 1723 words · 7 views · 0 comments

To clarify my question, let's start off with an example program:

#include <stdio.h>

#pragma pack(push,1)
struct cc {
    unsigned int a   :  3;  
    unsigned int b   : 16;
    unsigned int c   :  1;
    unsigned int d   :  1;
    unsigned int e   :  1;
    unsigned int f   :  1;
    unsigned int g   :  1;
    unsigned int h   :  1;
    unsigned int i   :  6;  
    unsigned int j   :  6;  
    unsigned int k   :  4;  
    unsigned int l   : 15;
};
#pragma pack(pop)

struct cc c;

int main(int argc, char **argv)
{
    /* sizeof yields a size_t, so cast it to match the %u format. */
    printf("%u\n", (unsigned)sizeof(c));
    return 0;
}

The output is "8", meaning that the 56 bits (7 bytes) I want to pack are being packed into 8 bytes, seemingly wasting a whole byte. Curious about how the compiler was laying these bits out in memory, I tried writing specific values to &c, e.g.:

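int main(int argc, char **argv)
{
    /* View the struct's storage as one 64-bit integer (sizeof(c) is 8 here). */
    unsigned long long int *pint = (unsigned long long int *)&c;
    *pint = 0xFFFFFFFF;
    printf("c.a = %d\n", c.a);
    ...
    printf("c.l = %d\n", c.l);
    return 0;
}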

Predictably, on x86_64 using Visual Studio 2010, the following happens:

*pint = 0x00000000 000000FF :

c[0].a = 7
c[0].b = 1
c[0].c = 1
c[0].d = 1
c[0].e = 1
c[0].f = 1
c[0].g = 0
c[0].h = 0
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0

*pint = 0x00000000 0000FF00 :

c[0].a = 0
c[0].b = 0
c[0].c = 0
c[0].d = 0
c[0].e = 0
c[0].f = 0
c[0].g = 1
c[0].h = 127
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0


*pint = 0x00000000 00FF0000 :

c[0].a = 0
c[0].b = 0
c[0].c = 0
c[0].d = 0
c[0].e = 0
c[0].f = 0
c[0].g = 0
c[0].h = 32640
c[0].i = 0
c[0].j = 0
c[0].k = 0
c[0].l = 0

etc.

Forget portability for a moment and assume you care about one CPU, one compiler, and one runtime environment. Why can't VC++ pack this structure into 7 bytes? Is it a word-length thing? The MSDN docs on #pragma pack say "the alignment of a member will be on a boundary that is either a multiple of n [1 in my case] or a multiple of the size of the member, whichever is smaller." Can anyone give me some idea of why I get a sizeof of 8 and not 7?

千仐 2024-10-04 08:19:27

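MSVC++ always allocates at least one unit of memory corresponding to the type you used for the bit-field. You used unsigned int, so one unsigned int is allocated initially, and another unsigned int is allocated once the first one is exhausted. There is no way to force MSVC++ to trim the unused portion of the second unsigned int.

Basically, MSVC++ interprets your unsigned int as a way of expressing the alignment requirements for the entire structure.

Use smaller types for your bit-fields (unsigned short and unsigned char) and regroup the bit-fields so that they fill each allocated unit entirely. That way you should be able to pack things as tightly as possible.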


梦行七里 2024-10-04 08:19:27

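Bit-fields are stored in the type that you declare them with. Since you are using unsigned int and the fields do not all fit in a single unsigned int, the compiler has to use a second unsigned int for the fields that do not fit in the first.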


梦魇绽荼蘼 2024-10-04 08:19:27

Well, you are using unsigned int, which happens to be 32 bits in this case. The next unsigned int boundary that the bit-fields fit within is 64 bits => 8 bytes.


七禾 2024-10-04 08:19:27

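pst is right. The members are aligned on 1-byte boundaries (or smaller, since they are bit-fields). The structure as a whole has a size of 8 and is aligned on an 8-byte boundary. This complies with both the standard and the pack option. The documentation never says there will be no padding at the end.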


梦冥 2024-10-04 08:19:27

To give another interesting illustration of what's going on, consider the case where you want to pack a structure that crosses a type boundary. E.g.

struct state {
    unsigned int cost     : 24; 
    unsigned int back     : 21; 
    unsigned int a        :  1; 
    unsigned int b        :  1; 
    unsigned int c        :  1;
};

This structure can't be packed into 6 bytes using MSVC as far as I know. However, we can get the desired packing effect by breaking up the first two fields:

struct state_packed {
    unsigned short cost_1   : 16; 
    unsigned char  cost_2   :  8;
    unsigned short back_1   : 16; 
    unsigned char  back_2   :  5;
    unsigned char  a        :  1; 
    unsigned char  b        :  1; 
    unsigned char  c        :  1; 
};

This can indeed be packed into 6 bytes. However, accessing the original cost field is extremely awkward and ugly. One method is to cast a state_packed pointer to a specialized dummy struct:

struct state_cost {
    unsigned int cost     : 24;
    unsigned int junk     :  8; 
};

state_packed    sc;
state_packed *p_sc = &sc;

sc.a = 1;
(*(struct state_cost *)p_sc).cost = 12345;
sc.b = 1;

If anyone knows a more elegant way of doing this, I would love to know!
