GCC, -O2, and bitfields - is this a bug or a feature?

Posted on 2024-09-01 18:33:47

Today I discovered alarming behavior when experimenting with bit fields. For the sake of discussion and simplicity, here's an example program:

#include <stdio.h>

struct Node
{
  int a:16 __attribute__ ((packed));
  int b:16 __attribute__ ((packed));

  unsigned int c:27 __attribute__ ((packed));
  unsigned int d:3 __attribute__ ((packed));
  unsigned int e:2 __attribute__ ((packed));
};

int main (int argc, char *argv[])
{
  Node n;
  n.a = 12345;
  n.b = -23456;
  n.c = 0x7ffffff;
  n.d = 0x7;
  n.e = 0x3;

  printf("3-bit field cast to int: %d\n",(int)n.d);

  n.d++;  

  printf("3-bit field cast to int: %d\n",(int)n.d);
}

The program is purposely causing the 3-bit bit-field to overflow. Here's the (correct) output when compiled using "g++ -O0":

3-bit field cast to int: 7
3-bit field cast to int: 0

Here's the output when compiled using "g++ -O2" (and -O3):

3-bit field cast to int: 7
3-bit field cast to int: 8

Checking the assembly of the latter example, I found this:

movl    $7, %esi
movl    $.LC1, %edi
xorl    %eax, %eax
call    printf
movl    $8, %esi
movl    $.LC1, %edi
xorl    %eax, %eax
call    printf
xorl    %eax, %eax
addq    $8, %rsp

The optimizer has simply inserted the constant 8, assuming 7+1=8, when in fact the 3-bit field wraps around to zero.

Fortunately the code I care about doesn't overflow as far as I know, but this situation scares me - is this a known bug, a feature, or is this expected behavior? When can I expect gcc to be right about this?

Edit (re: signed/unsigned):

It's being treated as unsigned because it's declared as unsigned. Declaring it as int gives this output (with -O0):

3-bit field cast to int: -1
3-bit field cast to int: 0

An even funnier thing happens with -O2 in this case:

3-bit field cast to int: 7
3-bit field cast to int: 8

I admit that attribute is a fishy thing to use; in this case it's a difference in optimization settings I'm concerned about.

天涯沦落人 2024-09-08 18:33:47

If you want to get technical, the moment you use __attribute__ (an identifier containing two consecutive underscores), your code has undefined behavior.

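If you get the same behavior with those removed, it looks to me like a compiler bug. The fact that the 3-bit field reads back as 7 means it is being treated as unsigned, so when it overflows it should behave like any other unsigned and give you modulo arithmetic.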
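As a rough sketch of what modulo arithmetic means for this field, the snippet below increments a 3-bit unsigned bit-field and compares the result with an explicit mask. The Flags struct and the function names are made up for illustration, the packed attribute is deliberately left out, and the volatile object is only there on the assumption that it discourages the compiler from folding the whole computation at compile time.

```cpp
#include <cassert>

// Hypothetical struct for illustration: the same 3-bit unsigned field as in
// the question, but without the packed attribute.
struct Flags
{
  unsigned int d:3;
};

// Increment through the bit-field itself. The volatile object is an
// assumption-laden way to keep the store and load from being folded away.
unsigned int increment_via_bitfield(unsigned int start)
{
  volatile Flags f;
  f.d = start;      // stores start modulo 8
  f.d = f.d + 1;    // the sum is computed in int, then reduced modulo 8 on store
  return f.d;
}

// Reference result: the same modulo-8 arithmetic written with an explicit mask.
unsigned int increment_via_mask(unsigned int start)
{
  return ((start & 0x7u) + 1u) & 0x7u;
}

// Compare both versions for every value the 3-bit field can hold.
void check_wraparound()
{
  for (unsigned int v = 0; v < 8; ++v)
    assert(increment_via_bitfield(v) == increment_via_mask(v));
}
```

Calling check_wraparound() from main at both -O0 and -O2 should pass silently on a conforming compiler. In particular, increment_via_bitfield(7) should come back as 0, not 8.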

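It would also be legitimate to treat a bit-field as signed. In that case, the first result would be -1, -3, or -0 (which might print as just 0), and the second would be undefined (since overflow of a signed integer gives undefined behavior). In theory, other values might be possible under C89 or the current C++ standard, since they don't limit the representation of signed integers. In C99 or C++0x, it can only be one of those three (C99 limits signed integers to one's complement, two's complement, or sign-magnitude, and C++0x is based on C99 rather than C90).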
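To make the three candidate values concrete, here is a small sketch that decodes the all-ones 3-bit pattern under each representation C99 allows. The helper names are hypothetical, and the decoding is done by hand on an unsigned value rather than relying on how any particular compiler lays out a signed bit-field.

```cpp
#include <cassert>

// Each helper interprets the low 3 bits of `bits` (bit 2 is the sign bit).
// All names here are invented for this sketch.

int as_twos_complement(unsigned int bits)
{
  bits &= 0x7u;
  // A set sign bit contributes -4, so subtract 8 from the raw pattern.
  return (bits & 0x4u) ? static_cast<int>(bits) - 8 : static_cast<int>(bits);
}

int as_ones_complement(unsigned int bits)
{
  bits &= 0x7u;
  // A set sign bit means: negate the bitwise complement of the pattern.
  return (bits & 0x4u) ? -static_cast<int>(~bits & 0x7u) : static_cast<int>(bits);
}

int as_sign_magnitude(unsigned int bits)
{
  bits &= 0x7u;
  int magnitude = static_cast<int>(bits & 0x3u);
  return (bits & 0x4u) ? -magnitude : magnitude;
}

// The pattern 111 is what a 3-bit field holds after storing 7.
void check_all_ones_pattern()
{
  assert(as_twos_complement(0x7u) == -1);
  assert(as_ones_complement(0x7u) == 0);   // negative zero collapses to 0 in an int
  assert(as_sign_magnitude(0x7u) == -3);
}
```

The two's complement result, -1, matches the -O0 output reported in the question for the field declared as int.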

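Oops: I didn't pay close enough attention. Since it's declared as unsigned, it has to be treated as unsigned, which leaves little wiggle room for this being anything other than a compiler bug.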