GCC, -O2, and bit-fields - is this a bug or a feature?

Posted 2024-09-01 18:33:47

Today I discovered alarming behavior when experimenting with bit fields. For the sake of discussion and simplicity, here's an example program:

#include <stdio.h>

struct Node
{
  // two signed 16-bit fields
  int a:16 __attribute__ ((packed));
  int b:16 __attribute__ ((packed));

  // 27 + 3 + 2 = 32 bits of unsigned fields
  unsigned int c:27 __attribute__ ((packed));
  unsigned int d:3 __attribute__ ((packed));
  unsigned int e:2 __attribute__ ((packed));
};

int main (int argc, char *argv[])
{
  Node n;
  n.a = 12345;
  n.b = -23456;
  n.c = 0x7ffffff;
  n.d = 0x7;  // largest value a 3-bit unsigned field can hold
  n.e = 0x3;

  printf("3-bit field cast to int: %d\n",(int)n.d);

  n.d++;  // overflows the 3-bit field; expected to wrap to 0

  printf("3-bit field cast to int: %d\n",(int)n.d);

  return 0;
}

The program purposely overflows the 3-bit bit-field. Here's the (correct) output when compiled using "g++ -O0":

3-bit field cast to int: 7

3-bit field cast to int: 0

Here's the output when compiled using "g++ -O2" (and -O3):

3-bit field cast to int: 7

3-bit field cast to int: 8

Checking the assembly of the latter example, I found this:

movl    $7, %esi
movl    $.LC1, %edi
xorl    %eax, %eax
call    printf
movl    $8, %esi
movl    $.LC1, %edi
xorl    %eax, %eax
call    printf
xorl    %eax, %eax
addq    $8, %rsp

The optimizer has simply inserted the constant 8, assuming 7+1=8, when in fact the 3-bit field overflows and the result is zero.
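
Until the compiler question is settled, one way to avoid depending on how the bit-field increment is compiled is to do the wraparound explicitly. A minimal sketch, assuming a helper named increment_mod8 (a hypothetical name, not part of the program above) that does the addition in a full-width unsigned int and masks the result to three bits before it is stored:

```cpp
#include <cassert>

// Hypothetical helper: next value of a 3-bit unsigned counter.
// The addition happens in a full-width unsigned int and the mask
// keeps only the low three bits, so 7 wraps around to 0.
unsigned int increment_mod8(unsigned int value)
{
  assert(value <= 0x7u);        // precondition: value fits in 3 bits
  return (value + 1u) & 0x7u;   // explicit modulo-8 arithmetic
}
```

With this, n.d = increment_mod8(n.d); would stand in for n.d++;. Whether that sidesteps the -O2 result shown above has not been tested here, but the value being assigned is 0 rather than 8 by construction.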

Fortunately, the code I care about doesn't overflow as far as I know, but this situation scares me. Is this a known bug, a feature, or expected behavior? When can I expect GCC to get this right?

Edit (re: signed/unsigned):

It's being treated as unsigned because it's declared as unsigned. If you declare it as int instead, you get this output (with -O0):

3-bit field cast to int: -1

3-bit field cast to int: 0

An even funnier thing happens with -O2 in this case:

3-bit field cast to int: 7

3-bit field cast to int: 8

I admit that the attribute is a fishy thing to use; what concerns me here is the difference between optimization settings.

Comments (1)

天涯沦落人 2024-09-08 18:33:47

If you want to get technical, the minute you used __attribute__ (an identifier containing two consecutive underscores), your code had undefined behavior.

If you get the same behavior with those removed, it looks to me like a compiler bug. The fact that a 3-bit field prints as 7 means that it's being treated as unsigned, so when you overflow it, it should behave like any other unsigned type and give you modulo arithmetic.
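
A sketch of that experiment, assuming a struct named PlainNode and a helper named increment_d (both hypothetical names) that mirror the question's layout with the attributes removed. Taking the starting value as a function argument is meant to make it harder for the optimizer to fold the whole computation into a constant, although inlining can still allow that:

```cpp
#include <cassert>

// Same field widths as the question's Node, without __attribute__ ((packed)).
struct PlainNode
{
  int a:16;
  int b:16;

  unsigned int c:27;
  unsigned int d:3;
  unsigned int e:2;
};

// Hypothetical helper: store `start` in the 3-bit field, increment it,
// and return what the field holds afterwards.
unsigned int increment_d(unsigned int start)
{
  assert(start <= 0x7u);  // precondition: start fits in the 3-bit field
  PlainNode n = {};       // zero-initialize all fields
  n.d = start;
  n.d++;
  return n.d;
}
```

Calling increment_d(7) and printing the result at both -O0 and -O2 is expected to give 0 each time; a result of 8 at -O2 would point at the compiler rather than at the attribute.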

It would also be legitimate for it to treat the bit-field as signed. In this case the first result would be -1, -3 or -0 (which might print as just 0), and the second undefined (since overflow of a signed integer gives undefined behavior). In theory, other values might be possible under C89 or the current C++ standard since they don't limit the representations of signed integers. In C99 or C++0x, it can only be those three (C99 limits signed integers to one's complement, two's complement or sign-magnitude and C++0x is based on C99 instead of C90).
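
To make the three candidates concrete, here is a small sketch that decodes a bit pattern under each representation. The function names are made up for illustration, and the code only models the arithmetic; it does not change how the compiler itself treats a signed bit-field:

```cpp
#include <cassert>

// Interpret the low `width` bits of `bits` as a two's complement value.
int decode_twos_complement(unsigned int bits, unsigned int width)
{
  assert(width >= 2 && width < 31);
  const unsigned int sign = 1u << (width - 1);
  const int magnitude = static_cast<int>(bits & (sign - 1u));
  return (bits & sign) ? magnitude - static_cast<int>(sign) : magnitude;
}

// One's complement: a negative value is the bitwise inverse of its magnitude.
int decode_ones_complement(unsigned int bits, unsigned int width)
{
  assert(width >= 2 && width < 31);
  const unsigned int mask = (1u << width) - 1u;
  const unsigned int sign = 1u << (width - 1);
  if (bits & sign)
    return -static_cast<int>(~bits & mask);
  return static_cast<int>(bits & mask);
}

// Sign-magnitude: the top bit is the sign, the remaining bits the magnitude.
int decode_sign_magnitude(unsigned int bits, unsigned int width)
{
  assert(width >= 2 && width < 31);
  const unsigned int sign = 1u << (width - 1);
  const int magnitude = static_cast<int>(bits & (sign - 1u));
  return (bits & sign) ? -magnitude : magnitude;
}
```

For bits = 0x7 and width = 3, these return -1, 0 (negative zero has no distinct int value, so it shows up as plain 0) and -3 respectively.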

Oops: I didn't pay close enough attention. Since it's defined as unsigned, it has to be treated as unsigned, which leaves little wiggle room for this being anything other than a compiler bug.
