Is it always undefined behaviour to copy the bits of a variable through an incompatible pointer?

Posted 2025-01-18 13:59:48 · 1740 characters · 4 views · 0 comments

For example, can this

unsigned f(float x) {
    unsigned u = *(unsigned *)&x;
    return u;
}

cause unpredictable results on a platform where,

  • unsigned and float are both 32-bit
  • a pointer has a fixed size for all types
  • unsigned and float can be stored to and loaded from the same part of memory.

I know about strict aliasing rules, but most examples showing problematic cases of violating strict aliasing look like the following.

static int g(int *i, float *f) {
    *i = 1;
    *f = 0;
    return *i;
}

int h() {
    int n;
    return g(&n, (float *)&n);
}

In my understanding, the compiler is free to assume that i and f are implicitly restrict. The return value of h could be 1 if the compiler thinks *f = 0; is redundant (because i and f can't alias), or it could be 0 if it takes into account that the values of i and f are the same. This is undefined behaviour, so technically, anything else can happen.

However, the first example is a bit different.

unsigned f(float x) {
    unsigned u = *(unsigned *)&x;
    return u;
}

Sorry for my unclear wording, but everything is done "in-place". I can't think of any other way the compiler might interpret the line unsigned u = *(unsigned *)&x;, other than "copy the bits of x to u".

In practice, all the compilers for various architectures that I tested on https://godbolt.org/ with full optimization produce the same result for the first example, and varying results (either 0 or 1) for the second example.

I know it's technically possible that unsigned and float have different sizes and alignment requirements, or should be stored in different memory segments. In that case even the first code won't make sense. But on most modern platforms where the following holds, is the first example still undefined behaviour (can it produce unpredictable results)?

  • unsigned and float are both 32-bit
  • a pointer has a fixed size for all types
  • unsigned and float can be stored to and loaded from the same part of memory.

In real code, I do write

unsigned f(float x) {
    unsigned u;
    memcpy(&u, &x, sizeof(x));
    return u;
}

The compiled result is the same as using pointer casting, after optimization. This question is about interpretation of the standard about strict aliasing rules for code such as the first example.

Comments (3)

紧拥背影 2025-01-25 13:59:48

Is it always undefined behaviour to copy the bits of a variable through an incompatible pointer?

Yes.

The rule is https://port70.net/~nsz/c/c11/n1570.html#6.5p7 :

An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the
    object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the
    effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its
    members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

The effective type of the object x is float - it is defined with that type.

  • unsigned is not compatible with float,
  • unsigned is not a qualified version of float,
  • unsigned is not the signed or unsigned type corresponding to float,
  • unsigned is not the signed or unsigned type corresponding to a qualified version of float,
  • unsigned is not an aggregate or union type,
  • and unsigned is not a character type.

The "shall" is violated, so it is undefined behavior (see https://port70.net/~nsz/c/c11/n1570.html#4p2 ). There is no other interpretation.

We also have https://port70.net/~nsz/c/c11/n1570.html#J.2 :

The behavior is undefined in the following circumstances:

  • An object has its stored value accessed other than by an lvalue of an allowable type (6.5).
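For contrast, here is a minimal sketch of two ways to copy the bits that stay within the list quoted above: memcpy, and byte-wise access through a character type (the last bullet of 6.5p7). The function names float_bits_memcpy and float_bits_bytes are made up for illustration, and the sketch assumes float is 32 bits wide.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>
#include <string.h>   /* memcpy, size_t */

/* Assumption of this sketch: float and uint32_t have the same size. */
static_assert(sizeof(uint32_t) == sizeof(float),
              "sketch assumes a 32-bit float");

/* memcpy copies the object representation; the float object x is
   never accessed through an lvalue of type uint32_t. */
uint32_t float_bits_memcpy(float x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Accessing any object through unsigned char is explicitly allowed,
   so copying byte by byte is also fine. */
uint32_t float_bits_bytes(float x) {
    const unsigned char *src = (const unsigned char *)&x;
    uint32_t u;
    unsigned char *dst = (unsigned char *)&u;
    for (size_t i = 0; i < sizeof u; i++)
        dst[i] = src[i];
    return u;
}
```

Both functions return the same value for the same input; on an implementation using IEEE 754 single precision, float_bits_memcpy(1.0f) would be 0x3f800000.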

夜声 2025-01-25 13:59:48

As Kamil explains, it's UB. Even int and long (or long and long long) aren't alias-compatible, even when they're the same size. (But interestingly, unsigned int is alias-compatible with int.)

It has nothing to do with being the same size, or with using the same register set as suggested in a comment. It's mainly a way to let compilers assume that pointers to different types don't point to overlapping memory when optimizing. They still have to support C99 union type-punning, not just memcpy. So, for example, a dst[i] = src[i] loop doesn't need to check for possible overlap when unrolling or vectorizing if dst and src have different types (see footnote 1).

If you're accessing the same integer data, the standard requires that you use the exact same type, modulo only things like signed vs. unsigned and const. Or that you use (unsigned) char*, which is like GNU C __attribute__((may_alias)).
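As a rough illustration of the union type-punning mentioned above, here is a minimal sketch in C (not C++, where the rules differ). The union tag float_or_u32 and both function names are invented for this example, and it assumes a 32-bit float.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>

/* Assumption of this sketch: float and uint32_t have the same size. */
static_assert(sizeof(uint32_t) == sizeof(float),
              "sketch assumes a 32-bit float");

/* Hypothetical helper type: both members share the same storage. */
union float_or_u32 {
    float    f;
    uint32_t u;
};

/* Write one member, read the other: the bytes of f are
   reinterpreted as a uint32_t. */
uint32_t float_bits_union(float x) {
    union float_or_u32 pun;
    pun.f = x;
    return pun.u;
}

/* The reverse direction: build a float from a bit pattern. */
float float_from_bits_union(uint32_t bits) {
    union float_or_u32 pun;
    pun.u = bits;
    return pun.f;
}
```

Every access here goes through the union object itself. Taking pointers to the members and passing them to a separate function would bring the aliasing question back.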


The other part of your question seems to be why it appears to work in practice, despite the UB.
Your godbolt link forgot to link the actual compilers you tried.

https://godbolt.org/z/rvj3d4e4o shows GCC4.1, from before GCC went out of its way to support "obvious" local compile-time-visible cases like this, so that it sometimes doesn't break people's buggy code that uses non-portable idioms like this one.
It loads garbage from stack memory, unless you use -fno-strict-aliasing to make it store the float to that location first. (Store/reload instead of movd %xmm0, %eax is a missed-optimization bug that's been fixed in later GCC versions for most cases.)

f:     # GCC4.1 -O3
        movl    -4(%rsp), %eax
        ret
f:    # GCC4.1 -O3 -fno-strict-aliasing
        movss   %xmm0, -4(%rsp)
        movl    -4(%rsp), %eax
        ret

Even that old GCC version prints "warning: dereferencing type-punned pointer will break strict-aliasing rules", which should make it obvious that GCC notices this and does not consider it well-defined. Later GCC versions that do choose to support this code still warn.

It's debatable whether it's better to sometimes work in simple cases but break at other times, vs. always failing. But given that GCC -Wall does still warn about it, that's probably a good tradeoff for the convenience of people dealing with legacy code or porting from MSVC. Another option would be to always break it unless people use -fno-strict-aliasing, which they should if dealing with codebases that depend on this behaviour.


Being UB doesn't mean required-to-fail

Just the opposite; it would take tons of extra work to actually trap on every signed overflow in the C abstract machine, for example, especially when optimizing stuff like 2 + c - 3 into c - 1. That's what gcc -fsanitize=undefined tries to do, adding x86 jo instructions after additions (except that it still does constant-propagation, so it just adds -1 and doesn't detect the temporary overflow at INT_MAX: https://godbolt.org/z/WM9jGT3ac). And it seems strict-aliasing is not one of the kinds of UB it tries to detect at run time.

See also the LLVM (clang) blog article: What Every C Programmer Should Know About Undefined Behavior


An implementation is free to define behaviour the ISO C standard leaves undefined

For example, MSVC always defines this aliasing behaviour, like GCC/clang/ICC do with -fno-strict-aliasing. Of course, that doesn't change the fact that pure ISO C leaves it undefined.

It just means that on those specific C implementations, the code is guaranteed to work the way you want, rather than happening to do so by chance or by de-facto compiler behaviour if it's simple enough for modern GCC to recognize and do the more "friendly" thing.

Just like gcc -fwrapv for signed-integer overflows.


Footnote 1: example of strict-aliasing helping code-gen
#define QUALIFIER // restrict

void convert(float *QUALIFIER pf, const int *pi) {
    for(int i=0 ; i<10240 ; i++){
        pf[i] = pi[i];
    }
}

Godbolt shows that with the -O3 defaults for GCC11.2 for x86-64, we get just a SIMD loop with movdqu / cvtdq2ps / movups and loop overhead. With -O3 -fno-strict-aliasing, we get two versions of the loop, and an overlap check to see if we can run the scalar or the SIMD version.

Are there actual cases where strict aliasing helps better code generation, in which the same cannot be achieved with restrict?

You might well have a pointer that might point into either of two int arrays, but definitely not at any float variable, so you can't use restrict on it. Strict-aliasing will let the compiler still avoid spill/reload of float objects around stores through the pointer, even if the float objects are global vars or otherwise aren't provably local to the function. (Escape analysis.)
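A minimal sketch of that situation, with invented names (gain, table_a, table_b, scale_ints, scale_one_table): the pointer p may point into either int array, and gain is a global float. Under strict aliasing, a store through an int * cannot modify a float object, so a compiler is allowed to load gain once before the loop instead of reloading it after every store. Whether a given compiler actually does so is not something this sketch demonstrates.

```c
#include <stddef.h>

float gain = 1.5f;                 /* global: not provably local */
int table_a[4] = {2, 4, 6, 8};
int table_b[4] = {10, 20, 30, 40};

/* p may point into table_a or table_b, but under strict aliasing a
   store to p[i] cannot change gain, because gain is a float. */
void scale_ints(int *p, size_t n) {
    for (size_t i = 0; i < n; i++)
        p[i] = (int)((float)p[i] * gain);
}

/* The caller picks the array at run time. */
void scale_one_table(int use_b) {
    scale_ints(use_b ? table_b : table_a, 4);
}
```

With -fno-strict-aliasing the compiler has to assume p[i] might overlap gain, since nothing else in scale_ints rules that out.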

Or a struct node * that definitely isn't the same type as the payload in a tree.

Also, most code doesn't use restrict all over the place. It could get quite cumbersome. Not just in loops, but in every function that deals with pointers to structs. And if you get it wrong and promise something that's not true, your code's broken.

打小就很酷 2025-01-25 13:59:48

The Standard was never intended to fully, accurately, and unambiguously partition programs that have defined behavior and those that don't(*), but instead relies upon compiler writers to exercise a certain amount of common sense.

(*) If it was intended for that purpose, it fails miserably, as evidenced by the amount of confusion stemming from it.

Consider the following two code snippets:

/* Assume suitable declarations of u are available everywhere */
union test { uint32_t ww[4]; float ff[4]; } u;

/* Snippet #1 */
uint32_t proc1(int i, int j)
{
  u.ww[i] = 1;
  u.ff[j] = 2.0f;
  return u.ww[i];
}

/* Snippet #2, part 1, in one compilation unit */
uint32_t proc2a(uint32_t *p1, float *p2)
{
  *p1 = 1;
  *p2 = 2.0f;
  return *p1;
}

/* Snippet #2, part 2, in another compilation unit */
uint32_t proc2(int i, int j)
{
  return proc2a(u.ww+i, u.ff+j);
}

It is clear that the authors of the Standard intended that the first version of the code be processed meaningfully on platforms where that would make sense, but it's also clear that at least some of the authors of C99 and later versions did not intend to require that the second version be processed likewise (some of the authors of C89 may have intended that the "strict aliasing rule" only apply to situations where a directly named object would be accessed via pointer of another type, as shown in the example given in the published Rationale; nothing in the Rationale suggests a desire to apply it more broadly).

On the other hand, the Standard defines the [] operator in such a fashion that proc1 is semantically equivalent to:

uint32_t proc3(int i, int j)
{
  *(u.ww+i) = 1;
  *(u.ff+j) = 2.0f;
  return *(u.ww+i);
}

and there's nothing in the Standard that would imply that proc() shouldn't have the same semantics. What gcc and clang seem to do is special-case the [] operator as having a different meaning from pointer dereferencing, but nothing in the Standard makes such a distinction. The only way to consistently interpret the Standard is to recognize that the form with [] falls in the category of actions which the Standard doesn't require that implementations process meaningfully, but relies upon them to handle anyway.

Constructs such as your example, which uses a directly-cast pointer to access storage associated with an object of the original pointer's type, fall in a similar category: constructs which at least some authors of the Standard likely expected (and would have demanded, if they didn't expect) that compilers would handle reliably, with or without a mandate, since there was no imaginable reason why a quality compiler would do otherwise. Since then, however, clang and gcc have evolved to defy such expectations. Even if clang and gcc would normally generate useful machine code for a function, they seek to perform aggressive inter-procedural optimizations that make it impossible to predict what constructs will be 100% reliable. Unlike some compilers which refrain from applying potential optimizing transforms unless they can prove that they are sound, clang and gcc seek to perform transforms that can't be proven to affect program behavior.
