Does printf("%x", 1) invoke undefined behavior?

Posted on 2024-10-11 20:45:33

According to the C standard (6.5.2.2 paragraph 6)

If the expression that denotes the called function has a type that does not include a
prototype, the integer promotions are performed on each argument, and arguments that
have type float are promoted to double. These are called the default argument
promotions. If the number of arguments does not equal the number of parameters, the
behavior is undefined. If the function is defined with a type that includes a prototype, and
either the prototype ends with an ellipsis (, ...) or the types of the arguments after
promotion are not compatible with the types of the parameters, the behavior is undefined.
If the function is defined with a type that does not include a prototype, and the types of
the arguments after promotion are not compatible with those of the parameters after
promotion, the behavior is undefined, except for the following cases:

  • one promoted type is a signed integer type, the other promoted type is the
    corresponding unsigned integer type, and the value is representable in both types;
  • both types are pointers to qualified or unqualified versions of a character type or
    void.

Thus, in general, there is nothing wrong with passing an int to a variadic function that expects an unsigned int (or vice versa) as long as the value passed fits in both types. However, the specification for printf reads (7.19.6.1 paragraph 9):

If a conversion specification is invalid, the behavior is undefined. If any argument is
not the correct type for the corresponding conversion specification, the behavior is
undefined.

No exception is made for signed/unsigned mismatch.

Does this mean that printf("%x", 1) invokes undefined behavior?

Comments (6)

故事与诗 2024-10-18 20:45:33

I believe it is technically undefined, because the "correct type" for %x is specified as unsigned int - and as you point out, there is no exception for signed/unsigned mismatch here.

The rules for printf are for a more specific case and thus override the rules for the general case (for another example of the specific overriding the general, it's allowable in general to pass NULL to a function expecting a const char * argument, but it's undefined behaviour to pass NULL to strlen()).

I say "technically", because I believe an implementation would need to be intentionally perverse to cause a problem for this case, given the other restrictions in the standard.

放我走吧 2024-10-18 20:45:33

No. %x formats an unsigned int, and the constant expression 1 has type int, but its value is representable as an unsigned int. The operation is not UB.

飞烟轻若梦 2024-10-18 20:45:33

It is undefined behavior, for the same reason that reinterpreting a pointer to an integer type as a pointer to the corresponding type of opposite signedness is. Unfortunately, this isn't allowed in either direction, because a valid representation in one type may be a trap representation in the other.

The only way I can see a signed-to-unsigned reinterpretation hitting a trap representation is the perverse case of sign representation in which the unsigned type simply masks out the sign bit. Unfortunately, 6.2.6.2 of the standard allows this. On such an architecture, all negative values of the signed type may be trap representations of the unsigned type.

In your example this is even stranger, since 1 cannot be a trap representation for the unsigned type. So to make it a "real" example, you'd have to ask your question with -1.

I don't think there is any architecture left with these features for which people still write C compilers, so life would definitely become easier if a newer version of the standard abolished this nasty case.

月亮坠入山谷 2024-10-18 20:45:33

TL;DR it is not UB.

As n. 'pronouns' m. pointed out in this answer, the C standard says that all non-negative values of a signed integer type have exactly the same representation as in the corresponding unsigned type, and can therefore be used interchangeably as long as the value is in the range of both types.

From the C99 standard 6.2.5 Types - Paragraph 9 and Footnote 31:

9 The range of nonnegative values of a signed integer type is a subrange
of the corresponding unsigned integer type, and the representation of
the same value in each type is the same. 31)

31) The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values from
functions, and members of unions.

The exact same text is in the C11 Standard in 6.2.5 Types - Paragraph 9 and Footnote 41.

掩耳倾听 2024-10-18 20:45:33

I believe it's undefined. Functions with a variable-length argument list don't apply an implicit conversion to the declared type when accepting arguments, so 1 won't be converted to unsigned int when being passed to printf(), causing undefined behavior.
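The lack of conversion can be seen in a user-written variadic function. This is a minimal sketch with a hypothetical function, `first_as_unsigned`, which reads an argument as unsigned int regardless of what the caller passed. Note that for `va_arg` itself, C99 7.15.1.1 paragraph 2 explicitly permits reading an int as the corresponding unsigned type when the value is representable in both; whether the stricter printf wording takes precedence is exactly what this thread disputes.

```c
#include <stdarg.h>

/* Hypothetical variadic function for illustration only.
 * `count` exists solely to anchor va_start; it is not otherwise used.
 * The first variable argument is read as unsigned int, even if the
 * caller actually passed an int: no conversion happens at the call. */
unsigned int first_as_unsigned(int count, ...)
{
    va_list ap;
    va_start(ap, count);
    unsigned int result = va_arg(ap, unsigned int);
    va_end(ap);
    return result;
}
```

A call like `first_as_unsigned(1, 1)` passes an int and reads it back as unsigned int, which the `va_arg` rule allows because 1 is representable in both types.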

°如果伤别离去 2024-10-18 20:45:33

The authors of the Standard do not generally try to explicitly mandate behavior in every imaginable corner case, especially when there is an obvious correct behavior which is shared by 100% of all implementations, and there is no reason to expect any implementation to do anything else. Despite the Standard's explicit requirement that signed and unsigned types have matching memory representations for values that fit in both, it would be theoretically possible for an implementation to pass them to variadic functions differently. The Standard doesn't forbid such behavior, but I see no evidence of the authors intentionally permitting it. Most likely, they simply didn't consider such a possibility since no implementation had ever (and so far as I know, has ever) worked that way.

It would probably be reasonable for a sanitizing implementation to squawk if code uses %x on a signed value, though a quality sanitizing implementation should also provide an option to silently accept such code. There's no reason for sane implementations to do anything other than either process the passed value as unsigned or squawk if it's used in a diagnostic/sanitizing mode. While the Standard might not forbid an implementation from regarding as unreachable any code that uses %x on a signed value, anyone who thinks implementations should avail themselves of such freedom should be recognized as a moron.

Programmers who are targeting exclusively sane non-diagnostic implementations shouldn't need to worry about adding casts when outputting things like "uint8_t" values, but those whose code might be fed to moronic implementations might want to add such casts to prevent compilers from the "optimizations" such implementations might impose.
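The defensive cast mentioned above might look like the following sketch. The helper name `format_byte_hex` is hypothetical; the point is only that a uint8_t argument would otherwise be promoted to int by the default argument promotions, while the cast hands snprintf exactly the unsigned int that %x is specified to expect.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper for illustration: writes byte as lowercase hex
 * into buf and returns snprintf's result (characters that would be
 * written, excluding the terminating null). */
int format_byte_hex(char *buf, size_t size, uint8_t byte)
{
    /* Without the cast, byte would reach snprintf as an int. */
    return snprintf(buf, size, "%x", (unsigned int)byte);
}
```

The cast costs nothing at run time on ordinary implementations, so adding it is mostly a matter of how strictly one wants to read the printf wording.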
