What is the purpose of the h and hh modifiers for printf?

Posted 2024-10-10 07:29:49 · 1550 words · 5 views · 0 comments

Aside from %hn and %hhn (where the h or hh specifies the size of the pointed-to object), what is the point of the h and hh modifiers for printf format specifiers?

Due to default promotions which are required by the standard to be applied for variadic functions, it is impossible to pass arguments of type char or short (or any signed/unsigned variants thereof) to printf.

According to 7.19.6.1(7), the h modifier:

Specifies that a following d, i, o, u, x, or X conversion specifier applies to a
short int or unsigned short int argument (the argument will
have been promoted according to the integer promotions, but its value shall
be converted to short int or unsigned short int before printing);
or that a following n conversion specifier applies to a pointer to a short
int argument.

If the argument was actually of type short or unsigned short, then promotion to int followed by a conversion back to short or unsigned short will yield the same value as promotion to int without any conversion back. Thus, for arguments of type short or unsigned short, %d, %u, etc. should give identical results to %hd, %hu, etc. (and likewise for char types and hh).

As far as I can tell, the only situation where the h or hh modifier could possibly be useful is when the argument passed is an int outside the range of short or unsigned short, e.g.

printf("%hu", 0x10000);

but my understanding is that passing the wrong type like this results in undefined behavior anyway, so that you could not expect it to print 0.

One real world case I've seen is code like this:

char c = 0xf0;
printf("%hhx", c);

where the author expects it to print f0 despite the implementation having a plain char type that's signed (in which case, printf("%x", c) would print fffffff0 or similar). But is this expectation warranted?

(Note: What's going on is that the original type was char, which gets promoted to int and converted back to unsigned char instead of char, thus changing the value that gets printed. But does the standard specify this behavior, or is it an implementation detail that broken software might be relying on?)
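
To make the case in question concrete, here is a minimal sketch. The helper name format_hex is hypothetical, and signed char is used instead of plain char so the example does not depend on whether plain char is signed. The plain %x branch casts to unsigned explicitly, so that the call itself is well defined; the %hhx branch relies on the quoted wording that the promoted value "shall be converted" to unsigned char before printing, which is what the common C99 libraries I am aware of do.

```c
#include <stdio.h>

/* Hypothetical helper: formats c as hex into buf, either with %hhx or
   with plain %x (after an explicit cast to unsigned).
   Returns whatever snprintf returns. */
int format_hex(char *buf, size_t n, signed char c, int use_hh)
{
    if (use_hh) {
        /* c arrives in printf promoted to int; hh asks printf to
           convert it to unsigned char before printing. */
        return snprintf(buf, n, "%hhx", c);
    }
    /* Without hh, the sign-extended value is printed at full width. */
    return snprintf(buf, n, "%x", (unsigned)c);
}
```

Called with c = -16 (the bit pattern 0xf0 in an 8-bit two's complement signed char), the %hhx branch should produce f0, while the %x branch produces the sign-extended form, fffffff0 where unsigned int is 32 bits wide.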

Comments (7)

寂寞陪衬 2024-10-17 07:29:49

One possible reason: for symmetry with the use of those modifiers in the formatted input functions? I know it wouldn't be strictly necessary, but maybe there was value seen for that?

Although they don't mention the importance of symmetry for the "h" and "hh" modifiers in the C99 Rationale document, the committee does mention it as a consideration for why the "%p" conversion specifier is supported for fscanf() (even though that wasn't new for C99 - "%p" support is in C90):

Input pointer conversion with %p was added to C89, although it is obviously risky, for symmetry with fprintf.

In the section on fprintf(), the C99 rationale document does discuss that "hh" was added, but merely refers the reader to the fscanf() section:

The %hh and %ll length modifiers were added in C99 (see §7.19.6.2).

I know it's a tenuous thread, but I'm speculating anyway, so I figured I'd give whatever argument there might be.

Also, for completeness, the "h" modifier was in the original C89 standard. Presumably it would be there because of widespread existing use even if it wasn't strictly necessary, that is, even if there was no technical requirement for the modifier.

若言繁花未落 2024-10-17 07:29:49

The only use I can think of is for passing an unsigned short or unsigned char and using the %x conversion specifier. You cannot simply use a bare %x - the value may be promoted to int rather than unsigned int, and then you have undefined behaviour.

Your alternatives are either to explicitly cast the argument to unsigned; or to use %hx / %hhx with a bare argument.
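
The two alternatives can be sketched side by side. This is only an illustration; the function names hex_with_cast and hex_with_h are made up for the example.

```c
#include <stdio.h>

/* Alternative 1: cast explicitly so that %x really receives an unsigned int. */
int hex_with_cast(char *buf, size_t n, unsigned short v)
{
    return snprintf(buf, n, "%x", (unsigned)v);
}

/* Alternative 2: pass the bare argument and let %hx describe its type. */
int hex_with_h(char *buf, size_t n, unsigned short v)
{
    return snprintf(buf, n, "%hx", v);
}
```

For a value such as 0xBEEF both functions should write beef; the difference is only in which side (caller or format string) documents the argument's real type.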

满栀 2024-10-17 07:29:49

In %...x mode, all values are interpreted as unsigned. Negative numbers are therefore printed as their unsigned conversions. In 2's complement arithmetic, which most processors use, there is no difference in bit patterns between a signed negative number and its positive unsigned equivalent, which is defined by modulus arithmetic (adding the maximum value for the field plus one to the negative number, according to the C99 standard). Lots of software, especially the debugging code most likely to use %x, makes the silent assumption that the bit representation of a signed negative value and its unsigned cast is the same, which is only true on a 2's complement machine.

The mechanics of this cast are such that hexadecimal representations of a value always imply, possibly inaccurately, that a number has been rendered in 2's complement, as long as it didn't hit an edge condition where the different integer representations have different ranges. This even holds true for arithmetic representations where the value 0 is not represented with the binary pattern of all 0s.

A negative short displayed as an unsigned long in hexadecimal will therefore, on any machine, be padded with f, due to implicit sign extension in the promotion, which printf will print. The value is the same, but it is truly visually misleading as to the size of the field, implying a significant amount of range that simply isn't present.

%hx truncates the displayed representation to avoid this padding, exactly as you concluded from your real-world use case.

The behavior of printf is undefined when passed an int outside the range of short that should be printed as a short, but the easiest implementation by far simply discards the high bits by a raw downcast, so while the spec doesn't require any specific behavior, pretty much any sane implementation is going to just perform the truncation. There are generally better ways to do that, though.

If printf isn't padding values or displaying unsigned representations of signed values, %h isn't very useful.

无声无音无过去 2024-10-17 07:29:49

The variadic arguments to printf() et al are automatically promoted using the default conversions, so any short or char values are promoted to int when passed to the function.

In the absence of the h or hh modifiers, you would have to mask the values passed to get the correct behaviour reliably. With the modifiers, you no longer have to mask the values; the printf() implementation does the job properly.

Specifically, for the format %hx, the code inside printf() can do something like:

va_list args;
va_start(args, format);

/* ... */

int i = va_arg(args, int);            /* a short always arrives promoted to int */
unsigned short s = (unsigned short)i; /* %hx: convert back before printing */
/* ...print s correctly, as 4 hex digits maximum,
   even on a machine with 64-bit int! */

I'm blithely assuming that short is a 16-bit quantity; the standard does not actually guarantee that, of course.
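
A compilable version of that idea might look like the sketch below. It is not how any particular C library implements printf; format_hx is a hypothetical function that handles only the %hx case for a single variadic argument, and the expected results assume a 16-bit short.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical mini-formatter for the "%hx" case only: reads one
   variadic argument the way a printf implementation might. */
int format_hx(char *buf, int n, ...)
{
    va_list args;
    va_start(args, n);
    int i = va_arg(args, int);            /* a short is never passed as short */
    unsigned short s = (unsigned short)i; /* reduce modulo USHRT_MAX + 1 */
    va_end(args);
    /* s now fits in unsigned short, so at most 4 hex digits if short is 16 bits */
    return snprintf(buf, (size_t)n, "%x", (unsigned)s);
}
```

With a 16-bit short, format_hx(buf, 16, -1) should write ffff and format_hx(buf, 16, 0x12345) should write 2345, regardless of how wide int is.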

ㄟ。诗瑗 2024-10-17 07:29:49

I found it useful to avoid casting when formatting unsigned chars to hex:

        sprintf_s(tmpBuf, 3, "%2.2hhx", *(CEKey + i));

It's a minor coding convenience, and looks cleaner than multiple casts (IMO).
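
For readers without sprintf_s, the same pattern can be sketched with standard snprintf. The helper hex_encode and its parameters are hypothetical, and %02hhx is used here as an equivalent of the %2.2hhx in the line above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes len bytes of src as lowercase hex into dst.
   dst must have room for at least 2 * len + 1 chars. */
void hex_encode(char *dst, const unsigned char *src, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* Two digits plus the terminator, hence a size of 3 per step;
           no cast is needed on src[i] thanks to the hh modifier. */
        snprintf(dst + 2 * i, 3, "%02hhx", src[i]);
    }
    dst[2 * len] = '\0';
}
```

Encoding the four bytes 0x00, 0x0f, 0xa5, 0xff this way should give the string 000fa5ff.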

醉梦枕江山 2024-10-17 07:29:49

Another place it's handy is the snprintf size check.
GCC 7 added a size check when using snprintf,
so this will fail:

char arr[4];
char x = 'r';
snprintf(arr, sizeof(arr), "%d", x);

So it forces you to use a bigger char array when formatting a char with %d.

Here is a commit that shows those fixes: instead of increasing the char array size, they changed %d to %h. This also gives a more accurate description.

https://github.com/Mellanox/libvma/commit/b5cb1e34a04b40427d195b14763e462a0a705d23#diff-6258d0a11a435aa372068037fe161d24
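
As a rough illustration of why the narrower conversion helps (the check in question is presumably GCC's -Wformat-truncation diagnostic, which reasons about the longest output a conversion can produce): with %d the argument is treated as a full int, which can need up to 11 characters plus the terminator, whereas an unsigned char printed with %hhu can never need more than 3 digits. The helper format_u8 below is hypothetical and not taken from the linked commit.

```c
#include <stdio.h>

/* Hypothetical helper: an unsigned char printed with %hhu needs at most
   3 digits plus the terminator, so a 4-byte buffer always suffices. */
int format_u8(char out[4], unsigned char v)
{
    return snprintf(out, 4, "%hhu", v);
}
```

For v = 255 this should write 255 and return 3, which is still below the buffer size of 4, so no truncation can occur.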

秉烛思 2024-10-17 07:29:49

I agree with you that it is not strictly necessary, and so by that reason alone is no good in a C library function :)

It might be "nice" for the symmetry of the different flags, but it is mostly counter-productive because it hides the "conversion to int" rule.
