The C standard on negative zero (1's complement and signed magnitude)

Posted on 2024-12-12 08:27:56

All of these functions give the expected result on my machine. Do they all work on other platforms?

More specifically, if x has the bit representation 0xffffffff on 1's complement machines or 0x80000000 on signed magnitude machines, what does the standard say about the representation of (unsigned)x?

Also, I think the (unsigned) casts in v2, v2a, v3 and v4 are redundant. Is this correct?

Assume sizeof(int) = 4 and CHAR_BIT = 8

int logicalrightshift_v1 (int x, int n) {

    return (unsigned)x >> n;
}

int logicalrightshift_v2 (int x, int n) {

    int msb = 0x4000000 << 1;
    return ((x & 0x7fffffff) >> n) | (x & msb ? (unsigned)0x80000000 >> n : 0);
}

int logicalrightshift_v2a (int x, int n) {

    return ((x & 0x7fffffff) >> n) | (x & (unsigned)0x80000000 ? (unsigned)0x80000000 >> n : 0);
}

int logicalrightshift_v3 (int x, int n) {

    return ((x & 0x7fffffff) >> n) | (x < 0 ? (unsigned)0x80000000 >> n : 0);
}

int logicalrightshift_v4 (int x, int n) {

    return ((x & 0x7fffffff) >> n) | (((unsigned)x & 0x80000000) >> n);
}

int logicalrightshift_v5 (int x, int n) {

    unsigned y;
    *(int *)&y = x;
    y >>= n;
    *(unsigned *)&x = y;
    return x;
}

int logicalrightshift_v6 (int x, int n) {

    unsigned y;
    memcpy (&y, &x, sizeof (x));
    y >>= n;
    memcpy (&x, &y, sizeof (x));
    return x;
}

傾旎 2024-12-19 08:27:56

If x has the bit representation 0xffffffff on 1's complement machines or 0x80000000 on signed magnitude machines, what does the standard say about the representation of (unsigned)x?

The conversion to unsigned is specified in terms of values, not representations. If you convert -1 to unsigned, you always get UINT_MAX (so if your unsigned is 32 bits, you always get 4294967295). This happens regardless of the representation of signed numbers that your implementation uses.

Likewise, if you convert -0 to unsigned then you always get 0. -0 is numerically equal to 0.
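
As a minimal sketch of the value-based rule, the checks below hold on any conforming implementation, whatever representation it uses for negative numbers. The names to_unsigned and check_value_conversion are made up for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* Conversion to unsigned works on the value: the result is x reduced
   modulo UINT_MAX + 1, whatever bit pattern stores a negative int. */
unsigned to_unsigned(int x)
{
    return (unsigned)x;
}

/* Hypothetical self-check; none of these depend on the representation. */
void check_value_conversion(void)
{
    assert(to_unsigned(-1) == UINT_MAX);      /* -1 + (UINT_MAX + 1) */
    assert(to_unsigned(-2) == UINT_MAX - 1u); /* -2 + (UINT_MAX + 1) */
    assert(to_unsigned(0) == 0u);             /* zero stays zero */
}
```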

Note that a ones complement or sign-magnitude implementation is not required to support negative zeroes; if it does not, then accessing such a representation causes the program to have undefined behaviour.

Going through your functions one-by-one:

int logicalrightshift_v1(int x, int n)
{
    return (unsigned)x >> n;
}

The result of this function for negative values of x will depend on UINT_MAX, and will further be implementation-defined if (unsigned)x >> n is not within the range of int. For example, logicalrightshift_v1(-1, 1) will return the value UINT_MAX / 2 regardless of what representation the machine uses for signed numbers.

int logicalrightshift_v2(int x, int n)
{
    int msb = 0x4000000 << 1;
    return ((x & 0x7fffffff) >> n) | (x & msb ? (unsigned)0x80000000 >> n : 0);
}

Almost everything about this could be implementation-defined. Note first that, as written, 0x4000000 << 1 evaluates to 0x8000000 (bit 27), which is not the sign bit of a 32-bit int; the constant was presumably meant to be 0x40000000. Assuming that you are attempting to create a value in msb with 1 in the sign bit and zeroes in the value bits, you cannot do this portably by use of shifts. You can use ~INT_MAX, but this is allowed to have undefined behaviour on a sign-magnitude machine that does not allow negative zeroes, and is allowed to give an implementation-defined result on two's complement machines.

The types of 0x7fffffff and 0x80000000 will depend on the ranges of the various types, which will affect how other values in this expression are promoted.

int logicalrightshift_v2a(int x, int n)
{
    return ((x & 0x7fffffff) >> n) | (x & (unsigned)0x80000000 ? (unsigned)0x80000000 >> n : 0);
}

If you create an unsigned value that is not in the range of int (for example, given a 32-bit int, values greater than 0x7fffffff), then the implicit conversion in the return statement produces an implementation-defined value. The same applies to v3 and v4.
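
One way to stay clear of the implementation-defined conversion is to test the range before converting. This is only a sketch; the function name unsigned_to_int_checked and its bool/out-parameter interface are assumptions for illustration, not something the question uses.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Stores u in *out and returns true only when u fits in int, so the
   implementation-defined out-of-range conversion never takes place. */
bool unsigned_to_int_checked(unsigned u, int *out)
{
    assert(out != NULL);
    if (u > (unsigned)INT_MAX)
        return false;      /* would not fit: leave *out untouched */
    *out = (int)u;         /* in range, so the value is preserved */
    return true;
}
```

A caller that gets false back has to decide what an out-of-range result means for it, which is the decision the plain return statements in v1 to v4 leave to the implementation.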

int logicalrightshift_v5(int x, int n)
{
    unsigned y;
    *(int *)&y = x;
    y >>= n;
    *(unsigned *)&x = y;
    return x;
}

This is still implementation-defined, because it is unspecified whether the sign bit in the representation of int corresponds to a value bit or a padding bit in the representation of unsigned. If it corresponds to a padding bit, it could be a trap representation, in which case the behaviour is undefined.

int logicalrightshift_v6(int x, int n)
{
    unsigned y;
    memcpy (&y, &x, sizeof (x));
    y >>= n;
    memcpy (&x, &y, sizeof (x));
    return x;
}

The comments on v5 apply here as well.
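
The padding-bit concern can at least be checked at run time. The sketch below counts value bits from UINT_MAX and INT_MAX and only reinterprets the bytes when neither type has padding bits; all three function names are hypothetical. Even when the check passes, the unsigned value obtained for a negative x still depends on how the implementation represents negative numbers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Counts the value bits of a type from its maximum value
   (pass UINT_MAX, or INT_MAX converted to unsigned). */
unsigned count_value_bits(unsigned max)
{
    unsigned bits = 0;
    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* True when neither type has padding bits, so the sign bit of int
   lines up with the top value bit of unsigned. */
bool int_bits_readable_as_unsigned(void)
{
    unsigned object_bits = (unsigned)(sizeof(unsigned) * CHAR_BIT);
    return count_value_bits(UINT_MAX) == object_bits
        && count_value_bits((unsigned)INT_MAX) + 1u == object_bits;
}

/* Hypothetical helper: copies the bytes of x into an unsigned. */
unsigned int_bits(int x)
{
    unsigned y;
    assert(int_bits_readable_as_unsigned());
    memcpy(&y, &x, sizeof y);
    return y;
}
```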

Also, I think the (unsigned) cast in v2, v2a, v3, v4 is redundant. Is this correct?

It depends. As a hex constant, 0x80000000 will have type int if that value is within the range of int; otherwise unsigned if that value is within the range of unsigned; otherwise long if that value is within the range of long; otherwise unsigned long (because that value is within the minimum allowed range of unsigned long).

If you wish to ensure that it has unsigned type, suffix the constant with U, giving 0x80000000U.
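
The type rules for hex constants can be observed with C11's _Generic. This is a sketch that assumes a C11 compiler; HAS_UNSIGNED_TYPE and check_constant_types are names invented for the illustration, and the last check is only compiled where int is 32 bits wide.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper macro (needs C11 _Generic): evaluates to 1 when
   the expression has an unsigned type of rank int or higher. */
#define HAS_UNSIGNED_TYPE(x)        \
    _Generic((x),                   \
             unsigned int: 1,       \
             unsigned long: 1,      \
             unsigned long long: 1, \
             default: 0)

void check_constant_types(void)
{
    /* The U suffix forces an unsigned type on every implementation. */
    assert(HAS_UNSIGNED_TYPE(0x80000000U));
    /* 0x7fff fits in int everywhere, since INT_MAX is at least 32767. */
    assert(!HAS_UNSIGNED_TYPE(0x7fff));
#if INT_MAX == 0x7fffffff
    /* With a 32-bit int the unsuffixed constant does not fit in int,
       so it takes the next type in the list: unsigned int. */
    assert(HAS_UNSIGNED_TYPE(0x80000000));
#endif
}
```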

Summary:

Converting a number greater than INT_MAX to int gives an implementation-defined result (or indeed, allows an implementation-defined signal to be raised).
Converting an out-of-range number to unsigned is done by repeated addition or subtraction of UINT_MAX + 1, which means it depends on the mathematical value, not the representation.
Inspecting a negative int representation as unsigned is not portable (positive int representations are OK, though).
Generating a negative zero through use of bitwise operators and trying to use the resulting value is not portable.

If you want "logical shifts", then you should be using unsigned types everywhere. The signed types are designed for dealing with algorithms where the value is what matters, not the representation.
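
A minimal sketch of that advice: take and return unsigned, so no signed conversion is involved at all. The name logical_shift_right is hypothetical, and the assertion assumes unsigned has no padding bits (otherwise the real width is smaller than sizeof(unsigned) * CHAR_BIT).

```c
#include <assert.h>
#include <limits.h>

/* Logical right shift on an unsigned operand: vacated high bits are
   filled with zeros and the result is fully defined by the standard. */
unsigned logical_shift_right(unsigned x, unsigned n)
{
    /* Shifting by the width of the type or more is undefined, so the
       caller must keep n below the number of bits in unsigned. */
    assert(n < sizeof(unsigned) * CHAR_BIT);
    return x >> n;
}
```

If the caller starts from an int, the conversion to unsigned at the call site is value-based and well defined; only a conversion of the result back to int can run into implementation-defined behaviour.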
