C standard regarding negative zero (1's complement and signed magnitude)

Posted on 2024-12-12 08:27:56

All of these functions give the expected result on my machine. Do they all work on other platforms?

More specifically, if x has the bit representation 0xffffffff on 1's complement machines or 0x80000000 on signed magnitude machines, what does the standard say about the representation of (unsigned)x?

Also, I think the (unsigned) cast in v2, v2a, v3, v4 is redundant. Is this correct?

Assume sizeof(int) = 4 and CHAR_BIT = 8

int logicalrightshift_v1 (int x, int n) {

    return (unsigned)x >> n;
}

int logicalrightshift_v2 (int x, int n) {

    int msb = 0x4000000 << 1;
    return ((x & 0x7fffffff) >> n) | (x & msb ? (unsigned)0x80000000 >> n : 0);
}

int logicalrightshift_v2a (int x, int n) {

    return ((x & 0x7fffffff) >> n) | (x & (unsigned)0x80000000 ? (unsigned)0x80000000 >> n : 0);
}

int logicalrightshift_v3 (int x, int n) {

    return ((x & 0x7fffffff) >> n) | (x < 0 ? (unsigned)0x80000000 >> n : 0);
}

int logicalrightshift_v4 (int x, int n) {

    return ((x & 0x7fffffff) >> n) | (((unsigned)x & 0x80000000) >> n);
}

int logicalrightshift_v5 (int x, int n) {

    unsigned y;
    *(int *)&y = x;
    y >>= n;
    *(unsigned *)&x = y;
    return x;
}

int logicalrightshift_v6 (int x, int n) {

    unsigned y;
    memcpy (&y, &x, sizeof (x));
    y >>= n;
    memcpy (&x, &y, sizeof (x));
    return x;
}


Comments (2)

傾旎 2024-12-19 08:27:56

If x has the bit representation 0xffffffff on 1's
complement machines or 0x80000000 on signed magnitude machines what
does the standard say about the representation of (unsigned)x?

The conversion to unsigned is specified in terms of values, not representations. If you convert -1 to unsigned, you always get UINT_MAX (so if your unsigned is 32 bits, you always get 4294967295). This happens regardless of the representation of signed numbers that your implementation uses.

Likewise, if you convert -0 to unsigned then you always get 0. -0 is numerically equal to 0.
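As a minimal sketch of the value-based rule described above, the hypothetical helper to_unsigned (not part of the question) just performs the conversion, and the comments spell out what the standard promises for it:

```c
#include <limits.h>

/* Hypothetical helper for illustration: converts an int to unsigned.
   The result is the mathematical value of x reduced modulo UINT_MAX + 1,
   regardless of how the implementation represents negative ints. */
unsigned to_unsigned(int x)
{
    return (unsigned)x;
}
```

For instance, to_unsigned(-1) yields UINT_MAX and to_unsigned(-2) yields UINT_MAX - 1 on any conforming implementation.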

Note that a ones complement or sign-magnitude implementation is not required to support negative zeroes; if it does not, then accessing such a representation causes the program to have undefined behaviour.

Going through your functions one-by-one:

int logicalrightshift_v1(int x, int n)
{
    return (unsigned)x >> n;
}

The result of this function for negative values of x will depend on UINT_MAX, and will further be implementation-defined if (unsigned)x >> n is not within the range of int. For example, logicalrightshift_v1(-1, 1) will return the value UINT_MAX / 2 regardless of what representation the machine uses for signed numbers.
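One way to avoid the implementation-defined conversion back to int is to check the range first. The following is only a sketch: the name shift_fits_int and its out-parameter interface are assumptions made for illustration, and the caller is assumed to pass a shift count smaller than the width of unsigned.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: performs the shift on the unsigned value of x and
   stores it in *out only when it is representable as an int.
   Assumption: n is less than the width of unsigned (otherwise the shift
   itself is undefined). */
bool shift_fits_int(int x, unsigned n, int *out)
{
    unsigned u = (unsigned)x >> n;      /* well-defined: conversion is by value */
    if (u > (unsigned)INT_MAX) {
        return false;                   /* converting u to int would be implementation-defined */
    }
    *out = (int)u;                      /* in range, so the value is preserved */
    return true;
}
```

With this helper, shift_fits_int(-1, 0, &r) reports failure because UINT_MAX does not fit in an int, while shift_fits_int(8, 2, &r) succeeds and stores 2.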

int logicalrightshift_v2(int x, int n)
{
    int msb = 0x4000000 << 1;
    return ((x & 0x7fffffff) >> n) | (x & msb ? (unsigned)0x80000000 >> n : 0);
}

Almost everything about this could be implementation-defined. Assuming that you are attempting to create a value in msb with 1 in the sign bit and zeroes in the value bits, you cannot do this portably by use of shifts. You can use ~INT_MAX, but this is allowed to have undefined behaviour on a sign-magnitude machine that does not allow negative zeroes, and is allowed to give an implementation-defined result on two's complement machines.

The types of 0x7fffffff and 0x80000000 will depend on the ranges of the various types, which will affect how other values in this expression are promoted.

int logicalrightshift_v2a(int x, int n)
{
    return ((x & 0x7fffffff) >> n) | (x & (unsigned)0x80000000 ? (unsigned)0x80000000 >> n : 0);
}

If you create an unsigned value that is not in the range of int (for example, given a 32-bit int, values > 0x7fffffff), then the implicit conversion in the return statement produces an implementation-defined value. The same applies to v3 and v4.

int logicalrightshift_v5(int x, int n)
{
    unsigned y;
    *(int *)&y = x;
    y >>= n;
    *(unsigned *)&x = y;
    return x;
}

This is still implementation-defined, because it is unspecified whether the sign bit in the representation of int corresponds to a value bit or a padding bit in the representation of unsigned. If it corresponds to a padding bit, it could be a trap representation, in which case the behaviour is undefined.

int logicalrightshift_v6(int x, int n)
{
    unsigned y;
    memcpy (&y, &x, sizeof (x));
    y >>= n;
    memcpy (&x, &y, sizeof (x));
    return x;
}

The same comments applying to v5 apply to this.

Also, I think the (unsigned) cast in v2, v2a, v3, v4 is redundant. Is
this correct?

It depends. As a hex constant, 0x80000000 will have type int if that value is within the range of int; otherwise unsigned if that value is within the range of unsigned; otherwise long if that value is within the range of long; otherwise unsigned long (because that value is within the minimum allowed range of unsigned long).

If you wish to ensure that it has unsigned type, then suffix the constant with a U, giving 0x80000000U.
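The typing rule for hex constants can be probed with C11 _Generic. This is a sketch under the question's assumption of a 32-bit int; the macro and function names are made up for illustration, and on an implementation with a narrower int the suffixed constant would be unsigned long instead.

```c
/* Hypothetical probes: each yields 1 when the expression has exactly the
   named type and 0 otherwise (requires C11 for _Generic). */
#define IS_INT(e)          _Generic((e), int: 1, default: 0)
#define IS_UNSIGNED_INT(e) _Generic((e), unsigned: 1, default: 0)

/* 0x7fffffff fits in a 32-bit int, so it has type int. */
int small_constant_is_int(void)         { return IS_INT(0x7fffffff); }

/* 0x80000000 does not fit in a 32-bit int, so it is not an int. */
int large_constant_is_int(void)         { return IS_INT(0x80000000); }

/* With the U suffix the constant can only have an unsigned type. */
int suffixed_constant_is_unsigned(void) { return IS_UNSIGNED_INT(0x80000000U); }
```

Under the 32-bit int assumption, the first and third functions return 1 and the second returns 0.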


Summary:

  1. Converting a number greater than INT_MAX to int gives an implementation-defined result (or indeed, allows an implementation-defined signal to be raised).

  2. Converting an out-of-range number to unsigned is done by repeated addition or subtraction of UINT_MAX + 1, which means it depends on the mathematical value, not the representation.

  3. Inspecting a negative int representation as unsigned is not portable (positive int representations are OK, though).

  4. Generating a negative zero through use of bitwise operators and trying to use the resulting value is not portable.

If you want "logical shifts", then you should be using unsigned types everywhere. The signed types are designed for dealing with algorithms where the value is what matters, not the representation.
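A sketch of the "unsigned everywhere" approach could look like the following. The function names are hypothetical, and returning 0 for shift counts at or beyond the width is a design choice of this sketch rather than anything the standard prescribes (a plain >> with such a count is undefined).

```c
#include <limits.h>

/* Number of value bits in unsigned, counted from UINT_MAX so that any
   padding bits are not included. */
static unsigned unsigned_width(void)
{
    unsigned width = 0;
    for (unsigned m = UINT_MAX; m != 0; m >>= 1) {
        ++width;
    }
    return width;
}

/* Logical right shift that stays in unsigned arithmetic throughout.
   Shift counts >= the width yield 0 instead of undefined behaviour. */
unsigned logical_right_shift(unsigned x, unsigned n)
{
    return n < unsigned_width() ? x >> n : 0u;
}
```

Because both the argument and the result are unsigned, no value ever passes through an int, so none of the implementation-defined conversions listed in the summary are involved.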

扎心 2024-12-19 08:27:56

If you follow the standard to the letter, none of these are guaranteed to behave the same on all platforms.

In v5, you violate strict-aliasing, which is undefined behavior.

In v2 - v4, you have a signed right shift, which is implementation-defined. (See the comments for more details.)

In v1, the signed-to-unsigned cast itself is well-defined, but converting the unsigned result back to int in the return statement is implementation-defined when the value is out of range.

EDIT:

v6 might actually work given the following assumptions:

  • 'int' is either 2's or 1's complement.
  • unsigned and int are exactly the same size (in both bytes and bits, and are densely packed).
  • The endianness of unsigned matches that of int.
  • The padding and bit layout are the same. (See caf's comment for more details.)
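Some of the assumptions listed above can be checked at run time. The function below is a hypothetical sketch, not something from the thread: it covers the sign representation, the size match, and the absence of padding bits, but it does not verify that the byte order of int and unsigned agrees.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical check of the v6 assumptions that can be tested in code.
   Endianness agreement between int and unsigned is NOT checked here. */
bool v6_assumptions_hold(void)
{
    /* Count the value bits of unsigned. */
    unsigned width = 0;
    for (unsigned m = UINT_MAX; m != 0; m >>= 1) {
        ++width;
    }

    /* -1 & 3 is 3 on two's complement, 2 on ones' complement,
       and 1 on sign-magnitude. */
    bool twos_or_ones = (-1 & 3) != 1;

    bool same_size  = sizeof(int) == sizeof(unsigned);
    bool no_padding = width == sizeof(unsigned) * CHAR_BIT;

    /* int has exactly one bit fewer for values than unsigned,
       i.e. the sign bit lines up with unsigned's top value bit. */
    bool sign_is_top_bit = UINT_MAX / 2 == (unsigned)INT_MAX;

    return twos_or_ones && same_size && no_padding && sign_is_top_bit;
}
```

On mainstream two's complement platforms this is expected to return true; a false result would indicate that v6 cannot be relied on there.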