c rotation twos-complement unsigned-char

无符号字符旋转

发布于 2024-11-05 19:48:55 字数 302 浏览 4 评论 0原文

我对什么是 unsigned char 有点困惑。有符号字符是字符的位形式表示，对吗？一个示例问题让我们向右旋转 n 位位置，即使用此解决方案的无符号 char 的位：

unsigned char rotate(unsigned char x, int n) {
    unsigned char temp = x << 8 - n;
    x = x >> n;
    return (x | temp);
}

如果有人可以用 char 示例及其各自的位进行解释，我们将不胜感激。非常感谢。

原文

I'm a little bit confused as to what an unsigned char is. A signed char is the representation of the char in bit form right? A sample problem has us rotating to the right by n bit positions, the bits of an unsigned char with this solution:

unsigned char rotate(unsigned char x, int n) {
    unsigned char temp = x << 8 - n;
    x = x >> n;
    return (x | temp);
}

If anyone could explain with char examples and their respective bits, it would be greatly appreciated. Thanks so much.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

云淡月浅 2024-11-12 19:48:55

signed char、char 和 unsigned char 都是整数类型。为了简单起见，我假设 CHAR_BIT 是 8，并且有符号类型是 2 的补码。因此：

signed char 是从 -128 到 +127 的数字
unsigned char 是从 0 到 255 的数字
char 的范围与signed char，或与 unsigned char 相同的范围，具体取决于您的 C 实现。

就 C 而言，字符只是 char 类型范围内的数字（尽管各种字符函数如 tolower 需要将值转换为无符号）即使 char 有符号，也要在输入过程中键入）。

因此，signed char 和 unsigned char 都是字符的位形式表示。对于 0 到 +127 范围内的数字，它们都使用相同的表示形式（只有一种方法可以用二进制表示正数）。对于该范围之外的数字，负数 n 的有符号表示与 n + 256 的无符号表示（2 的补码的定义）具有相同的位。

此代码使用 unsigned char 的原因是带有负符号值的右移具有实现定义的结果。带有负符号值的左移具有未定义的行为。通常左移的行为与无符号值相同，这是可以的，但右移会在左侧插入值为 1 的位，即所谓的“算术移位”，这不是这里想要的。无符号值总是移入零，正是移入零使该代码可以构建旋转结果的两个部分和/或将它们构建在一起。

因此，假设输入值为 x = 254 (11111110)，并且 n = 1，我们得到：

x << 7 is 0111111100000000
x >> 1 is         01111111
|      is 0111111101111111
convert to unsigned char to return is 01111111

如果我们使用有符号类型而不是 unsigned char，我们会很可能得到：

x is -2                           11111110
x << 7 is 11111111111111111111111100000000 (assuming 32-bit int, since
           smaller types are always promoted to int for arithmetic ops)
x >> 1 is implementation-defined, possibly 
          11111111111111111111111111111111
| is      11111111111111111111111111111111
convert to signed char to return is -1

因此，对 unsigned char 进行位操作会得到正确的答案，旋转 1 位以将 0 从末尾移动到开头。使用有符号字符进行位操作可能会给出错误的结果，如果负有符号值进行逻辑右移，则可能会给出正确的结果，但在真正不寻常的实现上可以做任何事情。

对于像旋转这样的位操作任务，您几乎总是希望使用无符号类型。它消除了实现依赖性（除了类型的宽度），并避免您必须分别推理负值和非负值。

signed char, char and unsigned char are all integer types. For the sake of simplicity I'll assume that CHAR_BIT is 8 and that signed types are 2's complement. So:

signed char is a number from -128 to +127
unsigned char is a number from 0 to 255
char is either the same range as signed char, or the same range as unsigned char, depending on your C implementation.

As far as C is concerned, a character is just a number within the range of the char type (although various character functions like tolower require the value to be cast to an unsigned type on the way in, even if char is signed).

So, signed char and unsigned char are both representation of the character in bit form. For numbers in the range 0 to +127 they both use the same representation (there's only one way to represent positive numbers in binary). For numbers outside that range, the signed representation of a negative number n is the same bits as the unsigned representation of n + 256 (definition of 2's complement).

The reason this code uses unsigned char is that right-shift with a negative signed value has implementation-defined result. Left shift with a negative signed value has undefined behavior. Usually left-shift behaves the same as for unsigned values, which is OK, but right shift inserts bits at the left-hand-side with value 1, a so-called "arithmetic shift", which isn't what's wanted here. Unsigned values always shift in zeros, and it's the shifting in of zero that lets this code build the two parts of the rotated result and or them together.

So, assuming an input value of x = 254 (11111110), and n = 1, we get:

x << 7 is 0111111100000000
x >> 1 is         01111111
|      is 0111111101111111
convert to unsigned char to return is 01111111

If we used a signed type instead of unsigned char, we'd quite possibly get:

x is -2                           11111110
x << 7 is 11111111111111111111111100000000 (assuming 32-bit int, since
           smaller types are always promoted to int for arithmetic ops)
x >> 1 is implementation-defined, possibly 
          11111111111111111111111111111111
| is      11111111111111111111111111111111
convert to signed char to return is -1

So the bit-manipulation with the unsigned char results in the correct answer, rotated by 1 bit to move the 0 from the end to the start. Bit-manipulation with the signed char, probably gives the wrong result, might give the right result if negative signed values do a logical right shift, but on really unusual implementations could do anything.

Pretty much always for bit-manipulation tasks like rotate, you want to use unsigned types. It removes the implementation-dependence (other than on the width of the type), and avoids you having to reason about negative and non-negative values separately.

回复收藏 0 原文

安静 2024-11-12 19:48:55

将变量声明为 unsigned char 告诉编译器将基础位模式视为 0 (00000000) 到 255 (11111111) 之间的数字。将其声明为 char 告诉编译器应用二进制补码底层位模式并将其视为从 -128 (10000000) 到 127 (01111111) 的数字。

考虑一个 3 位数字。如果它是无符号的，你有：

如果它有符号，你有：

算术方面的巧妙之处（正如该链接提到的）是你不必以与无符号的不同的方式对待有符号的二进制数。您只需进行实际的二进制数学运算，而不考虑有符号或无符号。但您必须将有符号/无符号解释应用于输入和输出。

在有符号领域中，您可能有：

2 + (-3) = 010 + 101 = 111 = -1

但在无符号领域中，这是：

2 + 5 = 010 + 101 = 111 = 7

因此，这完全是解释问题，因为添加的实际位模式和总和的位模式在两种情况下都是相同的。

Declaring a variable as unsigned char tells the compiler to treat the underlying bit pattern as a number from 0 (00000000) to 255 (11111111). Declaring it a char tells the compiler to apply two's complement to the underlying bit pattern and treat it as a number from -128 (10000000) to 127 (01111111).

Consider a 3-bit number. If it is unsigned, you have:

If it is signed you have:

What is neat with respect to arithmetic (as that link mentions) is that you don't have to treat signed binary numbers differently than unsigned ones. You just do the actual binary math without regard to signed or unsigned. But you do have to apply the signed/unsigned interpretation to the inputs and to the output.

In the signed realm you might have:

2 + (-3) = 010 + 101 = 111 = -1

But in the unsigned realm this is:

2 + 5 = 010 + 101 = 111 = 7

So it's all a matter of interpretation since the actual bit patters being added and the bit pattern of the sum are the same in both cases.

回复收藏 0 原文

季末如歌 2024-11-12 19:48:55

unsigned char 只是一种 8 位整数类型，可以采用 0 到 255 之间的值，而有符号 char 可以采用 -127 到 128 之间的值。在实际的机器代码中，没有真正的区别，除了一个：当你执行使用 >> 对有符号类型进行右移（无论是 char、short 还是 int）它都是作为算术移位执行的，这意味着对于负值（其中 MSB 为 1），将移入 1，而不是 0，并且上述代码将无法按预期工作。

编辑：上面将无符号字符旋转 3 位（有符号和无符号）的代码示例：

00110101 旋转无符号和有符号为 10100110。

但是对于前面有 1 的数字，您会得到算术移位，因此

11010001 旋转后无符号为 00111010。
11010001 旋转后有符号为 11111010。

回复收藏 0 原文

~没有更多了~