Type-safe conversion from unsigned to signed?

Posted 2024-08-08 04:55:30

Is it safe to convert, say, from an unsigned char * to a signed char * (or just a char *)?

Comments (8)

2024-08-15 04:55:30

The access is well-defined: you are allowed to access an object through a pointer to the signed or unsigned type corresponding to the dynamic type of the object (3.10/15).

Additionally, signed char is guaranteed not to have any trap values and as such you can safely read through the signed char pointer no matter what the value of the original unsigned char object was.

You can, of course, expect the values you read through one pointer to differ from those read through the other whenever the stored value does not fit in signed char.

Edit: regarding sellibitze's comment, this is what 3.9.1/1 says.

A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (3.9); that is, they have the same object representation. For character types, all bits of the object representation participate in the value representation. For unsigned character types, all possible bit patterns of the value representation represent numbers.

So indeed it seems that signed char may have trap values. Nice catch!
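A minimal sketch of the access discussed above, written in C. It assumes an 8-bit char and a two's-complement platform, which the answer does not state but which holds for mainstream compilers. The helper names read_as_signed and read_as_unsigned are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read the byte at p through a signed char lvalue.
   The stored bits are not modified; only their interpretation changes. */
int read_as_signed(const unsigned char *p)
{
    assert(p != NULL);
    const signed char *sp = (const signed char *)p;
    return *sp;
}

/* The same byte, read through its original unsigned type. */
int read_as_unsigned(const unsigned char *p)
{
    assert(p != NULL);
    return *p;
}
```

Under those assumptions, a byte holding 192 reads as 192 through the unsigned pointer and as -64 through the signed one, while a byte holding 32 reads as 32 through both.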

画离情绘悲伤 2024-08-15 04:55:30

The conversion should be safe, as all you're doing is converting from one character type to another, and both have the same size. Just be aware of what sort of data your code expects when you dereference the pointer, as the numeric ranges of the two types differ. For example, if the value the pointer refers to was a large positive number as unsigned, it may read as a negative number once the pointer is converted to signed char * and dereferenced.

仙女山的月亮 2024-08-15 04:55:30

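Casting changes the type, but on the usual two's-complement platforms it does not affect the bit representation. Casting from unsigned char to signed char leaves the stored bits as they are, but it changes how those bits are interpreted. (Strictly speaking, converting an out-of-range value to a signed type is implementation-defined in C.)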

Here is an example:

#include <stdio.h>

int main(void) {

  /* example 1: 192 does not fit in signed char */
  unsigned char a_unsigned_char = 192;
  signed char a_signed_char = (signed char)a_unsigned_char;
  printf("%d, %d\n", a_unsigned_char, a_signed_char); /* 192, -64 on two's-complement platforms */

  /* example 2: 32 fits in both types */
  unsigned char b_unsigned_char = 32;
  signed char b_signed_char = (signed char)b_unsigned_char;
  printf("%d, %d\n", b_unsigned_char, b_signed_char); /* 32, 32 */

  return 0;
}

In the first example, you have an unsigned char with value 192, or 11000000 in binary. After the cast to signed char, the bits are still 11000000, but that happens to be the two's-complement representation of -64. Signed values are stored in two's-complement representation on virtually all current platforms.

In the second example, our unsigned initial value (32) is less than 128, so it seems unaffected by the cast. The binary representation is 00100000, which is still 32 in 2s-complement representation.

To "safely" cast from unsigned char to signed char, ensure the value is less than 128.
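The "less than 128" rule can be sketched as a checked conversion. This is an illustration only: the name to_signed_char_checked is hypothetical, and SCHAR_MAX from <limits.h> is used in place of the literal 127 so the check does not depend on the width of char.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: store u in *out only if the value survives the
   conversion unchanged. Returns false and leaves *out alone otherwise. */
bool to_signed_char_checked(unsigned char u, signed char *out)
{
    assert(out != NULL);
    if (u > SCHAR_MAX) {
        return false;   /* e.g. 192 would not keep its value */
    }
    *out = (signed char)u;
    return true;
}
```

A caller can then decide what to do with values that do not fit, instead of silently getting a negative number.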

昔梦 2024-08-15 04:55:30

It depends on how you are going to use the pointer. You are just converting the pointer type.

逐鹿 2024-08-15 04:55:30

You can safely convert an unsigned char * to a char *, since the function you are calling expects the behavior of a char pointer. However, if any value is above 127, you will get a result you did not expect, so make certain that what you have in your unsigned array is also valid for a signed array.
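As a rough illustration of both halves of that advice, the sketch below passes an unsigned buffer to strlen (a standard function taking const char *) and counts the bytes that would read as negative through a signed pointer. The wrapper names ustrlen and count_negative_as_signed are hypothetical, and an 8-bit two's-complement signed char is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: strlen() takes const char *, so an unsigned char
   buffer needs a pointer cast. strlen only looks for the zero byte, so
   the signedness of the other bytes does not matter here. */
size_t ustrlen(const unsigned char *s)
{
    assert(s != NULL);
    return strlen((const char *)s);
}

/* Counts the bytes that read as negative through a signed char pointer,
   i.e. the bytes whose unsigned value is above 127. */
size_t count_negative_as_signed(const unsigned char *s, size_t n)
{
    const signed char *sp = (const signed char *)s;
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (sp[i] < 0) {
            ++count;
        }
    }
    return count;
}
```

A non-zero count is a hint that code treating the buffer as signed values may see results it was not written for.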

怀里藏娇 2024-08-15 04:55:30

I've seen it go wrong in a few ways, converting to a signed char from an unsigned char.

First, if you're using it as an index into an array, that index could go negative.

Second, if it is fed to a switch statement, the value may be negative, which is often something the switch isn't expecting.

Third, it behaves differently under a right shift:

int x = ...;
char c = 128;           /* typically -128 where char is signed and 8 bits wide */
unsigned char u = 128;

c >> x;

has a different result than

u >> x;

because the former is typically sign-extended and the latter isn't.

Fourth, a signed character causes underflow at a different point than an unsigned character.

So a common overflow check,

(c + x > c)

could return a different result than

(u + x > u)
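The array-index pitfall from the first point can be sketched as follows. Converting to unsigned char before indexing is a common way to avoid it. The names byte_index, count_bytes and BYTE_VALUES are hypothetical, and the table size of 256 assumes an 8-bit char.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: char is 8 bits wide, so a byte has 256 possible values. */
#define BYTE_VALUES 256

/* Converting to unsigned char first keeps the index in 0..255 even when
   c is negative. Indexing with c directly could read before the array. */
size_t byte_index(signed char c)
{
    return (size_t)(unsigned char)c;
}

/* Hypothetical histogram: one counter per possible byte value. */
void count_bytes(const signed char *data, size_t n,
                 unsigned counts[BYTE_VALUES])
{
    assert(data != NULL && counts != NULL);
    for (size_t i = 0; i < n; ++i) {
        counts[byte_index(data[i])]++;
    }
}
```

The same conversion is the usual remedy before handing a char to a table lookup or to a switch that only lists non-negative cases.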

幻想少年梦 2024-08-15 04:55:30

Safe if you are dealing with only ASCII data.
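One way to make the ASCII assumption explicit is to check it before casting. This is only a sketch; the function name is_all_ascii is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: true if every byte is in the 7-bit ASCII range
   (0..127), so it reads the same through signed and unsigned char. */
bool is_all_ascii(const unsigned char *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (buf[i] > 0x7F) {
            return false;
        }
    }
    return true;
}
```

If the check fails, the buffer contains bytes that will show up as negative values on platforms where char is signed.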

灰色世界里的红玫瑰 2024-08-15 04:55:30

I'm astonished it hasn't been mentioned yet: Boost numeric_cast should do the trick, but only for the data, of course.

Pointers are always pointers. By casting them to a different type, you only change the way the compiler interprets the data pointed to.
