Type-safe conversion from unsigned to signed?
Is it safe to convert, say, from an unsigned char * to a signed char * (or just a char *)?
Comments (8)
The access is well-defined: you are allowed to access an object through a pointer to the signed or unsigned type corresponding to the dynamic type of the object (3.10/15).
Additionally, signed char is guaranteed not to have any trap values, so you can safely read through the signed char pointer no matter what the value of the original unsigned char object was.
You can, of course, expect that the values you read through one pointer will be different from the values you read through the other one.
Edit: regarding sellibitze's comment, this is what 3.9.1/1 says.
So indeed it seems that signed char may have trap values. Nice catch!
The conversion should be safe, as all you're doing is converting from one type of character to another, which should have the same size. Just be aware of what sort of data your code is expecting when you dereference the pointer, as the numeric ranges of the two data types are different. (i.e. if your number pointed by the pointer was originally positive as unsigned, it might become a negative number once the pointer is converted to a signed char* and you dereference it.)
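To make the point about numeric ranges concrete, here is a minimal sketch of reading the same buffer through both pointer types. The helper names sum_as_unsigned and sum_as_signed are made up for illustration, and the negative result assumes a two's-complement platform.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: sums the bytes as the unsigned values they were stored as.
int sum_as_unsigned(const unsigned char* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Hypothetical helper: reads the very same bytes through a signed char pointer.
// The pointer conversion itself does not touch the bytes; only their
// interpretation changes, so bytes above 127 come out negative
// (assuming two's complement).
int sum_as_signed(const unsigned char* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    const signed char* s = reinterpret_cast<const signed char*>(data);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += s[i];
    return total;
}
```

With a buffer holding 32 and 192, the unsigned sum would be 224, while the signed view sees 32 and -64.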
Casting changes the type, but does not affect the bit representation. Casting from unsigned char to signed char does not change the stored bits at all, but it affects what those bits mean.
Here is an example:
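The snippet for this answer is not present in this copy of the thread, so what follows is only a sketch of what the two cases could look like. The names to_signed and check_examples are assumptions, not the answerer's code. The conversion of values above 127 is implementation-defined before C++20; the asserted results assume two's complement.

```cpp
#include <cassert>

// Hypothetical helper: plain value conversion from unsigned char to signed char.
signed char to_signed(unsigned char u) {
    return static_cast<signed char>(u);
}

// Walks through the two cases; the expected values assume two's complement.
void check_examples() {
    assert(to_signed(192) == -64);  // 11000000 read as a signed 8-bit value
    assert(to_signed(32) == 32);    // 00100000 is below 128, so it is unchanged
}
```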
In the first example, you have an unsigned char with value 192, or 11000000 in binary. After the cast to signed char, the bits are still 11000000, but that happens to be the two's-complement representation of -64. Signed values are stored in two's-complement representation on practically all current platforms.
In the second example, our unsigned initial value (32) is less than 128, so it seems unaffected by the cast. The binary representation is 00100000, which is still 32 in two's-complement representation.
To "safely" cast from unsigned char to signed char, ensure the value is less than 128.
It depends on how you are going to use the pointer. You are just converting the pointer type.
You can safely convert an unsigned char* to a char *, as the function you are calling will be expecting the behavior of a char pointer. However, if any of your values goes over 127, you will get a result that is not what you expected, so make certain that what you have in your unsigned array is valid for a signed array.
I've seen it go wrong in a few ways, converting to a signed char from an unsigned char.
One, if you're using it as an index into an array, that index could go negative.
Second, if it is fed to a switch statement, it may produce a negative input, which is often something the switch isn't expecting.
Third, it behaves differently under an arithmetic right shift: shifting a signed char has a different result than shifting an unsigned char, because the former is sign-extended and the latter isn't.
Fourth, a signed character underflows at a different point than an unsigned character, so a common overflow check written for one could return a different result when applied to the other.
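The original snippets for this answer are not present here, so as a rough illustration of the first and third points, here is a sketch with made-up helper names (shift_signed, shift_unsigned, lookup). Right-shifting a negative value is implementation-defined before C++20; the comments assume the usual arithmetic shift.

```cpp
#include <cassert>

// Both operands are promoted to int before shifting. A negative signed char
// stays negative after promotion, so the shift keeps the sign.
int shift_signed(signed char c) { return c >> 1; }

// An unsigned char promotes to a non-negative int, so zeros are shifted in.
int shift_unsigned(unsigned char c) { return c >> 1; }

// Hypothetical table lookup: converting to unsigned char first keeps the
// index in 0..255 even when plain char is signed on the platform.
int lookup(const int* table256, char c) {
    assert(table256 != nullptr);
    return table256[static_cast<unsigned char>(c)];
}
```

The same bit pattern 11000000 would shift to -32 as a signed char and to 96 as an unsigned char.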
It is safe if you are dealing with only ASCII data, since every ASCII code is below 128.
I'm astonished it hasn't been mentioned yet: Boost numeric_cast should do the trick, but only for the data, of course.
Pointers are always pointers. By casting them to a different type, you only change the way the compiler interprets the data pointed to.
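For readers without Boost at hand, the idea of a range-checked conversion of the data can be sketched with the standard library alone. This is not Boost's implementation; checked_to_signed is a hypothetical name, and the choice of std::range_error is an assumption made for the example.

```cpp
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for a range-checked cast (not Boost's code):
// refuses any value that does not fit in signed char.
signed char checked_to_signed(unsigned char u) {
    const int max_signed = std::numeric_limits<signed char>::max();
    if (static_cast<int>(u) > max_signed) {
        throw std::range_error("value does not fit in signed char");
    }
    return static_cast<signed char>(u);
}
```

A check like this only guards individual values; it does nothing for a buffer that is merely viewed through a converted pointer.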