How come this happens:
char a = '\uffff'; //Highest value that char can take - 65535
byte b = (byte)a; //Casting a 16-bit value into 8-bit data type...! Isn't data lost here?
char c = (char)b; //Let's get the value back
int d = (int)c;
System.out.println(d); //65535... how?
Basically, I saw that a char is 16-bit. Therefore, if you cast it into a byte, how come no data is lost? (Value is the same after casting into an int)
Thanks in advance for answering this little ignorant question of mine. :P
EDIT: Woah, found out that my original output actually did as expected, but I just updated the code above. Basically, a character is cast into a byte and then cast back into a char, and its original, 2-byte value is retained. How does this happen?
As trojanfoe states, your confusion on the results of your code is partly due to sign-extension. I'll try to add a more detailed explanation that may help with your confusion.
As you noted, this DOES result in the loss of information. This is considered a narrowing conversion. Converting a char to a byte "simply discards all but the n lowest order bits".
The result is:
0xFFFF -> 0xFF
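The narrowing step can be checked with a small sketch. The class name `NarrowingDemo` and the helper `lowByte` are made up for illustration; only the cast itself comes from the question.

```java
class NarrowingDemo {
    // Narrowing char -> byte: only the low 8 bits of the 16-bit char survive.
    static byte lowByte(char c) {
        return (byte) c;
    }

    public static void main(String[] args) {
        char[] samples = {'\uffff', '\u1234', '\u0080', 'A'};
        for (char c : samples) {
            byte b = lowByte(c);
            // Mask with 0xFF so the byte prints as two hex digits
            // instead of being sign-extended by the int promotion.
            System.out.printf("0x%04X -> 0x%02X (as signed byte: %d)%n",
                    (int) c, b & 0xFF, b);
        }
    }
}
```

For `'\uffff'` the surviving bits are `0xFF`, which a signed byte reads as -1; for `'\u1234'` they are `0x34`, so the high byte `0x12` is gone for good.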
Converting a byte to a char is considered a special conversion. It actually performs TWO conversions. First, the byte is SIGN-extended (the new high order bits are copied from the old sign bit) to an int (a normal widening conversion). Second, the int is converted to a char with a narrowing conversion.
The result is:
0xFF -> 0xFFFFFFFF -> 0xFFFF
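To make the two-step nature of the byte-to-char cast visible, here is a minimal sketch that performs the steps separately and compares the result with the direct cast. The names `ByteToCharDemo`, `widen`, and `narrow` are hypothetical helpers, not anything from the JLS.

```java
class ByteToCharDemo {
    // Step 1: widening byte -> int copies the sign bit into the new high bits.
    static int widen(byte b) {
        return b;
    }

    // Step 2: narrowing int -> char keeps only the low 16 bits.
    static char narrow(int i) {
        return (char) i;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;
        int widened = widen(b);
        char viaSteps = narrow(widened);
        char direct = (char) b; // the single cast used in the question

        System.out.printf("byte 0x%02X -> int 0x%08X -> char 0x%04X%n",
                b & 0xFF, widened, (int) viaSteps);
        System.out.println("same as direct cast: " + (direct == viaSteps));
    }
}
```

The intermediate int is where the "lost" high bits reappear: they are not recovered from the original char, they are simply copies of the byte's sign bit.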
Converting a char to an int is considered a widening conversion. When a char type is widened to an integral type, it is ZERO-extended (the new high order bits are set to 0).
The result is:
0xFFFF -> 0x0000FFFF
When printed, this will give you 65535.
The three links I provided are the official Java Language Specification details on primitive type conversions. I HIGHLY recommend you take a look. They are not terribly verbose (and in this case relatively straightforward). They detail exactly what Java will do behind the scenes with type conversions. This is a common area of misunderstanding for many developers. Post a comment if you are still confused with any step.
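As a side-by-side illustration of zero extension versus sign extension, the following sketch widens the same 16-bit pattern once as a char and once as a short. The short comparison is an addition for contrast and is not part of the original question; the class and method names are illustrative.

```java
class WideningDemo {
    // char is unsigned, so widening to int fills the high bits with zeros.
    static int fromChar(char c) {
        return c;
    }

    // short is signed, so widening to int copies the sign bit instead.
    static int fromShort(short s) {
        return s;
    }

    public static void main(String[] args) {
        char c = '\uffff';
        short s = (short) 0xFFFF; // same 16 bits as c

        System.out.printf("char  0xFFFF -> int 0x%08X = %d%n", fromChar(c), fromChar(c));
        System.out.printf("short 0xFFFF -> int 0x%08X = %d%n", fromShort(s), fromShort(s));
    }
}
```

The char line prints 65535 and the short line prints -1, even though both start from identical bits.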
It's sign extension. Try \u1234 instead of \uffff and see what happens.
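The suggested experiment might look like the sketch below, which runs the question's cast chain on a few inputs. `RoundTripDemo` and `roundTrip` are made-up names; the third sample value is an extra case added to show a negative low byte.

```java
class RoundTripDemo {
    // Same cast chain as in the question: char -> byte -> char -> int.
    static int roundTrip(char a) {
        byte b = (byte) a;
        char c = (char) b;
        return c;
    }

    public static void main(String[] args) {
        char[] samples = {'\uffff', '\u1234', '\u1280'};
        for (char a : samples) {
            System.out.printf("0x%04X -> %d (0x%04X)%n",
                    (int) a, roundTrip(a), roundTrip(a));
        }
    }
}
```

Only 0xFFFF survives unchanged, because its low byte is all ones and sign extension happens to restore the high byte. 0x1234 comes back as 0x0034 (52), and 0x1280 comes back as 0xFF80 (65408) because its low byte 0x80 is negative.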
Java's byte is signed, which is counterintuitive. In almost all situations where a byte is used, programmers would want an unsigned byte instead, so it's extremely likely a bug if a byte is cast to int directly. This does the intended conversion correctly in almost all programs:
Empirically, the choice of signed byte is a mistake.
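The snippet this answer refers to did not survive on this page. A common idiom for reading a byte as unsigned, which may or may not be what the answer showed, is to mask with `0xFF`; `Byte.toUnsignedInt` (Java 8 and later) does the same thing. The class and helper names below are illustrative.

```java
class UnsignedByteDemo {
    // Plain widening: sign-extends, so 0xFF becomes -1.
    static int signedValue(byte b) {
        return b;
    }

    // Masking clears the sign-extended high bits, giving 0..255.
    static int unsignedValue(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;
        System.out.println("signed:   " + signedValue(b));
        System.out.println("unsigned: " + unsignedValue(b));
        // Library equivalent of the mask, available since Java 8.
        System.out.println("library:  " + Byte.toUnsignedInt(b));
    }
}
```

Applied to the question, `(char) (b & 0xFF)` would yield 0x00FF (255) rather than 0xFFFF, which makes the data loss from the first cast visible.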
Some rather strange stuff is going on on your machine. Take a look at the Java Language Specification, section 4.2.1:
... snip others...
If your JVM is standards compliant, then your output should be -1.