Byte array to String and back: issues with -127
In the following:
scala> (new String(Array[Byte](1, 2, 3, -1, -2, -127))).getBytes
res12: Array[Byte] = Array(1, 2, 3, -1, -2, 63)
Why is -127 converted to 63, and how do I get it back as -127?
[EDIT:] Java version below (to show that it's not just a "Scala problem")
c:\tmp>type Main.java
public class Main {
    public static void main(String[] args) {
        byte[] b = {1, 2, 3, -1, -2, -127};
        byte[] c = new String(b).getBytes();
        for (int i = 0; i < 6; i++) {
            System.out.println("b:" + b[i] + "; c:" + c[i]);
        }
    }
}
c:\tmp>javac Main.java
c:\tmp>java Main
b:1; c:1
b:2; c:2
b:3; c:3
b:-1; c:-1
b:-2; c:-2
b:-127; c:63
The constructor you're calling makes it non-obvious that binary-to-string conversions use a decoding:
String(byte[] bytes, Charset charset). What you want is to use no decoding at all.
Fortunately, there's a constructor for that: String(char[] value).
Now you have the data in a string, but you want it back exactly as is. But guess what! getBytes(Charset charset). That's right, there's an encoding applied automatically also. Fortunately, there is a toCharArray() method.
If you must start with bytes and end with bytes, you then have to map the char arrays to bytes:
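The code that followed this sentence did not survive on this page. A minimal Java sketch of the idea, assuming hypothetical helper names (CharLevelRoundTrip, bytesToString, stringToBytes) that are not from the original answer, might look like this. Each byte becomes one char and back again, so no charset is involved at either step.

```java
import java.util.Arrays;

// Sketch only: class and method names are illustrative, not from the original answer.
class CharLevelRoundTrip {
    // Widen each byte to a char in the range 0..255 (the & 0xFF drops the sign).
    static String bytesToString(byte[] bytes) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            chars[i] = (char) (bytes[i] & 0xFF);
        }
        return new String(chars); // String(char[] value): no decoding step
    }

    // Read the chars back out and narrow each one to a byte.
    static byte[] stringToBytes(String s) {
        char[] chars = s.toCharArray(); // toCharArray(): no encoding step
        byte[] bytes = new byte[chars.length];
        for (int i = 0; i < chars.length; i++) {
            bytes[i] = (byte) chars[i];
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] b = {1, 2, 3, -1, -2, -127};
        byte[] c = stringToBytes(bytesToString(b));
        System.out.println(Arrays.toString(c)); // [1, 2, 3, -1, -2, -127]
    }
}
```

The string produced this way is only a container for the bytes; printing it or passing it through getBytes() would reintroduce the encoding step.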
So, to summarize: converting between String and Array[Byte] involves encoding and decoding. If you want to put binary data in a string, you have to do it at the level of characters. Note, however, that this will give you a garbage string (i.e. the result will not be well-formed UTF-16, as String is expected to be), and so you'd better read it out as characters and convert it back to bytes.
You could shift the bytes up by, say, adding 512; then you'd get a bunch of valid single Char code points. But this is using 16 bits to represent every 8, a 50% encoding efficiency. Base64 is a better option for serializing binary data (8 bits to represent 6, 75% efficient).
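For the Base64 route mentioned above, a short sketch using java.util.Base64 (available since Java 8) could look like the following. The class name Base64RoundTrip and its two helpers are made up for illustration; only the java.util.Base64 calls are standard library API.

```java
import java.util.Arrays;
import java.util.Base64;

// Sketch only: the wrapper class and method names are illustrative.
class Base64RoundTrip {
    // Encode arbitrary bytes as ASCII-only text (4 output chars per 3 input bytes).
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Recover the original bytes from the Base64 text.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] b = {1, 2, 3, -1, -2, -127};
        String text = encode(b);
        System.out.println(text);                          // AQID//6B
        System.out.println(Arrays.toString(decode(text))); // [1, 2, 3, -1, -2, -127]
    }
}
```

The resulting text contains only ASCII letters, digits, '+', '/' and '=', so it passes through ordinary charset conversions unchanged.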
String is for storing text, not binary data.
In your default character encoding there is no character for the byte -127, so decoding replaces it with a placeholder, and encoding that placeholder back to bytes yields '?', which is 63.
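The replacement can be reproduced on any machine by naming a charset explicitly instead of relying on the platform default. The sketch below uses US-ASCII as a stand-in, which is an assumption: the post does not say what the asker's default charset was (the fact that -1 and -2 survived suggests a single-byte Windows code page). The class name ReplacementDemo is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch only: US-ASCII stands in for the asker's unknown default charset.
class ReplacementDemo {
    // Decode bytes as US-ASCII, then encode the resulting string as US-ASCII again.
    static byte[] throughAscii(byte[] bytes) {
        String s = new String(bytes, StandardCharsets.US_ASCII);
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] b = {1, 2, 3, -127};

        // The byte -127 (0x81) has no US-ASCII character, so decoding
        // substitutes U+FFFD, the Unicode replacement character.
        String s = new String(b, StandardCharsets.US_ASCII);
        System.out.println((int) s.charAt(3)); // 65533

        // U+FFFD cannot be encoded in US-ASCII either, so encoding emits '?' (63).
        System.out.println(Arrays.toString(throughAscii(b))); // [1, 2, 3, 63]
    }
}
```

The original byte value is lost at the decoding step, so nothing done to the string afterwards can bring -127 back.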
EDIT: Base64 is the best option; even better would be to not use text to store binary data at all. It can be done, but not with any standard character encoding, i.e. you have to do the encoding yourself.
To answer your question literally, you can use your own character encoding. This is a very bad idea as any text is likely to get encoded and mangled in the same way as you have seen. Using Base64 avoids this by using characters which are safe in any encoding.
StringOps has a method getBytes; I think that is probably what one actually wants for converting String to Array[Byte]: http://www.scala-lang.org/api/2.10.2/index.html#scala.collection.immutable.StringOps
Use the correct charset:
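The snippet that accompanied this answer is missing from this page. One plausible reading, offered here as an assumption rather than the answerer's actual code, is to name ISO-8859-1 for both directions: that charset maps each of the 256 byte values to exactly one character, so the round trip is lossless. The class name Latin1RoundTrip is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch only: ISO-8859-1 is an assumed choice; the original snippet is not available.
class Latin1RoundTrip {
    // Decode and encode with the same one-byte-per-character charset.
    static byte[] roundTrip(byte[] bytes) {
        String s = new String(bytes, StandardCharsets.ISO_8859_1);
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] b = {1, 2, 3, -1, -2, -127};
        System.out.println(Arrays.toString(roundTrip(b))); // [1, 2, 3, -1, -2, -127]
    }
}
```

This only works if the same charset is named on both sides; calling getBytes() with no argument on the intermediate string would fall back to the platform default again.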