Why does C# Encoding.UTF8 decode different byte sequences to the same string?
byte[] b1 = new byte[] { 60, 239, 191, 189, 14, 239, 191, 189, 2, 14, 62, 32, 23, 37, 239, 191, 189, 239, 191, 189, 127, 58, 50, 52, 56, 32, 95, 112, 117, 98, 95, 110, 117, 98, 95, 99, 108, 105, 101, 110, 116, 46, 99, 112, 112, 32, 58, 111, 110, 82, 101, 99, 101, 105, 118, 101, 84, 97, 112, 78, 111, 116, 105, 102, 121, 68, 97, 116, 97, 13, 10 };
byte[] b2 = new byte[] { 60, 215, 14, 235, 164, 2, 14, 62, 32, 23, 37, 207, 255, 127, 58, 50, 52, 56, 32, 95, 112, 117, 98, 95, 110, 117, 98, 95, 99, 108, 105, 101, 110, 116, 46, 99, 112, 112, 32, 58, 111, 110, 82, 101, 99, 101, 105, 118, 101, 84, 97, 112, 78, 111, 116, 105, 102, 121, 68, 97, 116, 97, 13, 10 };
var s1 = Encoding.UTF8.GetString(b1);
var s2 = Encoding.UTF8.GetString(b2);
var sc1 = Encoding.UTF8.GetByteCount(s1);
var sc2 = Encoding.UTF8.GetByteCount(s2);
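Surprisingly, s1 == s2 evaluates to true: s1 and s2 now both contain the same string "<\uFFFD\u000e\uFFFD\u0002\u000e> \u0017%\uFFFD\uFFFD\u007f:248 _pub_nub_client.cpp :onReceiveTapNotifyData\r\n", even though the byte sequences differ.
In addition, sc1 and sc2 are both 71, but b2.Length == 64.
I want the number of bytes that s2 occupied in the original data. Encoding.UTF8.GetByteCount(s2) returns 71, which does not match b2.Length. Is there a good way to solve this?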
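A likely explanation for the numbers above, offered as a sketch rather than a definitive diagnosis: Encoding.UTF8 replaces every invalid byte sequence with U+FFFD, and U+FFFD re-encodes as the three bytes 239, 191, 189, which are exactly the runs visible in b1. In b2 the byte 215, the pair 235 164, the byte 207, and the byte 255 are not valid UTF-8. Assuming each of those four sequences becomes one U+FFFD, 5 bytes turn into 12, and 64 - 5 + 12 = 71. The class name ReplacementDemo and the helper IsValidUtf8 are illustrative names, not part of the original code.

```csharp
using System;
using System.Text;

public static class ReplacementDemo
{
    // Strict UTF-8: throws on invalid bytes instead of substituting U+FFFD.
    private static readonly UTF8Encoding Strict =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Hypothetical helper: true when the bytes are well-formed UTF-8.
    public static bool IsValidUtf8(byte[] bytes)
    {
        try
        {
            Strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // 0xFF can never appear in well-formed UTF-8.
        byte[] raw = { 0x41, 0xFF, 0x42 };

        // The default Encoding.UTF8 silently replaces the invalid byte.
        string s = Encoding.UTF8.GetString(raw);

        Console.WriteLine(s == "A\uFFFDB");                // True
        Console.WriteLine(raw.Length);                     // 3
        Console.WriteLine(Encoding.UTF8.GetByteCount(s));  // 5: U+FFFD is EF BF BD
        Console.WriteLine(IsValidUtf8(raw));               // False
    }
}
```

Because the replacement is lossy, the original byte count cannot be recovered from the string alone. It has to be taken from the byte array or the stream position.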
Comments (1)
Thanks @igor. By working with the byte array directly, I finally got what I wanted.
First we can use
public ValueTuple<long, int>? ReadLinePosition()
to read the start position and byte length of a line, and then we can use
public string? ReadString(long position, int length)
to get the actual value. I referred to the
StreamReader
source code: https://referencesource.microsoft.com/#mscorlib/system/io/streamreader.cs,b5fe1efcec14de32
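The answer gives only the two method signatures, so the following is a minimal sketch of how such a reader might look, not the author's actual implementation. The class name LinePositionReader, the requirement of a seekable stream, the choice to treat '\n' as the line terminator, and the choice to include the terminator in the reported length are all assumptions made for illustration.

```csharp
#nullable enable
using System;
using System.IO;
using System.Text;

// Hypothetical helper: reports lines as (byte offset, byte length) pairs.
public sealed class LinePositionReader
{
    private readonly Stream _stream;  // assumed to be seekable
    private long _next;               // byte offset where the next line starts

    public LinePositionReader(Stream stream)
    {
        if (!stream.CanSeek)
            throw new ArgumentException("Stream must be seekable.", nameof(stream));
        _stream = stream;
    }

    // Returns the start offset and byte length of the next line, including
    // its '\n' terminator if present, or null at end of stream.
    public ValueTuple<long, int>? ReadLinePosition()
    {
        _stream.Position = _next;
        long start = _next;
        int length = 0;
        int b;
        while ((b = _stream.ReadByte()) != -1)
        {
            length++;
            if (b == '\n') break;
        }
        if (length == 0) return null;
        _next = start + length;
        return (start, length);
    }

    // Decodes exactly `length` bytes starting at `position`; returns null
    // when the requested range lies outside the stream.
    public string? ReadString(long position, int length)
    {
        if (position < 0 || length < 0 || position + length > _stream.Length)
            return null;
        var buffer = new byte[length];
        _stream.Position = position;
        int read = 0;
        while (read < length)
        {
            int n = _stream.Read(buffer, read, length - read);
            if (n == 0) return null;
            read += n;
        }
        return Encoding.UTF8.GetString(buffer);
    }
}

public static class Program
{
    public static void Main()
    {
        // "A", one invalid byte, "B", line feed, then "C" without terminator.
        byte[] data = { 0x41, 0xFF, 0x42, 0x0A, 0x43 };
        var reader = new LinePositionReader(new MemoryStream(data));

        ValueTuple<long, int>? pos;
        while ((pos = reader.ReadLinePosition()) != null)
        {
            var (start, length) = pos.Value;
            string? text = reader.ReadString(start, length);
            Console.WriteLine($"start={start} bytes={length} chars={text?.Length}");
        }
    }
}
```

The point of the design is that the byte length is measured on the raw stream before any decoding happens, so invalid sequences that decode to U+FFFD do not inflate the count the way Encoding.UTF8.GetByteCount does.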