TCP receiving extended ASCII or UTF-8 characters
For the inverted question mark ¿, I receive two bytes, [-62][-65]. How would I get a readable UTF-8 or ASCII character encoding from them?
Comments (3)
That is the UTF-8 encoding of that character. The inverted question mark is Unicode code point 191 (U+00BF), which in UTF-8 is 0xc2:0xbf.
You're seeing the bytes as signed values. For example, -62 signed is 256-62, or 194 unsigned, which is hex 0xc2. Similarly, -65 signed is 256-65, or 191 unsigned, which is hex 0xbf.
If you want to convert your UTF-8 sequence into a code point, match it against the UTF-8 bit patterns. For example, your 0xc2:0xbf is binary 11000010 10111111, which matches the second case, the two-byte form 110xxxxx 10xxxxxx. Dropping the 110 and 10 prefixes leaves the payload bits 00010 111111, which is 191.
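The arithmetic in this answer can be sketched in Python. This is only an illustration under simple assumptions, not the answerer's code: the helper names to_unsigned and decode_two_byte are made up for the example, and the signed values are the ones from the question.

```python
def to_unsigned(b: int) -> int:
    """Map a signed byte (-128..127) to its unsigned value (0..255)."""
    return b & 0xFF


def decode_two_byte(lead: int, cont: int) -> int:
    """Decode a two-byte UTF-8 sequence (110xxxxx 10xxxxxx) to a code point."""
    if lead & 0xE0 != 0xC0 or cont & 0xC0 != 0x80:
        raise ValueError("not a two-byte UTF-8 sequence")
    # Keep the 5 payload bits of the lead byte and the 6 of the continuation byte.
    return ((lead & 0x1F) << 6) | (cont & 0x3F)


signed = [-62, -65]                      # values from the question
raw = [to_unsigned(b) for b in signed]   # unsigned bytes 0xc2, 0xbf
code_point = decode_two_byte(*raw)       # code point U+00BF
char = bytes(raw).decode("utf-8")        # the built-in decoder, for comparison
```

In practice the built-in decoder on the last line is all that is needed; the manual version only makes the bit layout visible, and it does not reject overlong encodings the way a full decoder does.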
Those 2 bytes are probably UTF-8.
For ASCII you would need a specific code page, since plain 7-bit ASCII has no ¿.
And what exactly is a 'readable' character encoding?
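To illustrate the code page point, here is a small Python sketch. ISO-8859-1 (Latin-1) is chosen only as an example of a single-byte code page; the question does not say which one, if any, the sender uses.

```python
data = bytes([0xC2, 0xBF])            # the two received bytes, unsigned

as_utf8 = data.decode("utf-8")        # one character: the inverted question mark
as_latin1 = data.decode("latin-1")    # wrong code page: two characters

# In Latin-1 the same character is the single byte 0xBF.
latin1_bytes = "\u00bf".encode("latin-1")

# Plain 7-bit ASCII cannot represent these bytes, so decoding fails.
try:
    data.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

Decoding with the wrong code page does not raise an error here; it silently yields two unrelated characters, which is why sender and receiver need to agree on the encoding.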
Look at the byte values in hexadecimal: -62 is 0xC2 and -65 is 0xBF.
If you look up the Unicode information for the glyph in question, you can see that these are, indeed, the two bytes that make up the UTF-8 encoding of the inverted question mark.
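The same check can be done programmatically. A minimal Python sketch, assuming the bytes arrive as the signed values printed in the question; the variable names are illustrative only:

```python
import unicodedata

signed = [-62, -65]                                      # as printed in the question
hex_values = [format(b & 0xFF, "02x") for b in signed]   # hexadecimal view of each byte

char = bytes(b & 0xFF for b in signed).decode("utf-8")
name = unicodedata.name(char)                 # official Unicode character name
round_trip = char.encode("utf-8").hex()       # re-encode to confirm the byte pair
```

Here unicodedata.name plays the role of looking up the character in the Unicode charts.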