Can't read integers in a stream after decoding it as UTF-8 instead of ASCII
I had problems with umlauts in ASCII, so I now decode my stream as UTF-8. That works, but it brings up a new problem. I normally read the 4 bytes before ARTIST to determine the length of ARTIST=WHOEVER, using:
UTF8Encoding enc = new UTF8Encoding();
string response = enc.GetString(message, 0, bytesRead);
int posArtist = response.IndexOf("ARTIST");
BitConverter.ToInt32(message, posArtist - 4);
This works for ASCII perfectly.
The hex-editor examples are just to illustrate that reading the length no longer works the way it did with ASCII.
Here is an example-screenshot from a hex-editor:
"ARTIST=M.A.N.D.Y. vs. Booka Shade" Length = 21
However, that doesn't work for the UTF-8-encoded stream.
Here is a screenshot:
"ARTIST=Paulseq" Length = E, but in the picture it's 2E.
What am I doing wrong here?
Comments (4)
Your data is wrong: you actually have the character '\0' in the data where there should be binary zeroes.
The problem lies in how you created this data, not in how you read it.
It is an utter mystery how you got 21 out of the ASCII data. The shaded byte is shown in hex; its real (decimal) value is 33. There's no way you can get 21 out of BitConverter.ToInt32, since that requires the byte values (in hex) 15 00 00 00.
This must have worked by accident, but I have no idea what that accident might look like. Post more code, including the code that writes this.
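To make the arithmetic in this answer concrete, here is a minimal sketch of how four bytes map to an Int32. The class and method names are hypothetical, and it assumes the length prefix is stored little-endian, which matches the 15 00 00 00 layout mentioned above.

```csharp
using System;

public static class LengthPrefixDemo
{
    // Interprets the four bytes at 'offset' as a little-endian Int32,
    // independent of the byte order of the machine running the code.
    public static int ReadInt32LittleEndian(byte[] buffer, int offset)
    {
        byte[] four = new byte[4];
        Array.Copy(buffer, offset, four, 0, 4);
        if (!BitConverter.IsLittleEndian)
        {
            // BitConverter uses the machine's byte order, so flip on big-endian hosts.
            Array.Reverse(four);
        }
        return BitConverter.ToInt32(four, 0);
    }
}
```

With this helper, the bytes 15 00 00 00 yield decimal 21, while 21 00 00 00 (the value shown in the hex editor) yield decimal 33, which is the number of characters in "ARTIST=M.A.N.D.Y. vs. Booka Shade".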
My guess is that you are mixing tools. That is a binary stream. It should be read with a BinaryReader and written with a BinaryWriter. When writing text, use Encoding.GetBytes to get the raw bytes to write, and when reading, use Encoding.GetString on the raw bytes read. BinaryWriter and BinaryReader have methods for reading and writing values (like lengths) directly.
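A rough sketch of what this answer describes might look like the following. The layout (a 4-byte length followed by the UTF-8 bytes of the text) and the names TagCodec, WriteField, ReadField and RoundTrip are assumptions for illustration; the asker's actual format is not shown in the thread.

```csharp
using System;
using System.IO;
using System.Text;

public static class TagCodec
{
    // Writes a 4-byte length (counted in bytes, not characters),
    // followed by the UTF-8 bytes of the text.
    public static void WriteField(BinaryWriter writer, string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);
        writer.Write(payload.Length); // Int32, written little-endian
        writer.Write(payload);        // raw bytes, no extra prefix
    }

    // Reads the length as binary first, then decodes only the text bytes.
    public static string ReadField(BinaryReader reader)
    {
        int length = reader.ReadInt32();
        byte[] payload = reader.ReadBytes(length);
        if (payload.Length != length)
        {
            throw new EndOfStreamException("Field is shorter than its length prefix.");
        }
        return Encoding.UTF8.GetString(payload);
    }

    // Writes one field to memory and reads it back.
    public static string RoundTrip(string text)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream, Encoding.UTF8, true))
            {
                WriteField(writer, text);
            }
            stream.Position = 0;
            using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
            {
                return ReadField(reader);
            }
        }
    }
}
```

The point of the design is that the length never passes through a text decoder: only the payload bytes are handed to Encoding.UTF8, so umlauts survive and the integer stays intact.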
Only the strings should be UTF-8 encoded/decoded. If you're passing other (non-string) values in binary, the encoders will destroy them.
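One related pitfall, offered as a likely contributor rather than a confirmed diagnosis: once the whole buffer is decoded as UTF-8, string.IndexOf returns a character index, and that no longer equals the byte offset if any multi-byte character (such as an umlaut) comes earlier in the data. A minimal sketch that searches the raw bytes instead is shown below; ByteSearch and the sample text are hypothetical.

```csharp
using System;
using System.Text;

public static class ByteSearch
{
    // Returns the byte offset of the first occurrence of 'pattern'
    // in 'buffer', or -1 if it does not occur.
    public static int IndexOf(byte[] buffer, byte[] pattern)
    {
        for (int i = 0; i <= buffer.Length - pattern.Length; i++)
        {
            int j = 0;
            while (j < pattern.Length && buffer[i + j] == pattern[j])
            {
                j++;
            }
            if (j == pattern.Length)
            {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical sample: a field containing an umlaut precedes ARTIST.
    public static byte[] BuildSample()
    {
        return Encoding.UTF8.GetBytes("TITLE=F\u00FCr ARTIST=Paulseq");
    }
}
```

For the sample above, decoding the bytes and calling IndexOf("ARTIST", StringComparison.Ordinal) on the string gives 10, while ByteSearch.IndexOf on the raw bytes gives 11, because the umlaut occupies two bytes in UTF-8. Passing the character index to BitConverter.ToInt32 would therefore read from the wrong position.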