Can't read integers from a stream after encoding it as UTF-8 instead of ASCII

Posted on 2024-09-26 13:35:28

I had problems with umlauts in ASCII, so I now decode my stream as UTF-8. That works, but it introduces a new problem. I normally read the 4 bytes before ARTIST to determine the length of ARTIST=WHOEVER, using:

UTF8Encoding enc = new UTF8Encoding();
string response = enc.GetString(message, 0, bytesRead);
int posArtist = response.IndexOf("ARTIST");
BitConverter.ToInt32(message, posArtist - 4);

This works perfectly for ASCII.

The hex-editor examples are only there to illustrate that reading the length no longer works the way it did with ASCII.

Here is an example screenshot from a hex editor (image not preserved):

"ARTIST=M.A.N.D.Y. vs. Booka Shade" Length = 21

However, that doesn't work for the UTF-8-encoded stream. Here is a screenshot (image not preserved):

"ARTIST=Paulseq" Length = E, but in the picture it's 2E.

What am I doing wrong here?

Comments (4)

梦情居士 2024-10-03 13:35:28

Your data is wrong: you actually have the character '\0' in the data where there should be binary zeroes.

The problem lies in how you created this data, not in how you read it.

不知在何时 2024-10-03 13:35:28

It is an utter mystery how you got 21 out of the ASCII data. The shaded byte is shown in hex; its real value is 33. There's no way you can get 21 out of BitConverter.ToInt32, since that would require the byte values (in hex) 15 00 00 00.

This must have worked by accident, but I have no idea what that accident might look like. Post more code, including the code that writes this data.
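The arithmetic in this answer can be checked directly. Note that "ARTIST=M.A.N.D.Y. vs. Booka Shade" is 33 characters long, and 33 is written 21 in hex, so the "Length = 21" in the question is consistent with a hex reading. The following is a minimal sketch; the class and method names are hypothetical, and it assumes a little-endian machine (`BitConverter.IsLittleEndian` is true), which is what the answer's byte order implies.

```csharp
using System;

public static class LengthPrefixDemo
{
    // Hex rendering of the 4 bytes BitConverter produces for an Int32.
    // On a little-endian machine the least significant byte comes first.
    public static string HexOf(int value)
    {
        return BitConverter.ToString(BitConverter.GetBytes(value));
    }

    // Reads a 4-byte length prefix from the start of a buffer.
    public static int DecodeLength(byte[] prefix)
    {
        return BitConverter.ToInt32(prefix, 0);
    }
}
```

Under that assumption, `LengthPrefixDemo.HexOf(21)` gives `15-00-00-00`, while `LengthPrefixDemo.DecodeLength(new byte[] { 0x21, 0x00, 0x00, 0x00 })` gives 33, matching the value the answer reads from the shaded byte.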

漆黑的白昼 2024-10-03 13:35:28

My guess is that you are mixing tools. That is a binary stream. It should be read with a BinaryReader and written with a BinaryWriter. When writing text, use Encoding.GetBytes to get the raw bytes to write, and when reading, use Encoding.GetString on the raw bytes you read. BinaryWriter and BinaryReader have methods for reading and writing values such as lengths directly.
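One way this advice could look in practice is sketched below. The framing (a 4-byte length followed by the UTF-8 bytes of the text) is an assumption based on the question, not a confirmed description of the asker's protocol, and `TagFrame` with its methods is a hypothetical name. The length is the number of bytes, which differs from the number of characters as soon as the text contains an umlaut.

```csharp
using System;
using System.IO;
using System.Text;

public static class TagFrame
{
    // Writes a 4-byte length prefix followed by the UTF-8 bytes of the text.
    public static byte[] Write(string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(payload.Length); // Int32, little-endian, counts bytes
            writer.Write(payload);        // raw bytes, no additional prefix
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reads the length as binary first, then decodes only the text bytes.
    public static string Read(byte[] frame)
    {
        using (var stream = new MemoryStream(frame))
        using (var reader = new BinaryReader(stream))
        {
            int length = reader.ReadInt32();
            byte[] payload = reader.ReadBytes(length);
            if (payload.Length != length)
                throw new EndOfStreamException("Frame is shorter than its length prefix.");
            return Encoding.UTF8.GetString(payload);
        }
    }
}
```

With this split, the length never passes through a text decoder, so `TagFrame.Read(TagFrame.Write(text))` returns the original text whether or not it contains non-ASCII characters.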

来世叙缘 2024-10-03 13:35:28

Only the strings should be UTF-8 encoded/decoded. If you're passing other (non-string) values in binary, the encoders will corrupt them.
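Applied to the code in the question, this could mean locating the ARTIST marker in the raw bytes rather than in the decoded string. A character index returned by `string.IndexOf` equals the byte index only while every preceding character occupies a single byte, which holds for ASCII but not for UTF-8 text with umlauts. Whether that is the actual cause here cannot be confirmed from the thread. The sketch below uses hypothetical names (`TagScanner`, `IndexOfBytes`, `ReadArtistTag`) and assumes the 4-byte prefix is little-endian and counts the bytes of the tag that follows it.

```csharp
using System;
using System.Text;

public static class TagScanner
{
    // Returns the byte index of the first occurrence of pattern within
    // the first count bytes of buffer, or -1 if it does not occur.
    public static int IndexOfBytes(byte[] buffer, int count, byte[] pattern)
    {
        for (int i = 0; i + pattern.Length <= count; i++)
        {
            int j = 0;
            while (j < pattern.Length && buffer[i + j] == pattern[j]) j++;
            if (j == pattern.Length) return i;
        }
        return -1;
    }

    // Finds "ARTIST" by byte offset, reads the length as binary, and
    // decodes only the tag bytes as UTF-8. Returns null if the marker is
    // missing or the prefix does not fit the data that was read.
    public static string ReadArtistTag(byte[] message, int bytesRead)
    {
        byte[] marker = Encoding.ASCII.GetBytes("ARTIST");
        int pos = IndexOfBytes(message, bytesRead, marker);
        if (pos < 4) return null; // no marker, or no room for the prefix

        int length = BitConverter.ToInt32(message, pos - 4);
        if (length < 0 || length > bytesRead - pos) return null;

        return Encoding.UTF8.GetString(message, pos, length);
    }
}
```

Here `message` and `bytesRead` play the same roles as in the question's code; the rest of the buffer is never decoded as text, so binary values elsewhere in the stream stay intact.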
