StreamReader shows a strange error: �
StreamReader reads '–' (Alt+0150) as � even though I am using UTF-8 encoding and have detectEncodingFromByteOrderMarks set to true. Can anyone guide me on this?
Comments (3)
That byte (0x96, which is what Alt+0150 produces under code page 1252) does not occur on its own in UTF-8 encoded text, so the reader substitutes �. The character is '\u2013' (en dash), and UTF-8 encodes it as three bytes: 0xE2 0x80 0x93. If you get this character when you type Alt+0150 on the numeric keypad, your default system code page is probably 1252. Simply pass Encoding.Default to the StreamReader constructor.
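The byte values above can be checked directly. The following is a minimal sketch, assuming .NET Core 3.0 or later (where code page 1252 has to be registered before use); the class and method names are hypothetical and exist only for illustration.

```csharp
using System;
using System.Text;

public static class EnDashBytes
{
    // Decodes a single 0x96 byte with the given encoding.
    public static string DecodeAnsiDash(Encoding encoding)
    {
        return encoding.GetString(new byte[] { 0x96 });
    }

    // Returns the UTF-8 bytes of U+2013 as a hex string.
    public static string Utf8BytesOfEnDash()
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes("\u2013"));
    }

    public static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered. On .NET Framework this line is not needed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // A lone 0x96 is invalid UTF-8, so it decodes to U+FFFD (the � character).
        Console.WriteLine(DecodeAnsiDash(Encoding.UTF8) == "\uFFFD");            // True

        // The same byte under code page 1252 is the en dash.
        Console.WriteLine(DecodeAnsiDash(Encoding.GetEncoding(1252)) == "\u2013"); // True

        // In real UTF-8 text the en dash takes three bytes.
        Console.WriteLine(Utf8BytesOfEnDash());                                  // E2-80-93
    }
}
```

This suggests the file in question was most likely saved with a single-byte ANSI code page rather than UTF-8.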
You need to know the encoding that was used to encode the text. There's no way around that. Try different encodings until you get the desired result.
From MSDN:
In other words, the BOM is only an extra hint that may or may not work and can easily be overridden.
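To illustrate why BOM detection does not help here, the sketch below feeds two in-memory byte sequences to a StreamReader configured as in the question (UTF-8, detectEncodingFromByteOrderMarks set to true). The class name, method name, and sample bytes are hypothetical and chosen only for the demonstration.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDetectionDemo
{
    // Reads the bytes as the question does: UTF-8 requested, BOM detection enabled.
    public static string ReadAsUtf8WithBomDetection(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // ANSI (code page 1252) en dash, no BOM: nothing to detect,
        // so the reader stays with UTF-8 and yields U+FFFD.
        byte[] ansiNoBom = { 0x96 };
        Console.WriteLine(ReadAsUtf8WithBomDetection(ansiNoBom) == "\uFFFD");    // True

        // UTF-16 LE BOM (FF FE) followed by U+2013: the BOM is detected
        // and overrides the UTF-8 encoding passed to the constructor.
        byte[] utf16WithBom = { 0xFF, 0xFE, 0x13, 0x20 };
        Console.WriteLine(ReadAsUtf8WithBomDetection(utf16WithBom) == "\u2013"); // True
    }
}
```

ANSI files carry no BOM, so detection has nothing to act on and the encoding passed to the constructor decides the result.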
As the other users wrote, the probable cause of this issue is that the file you are trying to read is saved in ANSI encoding. I reproduced the issue you described when I saved the file in ANSI encoding.
Try this code:
The Encoding.Default parameter is important here. This code should read the character you mentioned correctly.
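A minimal sketch of such a read might look like the following. It is an illustration, not the original poster's code: the class name, helper method, and temporary file are hypothetical. It also names code page 1252 explicitly instead of using Encoding.Default, because Encoding.Default returns the system ANSI code page on .NET Framework but UTF-8 on .NET Core and later. The sketch assumes .NET Core 3.0 or later and a file saved in code page 1252.

```csharp
using System;
using System.IO;
using System.Text;

public static class AnsiFileReader
{
    // Reads a whole text file with an explicitly chosen encoding.
    public static string ReadAllText(string path, Encoding encoding)
    {
        using (var reader = new StreamReader(path, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Assumption: .NET Core / .NET 5+, where code page 1252 must be registered.
        // On .NET Framework, passing Encoding.Default would serve the same purpose
        // when the system ANSI code page is 1252.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding ansi = Encoding.GetEncoding(1252);

        // Hypothetical sample file containing "a – b" saved as ANSI (0x96 is the dash).
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 0x61, 0x20, 0x96, 0x20, 0x62 });

        Console.WriteLine(ReadAllText(path, ansi) == "a \u2013 b");          // True
        Console.WriteLine(ReadAllText(path, Encoding.UTF8) == "a \uFFFD b"); // True

        File.Delete(path);
    }
}
```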