How can I determine the index of a character in codepage 850 in C#?
I have a text file that is encoded with codepage 850. I read the file like this:

    using (var reader = new StreamReader(filePath, Encoding.GetEncoding(850)))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            //...
        }
        //...
    }

Now, for every character in the string line in the loop above, I need the zero-based index that the character has in codepage 850, something like:

    for (int i = 0; i < line.Length; i++)
    {
        int indexInCodepage850 = GetIndexInCodepage850(line[i]); // ?
        //...
    }

Is this possible, and what could int GetIndexInCodepage850(char c) look like?
Comments (3)
Use Encoding.GetBytes() on the line, called on the same CP850 encoding you read the file with. CP850 is an 8-bit encoding, so the byte array has exactly as many elements as the string has characters, and each element is the code of the corresponding character in the code page.
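A minimal sketch of how GetIndexInCodepage850 from the question might be built on GetBytes(). The class name Cp850 and the method GetIndexesInCodepage850 are made up for illustration. The RegisterProvider call is an assumption about the runtime: on .NET Core and .NET 5+ code page 850 is only available once CodePagesEncodingProvider is registered, while on .NET Framework the code page is built in.

```csharp
using System;
using System.Text;

static class Cp850
{
    // Hypothetical helper class; the encoding is created once and reused.
    static readonly Encoding Encoding850 = CreateEncoding();

    static Encoding CreateEncoding()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before Encoding.GetEncoding(850) can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(850);
    }

    // Returns the code (0..255) that the character has in code page 850.
    // A character that does not exist in CP850 is substituted by the
    // encoder's fallback, so the result is then not meaningful.
    public static int GetIndexInCodepage850(char c)
    {
        byte[] bytes = Encoding850.GetBytes(new[] { c });
        return bytes[0];
    }

    // Converts a whole line at once: result[i] is the CP850 code of line[i],
    // provided every character of the line exists in CP850.
    public static byte[] GetIndexesInCodepage850(string line)
    {
        return Encoding850.GetBytes(line);
    }
}
```

For text that was itself decoded from a CP850 file, every character exists in the code page, so the fallback case should not arise. Encoding the whole line once is likely cheaper than calling the per-character method in a loop.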
Just read the file as bytes, and you have the codepage 850 character codes:
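A minimal sketch of reading the raw bytes, assuming File.ReadAllBytes is acceptable for the file size. The class RawCp850 and the method ReadCodes are hypothetical names used only for illustration.

```csharp
using System.IO;

static class RawCp850
{
    // Reads the file without any decoding. Because CP850 is a single-byte
    // encoding, each byte is already the code page 850 code of one character.
    public static int[] ReadCodes(string filePath)
    {
        byte[] data = File.ReadAllBytes(filePath);
        int[] codes = new int[data.Length];
        for (int i = 0; i < data.Length; i++)
        {
            codes[i] = data[i]; // value in the range 0..255
        }
        return codes;
    }
}
```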
The data is not separated into lines, though. To find the line breaks, look for the character codes of CR and LF in the data, which are 13 and 10.
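One way the line splitting could be sketched, assuming lines end with LF or CR LF. The names Cp850Lines and SplitLines are hypothetical. Unlike StreamReader.ReadLine(), this sketch does not treat a lone CR as a line break.

```csharp
using System;
using System.Collections.Generic;

static class Cp850Lines
{
    // Splits raw CP850 bytes into lines at LF (10) and drops a CR (13)
    // that directly precedes the LF. The line terminators are not returned.
    public static List<byte[]> SplitLines(byte[] data)
    {
        var lines = new List<byte[]>();
        int start = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] != 10) continue;
            int end = (i > start && data[i - 1] == 13) ? i - 1 : i;
            lines.Add(Slice(data, start, end));
            start = i + 1;
        }
        // Keep a final line that has no trailing line break.
        if (start < data.Length)
        {
            lines.Add(Slice(data, start, data.Length));
        }
        return lines;
    }

    // Copies data[start..end) into a new array.
    static byte[] Slice(byte[] data, int start, int end)
    {
        var result = new byte[end - start];
        Array.Copy(data, start, result, 0, result.Length);
        return result;
    }
}
```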
You don't need to.
You are already specifying the encoding in the StreamReader constructor.
The string returned from reader.ReadLine() will already have been decoded using CP850.
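For context, a small sketch of what the decoded string holds. .NET strings store UTF-16 code units, so for non-ASCII characters the numeric value of a char is its Unicode value, which can differ from the byte the character had in the CP850 file. The names DecodedValueDemo and UnicodeValueOfCp850Byte are hypothetical, and the RegisterProvider call assumes .NET Core or .NET 5+.

```csharp
using System;
using System.Text;

static class DecodedValueDemo
{
    // Decodes a single CP850 byte and returns the numeric value of the
    // resulting char, i.e. its UTF-16 code unit.
    public static int UnicodeValueOfCp850Byte(byte b)
    {
        // Assumption: legacy code pages need registering on this runtime.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        string s = Encoding.GetEncoding(850).GetString(new[] { b });
        return s[0];
    }
}
```

As an example, byte 0x82 (130) in CP850 is 'é', and after decoding the char has the value 233 (U+00E9). ASCII characters such as 'A' keep the same value, 65, on both sides.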