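Read a text file from a specific position up to a specific length
Because I received a very bad data file, I have to come up with code that reads a non-delimited text file from a specific starting position for a specific length, in order to build a workable dataset. The text file is not delimited in any way, but I do have the starting and ending position of each string that I need to read. I've come up with the code below, but I'm getting an error and can't figure out why, because if I replace the 395 with a 0 it works.
For example: invoice number starting position = 395, ending position = 414, length = 20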
using (StreamReader sr = new StreamReader(@"\\t.txt"))
{
    char[] c = null;
    while (sr.Peek() >= 0)
    {
        c = new char[20];//Invoice number string
        sr.Read(c, 395, c.Length); //THIS IS GIVING ME AN ERROR
        Debug.WriteLine(""+c[0] + c[1] + c[2] + c[3] + c[4]..c[20]);
    }
}
This is the error I get:
System.ArgumentException: Offset and length were out of bounds for the array
or count is greater than the number of elements from
index to the end of the source collection. at
System.IO.StreamReader.Read(Char[] b
Please note: Seek() is too low-level for what the OP wants. See this answer instead for line-by-line parsing.
Also, as Jordan mentioned, Seek() has the issue of character encodings and varying character sizes (e.g. for non-ASCII and non-ANSI files, such as UTF-encoded ones, which is probably not applicable to this question). Thanks for pointing that out.
Original answer
Seek() is only available on a stream, so try using sr.BaseStream.Seek(..), or use a different stream like such:
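The code that originally followed this answer is not preserved here. As a rough sketch of the stream-level approach it describes, the example below seeks a FileStream to a byte offset and decodes a fixed number of bytes. The class name FixedPositionReader and the method ReadAt are hypothetical, and the sketch assumes a single-byte encoding (ASCII), so that byte offsets and character offsets coincide.

```csharp
using System;
using System.IO;
using System.Text;

static class FixedPositionReader
{
    // Hypothetical helper: reads `length` bytes starting at byte `offset`
    // and decodes them as ASCII.
    // Assumption: single-byte encoding, so byte offset == character offset.
    public static string ReadAt(string path, long offset, int length)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            fs.Seek(offset, SeekOrigin.Begin);

            byte[] buffer = new byte[length];
            int total = 0;
            // Read may return fewer bytes than requested, so loop until
            // the buffer is full or the end of the file is reached.
            while (total < length)
            {
                int n = fs.Read(buffer, total, length - total);
                if (n == 0) break;
                total += n;
            }
            return Encoding.ASCII.GetString(buffer, 0, total);
        }
    }
}
```

A call such as FixedPositionReader.ReadAt(path, 395, 20) would return the 20 characters at positions 395 to 414 of the file as a whole, not of each line, which is why this approach fits poorly when every line carries its own record.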
Here is my suggestion for you:
(new answer based on comments)
You are parsing invoice data, with each entry on a new line, and the required data is at a fixed offset in every line. Stream.Seek() is too low-level for what you want to do, because you would need several seeks, one for every line. Rather use the following:
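The answer's original code is missing from this copy. A minimal sketch of line-by-line parsing, under the assumption that every record sits on its own line and the field occupies the same character range in each, might look like this. InvoiceLineParser and ReadField are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class InvoiceLineParser
{
    // Hypothetical helper: yields the fixed-width field found at
    // [start, start + length) in each line of the reader.
    public static IEnumerable<string> ReadField(TextReader reader, int start, int length)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Skip lines that are too short to contain the whole field.
            if (line.Length < start + length) continue;
            yield return line.Substring(start, length);
        }
    }
}
```

Passing a StreamReader opened on the data file together with start 395 and length 20 would produce one invoice number per line. Because ReadLine works on decoded characters, this avoids the byte-versus-character issue that Seek() has.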
Solved this ages ago; I just wanted to post the solution that was suggested.
I've created a class called AdvancedStreamReader in my Helpers project on GitHub here: https://github.com/jsmunroe/Helpers/blob/master/Helpers/IO/AdvancedStreamReader.cs
It is fairly robust. It is a subclass of StreamReader and keeps all of that functionality intact. There are a few caveats: a) it resets the position of the stream when it is constructed; b) you should not seek the BaseStream while you are using the reader; c) you need to specify the newline character type if it differs from the environment, and the file can only use one type. Here are some unit tests to demonstrate how it is used.
I wrote this for my own use, but I hope it will help other people.
395 is the index in the c array at which you start writing, not a position in the file. The array has no index 395; its maximum index is 19.
I would suggest something like this.
And then use
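The snippets that belonged to this answer were lost, so what follows is only one possible reading of the fix, not the answerer's actual code. It keeps the buffer index at 0 and reaches position 395 by consuming and discarding the preceding characters. ReaderSlice and ReadChars are hypothetical names. ReadBlock is used instead of Read because it keeps reading until the requested count is filled or the input ends.

```csharp
using System;
using System.IO;

static class ReaderSlice
{
    // Hypothetical helper: returns `length` characters starting at
    // character position `start` of the reader's remaining input.
    public static string ReadChars(TextReader reader, int start, int length)
    {
        // A reader cannot seek by character, so consume and discard
        // the first `start` characters.
        char[] skip = new char[start];
        int skipped = reader.ReadBlock(skip, 0, start);
        if (skipped < start) return string.Empty; // input ended before the field

        char[] c = new char[length];
        // The second argument is the index into `c`, so it must be 0 here.
        int read = reader.ReadBlock(c, 0, c.Length);
        return new string(c, 0, read);
    }
}
```

With the question's numbers, ReadChars(sr, 395, 20) fills c[0] through c[19] with the characters at file positions 395 to 414, and new string(c, 0, read) replaces the manual concatenation of c[0] + c[1] + ... in the original code.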