Avoiding the "end of file" error
I'm trying to import a tab-delimited file into a table.
The issue is that sometimes the file includes an awkward record that has two "null values" and causes my program to throw an "unexpected end of file" error.
For example, each record should have 20 fields, but the last record has only two fields (two null values), hence the unexpected EOF.
Currently I'm using a StreamReader.
I've tried counting the lines and telling bcp to stop reading before the "phantom nulls", but StreamReader gets an incorrect line count because of the "phantom nulls".
I've tried the following code (borrowed off the net) to get rid of all the bogus lines. But it just replaces the fields with empty spaces (I'd like a result with no such line left behind).
Imports System
Imports System.IO

Public Sub RemoveBlankRowsFromCVSFile2(ByVal filepath As String)
    ' Reject a missing path or a file that does not exist.
    If String.IsNullOrEmpty(filepath) Then Throw New ArgumentNullException("filepath")
    If Not File.Exists(filepath) Then Throw New FileNotFoundException("Could not find CSV file.", filepath)

    ' Copy the lines to keep into a temp file, then swap it in for the original.
    Dim tempFile As String = Path.GetTempFileName()
    Using reader As New StreamReader(filepath)
        Using writer As New StreamWriter(tempFile)
            Dim line As String = reader.ReadLine()
            While line IsNot Nothing
                ' Only a line that is exactly one space is skipped here;
                ' empty, tab-only, or two-field lines are still written out.
                If Not line.Equals(" ") Then writer.WriteLine(line)
                line = reader.ReadLine()
            End While
        End Using
    End Using
    File.Delete(filepath)
    File.Move(tempFile, filepath)
End Sub
I've also tried using SSIS, but it runs into the same unexpected EOF error.
What am I doing wrong?
Comments (5)
If you read the entire file into a string variable (using reader.ReadToEnd()), do you get the whole thing, or only the data up to those phantom nulls?
Have you tried using the Reader.ReadBlock() function to try to read past the file length?
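To make that check concrete, here is a minimal sketch that reads the whole file with ReadToEnd() and prints the character codes of its last few characters, so invisible characters (for example NUL, code 0) show up as numbers. The module name TailInspector, the function TailCharCodes, and the choice of 16 characters are made up for illustration; nothing here comes from the original poster's program.

```vbnet
Imports System
Imports System.IO

Module TailInspector
    ' Returns the character codes of the last `count` characters of the file.
    Public Function TailCharCodes(ByVal filepath As String, ByVal count As Integer) As Integer()
        Dim content As String
        Using reader As New StreamReader(filepath)
            content = reader.ReadToEnd()
        End Using

        ' Clamp the start index so files shorter than `count` still work.
        Dim start As Integer = Math.Max(0, content.Length - count)
        Dim codes(content.Length - start - 1) As Integer
        For i As Integer = start To content.Length - 1
            codes(i - start) = AscW(content(i))
        Next
        Return codes
    End Function

    Sub Main(ByVal args As String())
        ' args(0) is the path of the file to inspect (hypothetical usage).
        For Each code As Integer In TailCharCodes(args(0), 16)
            Console.Write("{0} ", code)
        Next
        Console.WriteLine()
    End Sub
End Module
```

If the tail shows codes other than 9 (tab), 13 and 10 (CR/LF), and printable text, that is a hint about what the "phantom nulls" really are.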
At our company we do hundreds of imports every week. If a file is not sent in the correct, agreed-upon format for our automated process, we return it to the sender. If the last line is wrong, the file should not be processed, because it might be missing information or be corrupt in some other way.
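A rough sketch of that kind of gate, assuming the 20-field, tab-delimited layout described in the question: it reports the first line whose field count is wrong so the caller can reject the file instead of importing it. ImportValidator, FirstBadLine, and ExpectedFieldCount are hypothetical names, and the plain Split call assumes fields never contain embedded tabs.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic

Module ImportValidator
    ' Assumption: the question says each record has 20 fields.
    Private Const ExpectedFieldCount As Integer = 20

    ' Returns the 1-based number of the first line whose field count is wrong,
    ' or 0 when every line has the expected number of tab-separated fields.
    Public Function FirstBadLine(ByVal filepath As String) As Integer
        Dim lineNumber As Integer = 0
        For Each line As String In File.ReadLines(filepath)
            lineNumber += 1
            If line.Split(ControlChars.Tab).Length <> ExpectedFieldCount Then
                Return lineNumber
            End If
        Next
        Return 0
    End Function
End Module
```

A caller could skip the bcp step and send the file back whenever FirstBadLine returns a non-zero value.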
One way to avoid the error is to use File.ReadAllLines, then process the array of lines instead of stepping through the file. This is also simpler than looping over a StreamReader, as long as the file fits in memory.
You can also save the output lines in the same or a different string array and use File.WriteAllLines to write the file all at once.
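Something along these lines, as a sketch only: it reads every line with File.ReadAllLines, keeps the lines that split into the expected number of tab-separated fields, and writes the result back with File.WriteAllLines. ShortRecordFilter, KeepCompleteRecords, and RemoveShortRecords are names invented for this example, and the field-count test is an assumption about what counts as a bogus record.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic

Module ShortRecordFilter
    ' Keeps only lines that split into exactly expectedFields tab-separated fields.
    Public Function KeepCompleteRecords(ByVal lines As String(), ByVal expectedFields As Integer) As String()
        Dim kept As New List(Of String)()
        For Each line As String In lines
            If line.Split(ControlChars.Tab).Length = expectedFields Then
                kept.Add(line)
            End If
        Next
        Return kept.ToArray()
    End Function

    ' Rewrites the file in place without the short or blank records.
    Public Sub RemoveShortRecords(ByVal filepath As String, ByVal expectedFields As Integer)
        Dim lines As String() = File.ReadAllLines(filepath)
        File.WriteAllLines(filepath, KeepCompleteRecords(lines, expectedFields))
    End Sub
End Module
```

Calling RemoveShortRecords(path, 20) before handing the file to bcp would drop the two-field trailer instead of blanking it.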
You could try the built-in .NET object for reading tab-delimited files. It is Microsoft.VisualBasic.FileIO.TextFieldParser.
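A minimal sketch of how that could look, assuming a plain tab-delimited file and that records with the wrong number of fields should simply be skipped. TabFileReader and ReadCompleteRecords are hypothetical names; TextFieldParser, SetDelimiters, EndOfData, and ReadFields are the real members. Note that ReadFields can throw MalformedLineException on lines it cannot parse, which this sketch does not handle.

```vbnet
Imports System.Collections.Generic
Imports Microsoft.VisualBasic
Imports Microsoft.VisualBasic.FileIO

Module TabFileReader
    ' Reads a tab-delimited file and returns only the records that have
    ' exactly expectedFields fields.
    Public Function ReadCompleteRecords(ByVal filepath As String, ByVal expectedFields As Integer) As List(Of String())
        Dim records As New List(Of String())()
        Using parser As New TextFieldParser(filepath)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(vbTab)
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                ' Skip short or oversized records instead of importing them.
                If fields IsNot Nothing AndAlso fields.Length = expectedFields Then
                    records.Add(fields)
                End If
            End While
        End Using
        Return records
    End Function
End Module
```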
This was solved using a bit array, checking one bit at a time for the suspect bit.