Avoiding the "end of file" error

Posted on 2024-08-09 11:44:45

I'm trying to import a tab-delimited file into a table.

The issue is that the file sometimes includes an awkward record containing two "null values", which causes my program to throw an "unexpected end of file" error.

For example, each record should have 20 fields, but the last record has only two (two null values), hence the unexpected EOF.

Currently I'm using a StreamReader.

I've tried counting the lines and telling bcp to stop reading before the "phantom nulls", but StreamReader reports an incorrect line count because of them.

I've tried the following code (borrowed off the net) to get rid of the bogus rows, but it just replaces the fields with empty spaces. I'd like the result to contain no such line at all.

Public Sub RemoveBlankRowsFromCVSFile2(ByVal filepath As String)
    If filepath = DBNull.Value.ToString() Or filepath.Length = 0 Then Throw New ArgumentNullException("filepath")

    If (File.Exists(filepath) = False) Then Throw New FileNotFoundException("Could not find CSV file.", filepath)


    Dim tempFile As String = Path.GetTempFileName()

    Using reader As New StreamReader(filepath)
        Using writer As New StreamWriter(tempFile)
            Dim line As String = Nothing
            line = reader.ReadLine()
            While Not line Is Nothing

                If Not line.Equals(" ") Then writer.WriteLine(line)

                line = reader.ReadLine()
            End While
        End Using
    End Using


    File.Delete(filepath)
    File.Move(tempFile, filepath)
End Sub

I've also tried using SSIS, but it runs into the same unexpected EOF error.

What am I doing wrong?

Comments (5)

计㈡愣 2024-08-16 11:44:45

If you read the entire file into a string variable (using reader.ReadToEnd()), do you get the whole thing, or only the data up to those phantom nulls?

Have you tried using the reader.ReadBlock() function to read past the reported file length?
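
One way to answer that is to look at the raw bytes at the end of the file rather than at the decoded text. The following is only a minimal sketch and is not from the thread: the names TailInspector and TailAsHex are made up for illustration, and it assumes the file is small enough to load into memory with File.ReadAllBytes.

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module TailInspector
    ' Hypothetical helper: returns the last "count" bytes of a file
    ' as space-separated two-digit hex values.
    Public Function TailAsHex(filePath As String, count As Integer) As String
        ' Assumption: the file fits in memory.
        Dim bytes As Byte() = File.ReadAllBytes(filePath)
        Dim start As Integer = Math.Max(0, bytes.Length - count)
        Return String.Join(" ", bytes.Skip(start).Select(Function(b) b.ToString("X2")))
    End Function
End Module
```

Printing TailAsHex for the import file with a count of around 32 shows what actually follows the last good record. A run of 00 values there would suggest the "phantom nulls" are literal NUL bytes, although that is only a guess about this particular file.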

ま柒月 2024-08-16 11:44:45

At our company we do hundreds of imports every week. If a file is not sent in the correct, agreed-upon format for our automated process, we return it to the sender. If the last line is wrong, the file should not be processed, because it might be missing information or be corrupt in some other way.

飘然心甜 2024-08-16 11:44:45

One way to avoid the error is to use ReadAllLines and then process the resulting array of lines instead of stepping through the file. For files that fit in memory, this is also simpler than a StreamReader loop.

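' Requires Imports System.IO; "writer" is assumed to be an already-open StreamWriter.
Dim fileLines() As String = File.ReadAllLines("c:\tmp.csv")
...
For Each line As String In fileLines
    ' Skip lines that are empty or contain only whitespace.
    If line.Trim() <> "" Then writer.WriteLine(line)
Next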

You can also save the output lines in the same or a different string array and use File.WriteAllLines to write the file all at once.
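
Putting those two calls together, a minimal sketch might look like the following. It is not from the answer: the names TabFileCleaner, KeepCompleteRows and CleanFile are hypothetical, and instead of testing for blank lines it assumes every valid record has a fixed number of tab-separated fields (20 in the question) and drops any line that does not.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Imports Microsoft.VisualBasic

Module TabFileCleaner
    ' Hypothetical helper: keeps only the lines that split into exactly
    ' expectedFields tab-separated fields.
    Public Function KeepCompleteRows(lines As IEnumerable(Of String),
                                     expectedFields As Integer) As String()
        Return lines.Where(
            Function(l) l.Split(ControlChars.Tab).Length = expectedFields).ToArray()
    End Function

    ' Hypothetical helper: rewrites the file in place with the short lines removed.
    Public Sub CleanFile(filePath As String, expectedFields As Integer)
        Dim kept As String() = KeepCompleteRows(File.ReadAllLines(filePath), expectedFields)
        File.WriteAllLines(filePath, kept)
    End Sub
End Module
```

Because CleanFile overwrites its input, it may be safer to write to a separate file while testing. Note that this filter also drops short lines in the middle of the file, which may or may not be what you want.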

遥远的绿洲 2024-08-16 11:44:45

You could try the built-in .NET class for reading tab-delimited files: Microsoft.VisualBasic.FileIO.TextFieldParser.
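
A rough sketch of how that class could be used here, assuming tab-delimited input without quoted fields. The names TabFileReader and ReadCompleteRows are hypothetical, and skipping rows with the wrong field count is an assumption based on the 20-field description in the question, not something this answer states.

```vbnet
Imports System.Collections.Generic
Imports Microsoft.VisualBasic
Imports Microsoft.VisualBasic.FileIO

Module TabFileReader
    ' Hypothetical helper: parses a tab-delimited file and returns only
    ' the rows that have exactly expectedFields fields.
    Public Function ReadCompleteRows(filePath As String,
                                     expectedFields As Integer) As List(Of String())
        Dim rows As New List(Of String())
        Using parser As New TextFieldParser(filePath)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(vbTab)
            ' Assumption: fields are not wrapped in quotes.
            parser.HasFieldsEnclosedInQuotes = False
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                ' Ignore short or over-long records such as a stray two-field row.
                If fields.Length = expectedFields Then rows.Add(fields)
            End While
        End Using
        Return rows
    End Function
End Module
```

The rows returned by ReadCompleteRows could then be written to a cleaned file for bcp or loaded directly. TextFieldParser skips blank lines on its own, so only the malformed record needs explicit handling.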

感受沵的脚步 2024-08-16 11:44:45

This was solved using a bit array, checking one bit at a time for the suspect bit.
