VB.Net 替换大型文本文件中的特定值

发布于 2024-11-12 21:57:40 字数 972 浏览 6 评论 0原文

我有一些大型 csv 文件(每个 1.5gb),我需要在其中替换特定值。我目前使用的方法非常慢,我相当确定应该有一种方法可以加快速度,但我只是没有足够的经验来知道我应该做什么。这是我的第一篇文章,我尝试搜索相关内容,但没有找到任何内容。任何帮助将不胜感激。

我的另一个想法是将文件分成块,以便我可以将整个文件读入内存,在那里进行所有替换,然后输出到合并文件。我尝试了这个,但我的方法实际上看起来比我当前的方法慢。

谢谢!

    Sub Main()
    Dim fName As String = "2009.csv"
    Dim wrtFile As String = "2009.1.csv"
    Dim lRead
    Dim lwrite As String
    Dim strRead As New System.IO.StreamReader(fName)
    Dim strWrite As New System.IO.StreamWriter(wrtFile)
    Dim bulkWrite As String

    bulkWrite = ""
    Do While strRead.Peek <> -1
        lRead = Split(strRead.ReadLine(), ",")
        If lRead(9) = "5MM+" Then lRead(9) = "5000000"
        If lRead(9) = "1MM+" Then lRead(9) = "1000000"

        lwrite = ""
        For i = LBound(lRead) To UBound(lRead)
            lwrite = lwrite & lRead(i) & ","
        Next
        strWrite.WriteLine(lwrite)
     Loop

    strRead.Close()
    strWrite.Close()
End Sub

I have some large csv files (1.5gb each) where I need to replace specific values. The method I'm currently using is terribly slow and I'm fairly certain that there should be a way to speed this up but I'm just not experienced enough to know what I should be doing. This is my first post and I tried searching through to find something relevant but didn't come across anything. Any help would be appreciated.

My other thought would be to break the file into chunks so that I can read the entire thing into memory, do all of the replacements there and then output to a consolidated file. I tried this but the way I did it actually ended up seeming slower than my current method.

Thanks!

    Sub Main()
    Dim fName As String = "2009.csv"
    Dim wrtFile As String = "2009.1.csv"
    Dim lRead
    Dim lwrite As String
    Dim strRead As New System.IO.StreamReader(fName)
    Dim strWrite As New System.IO.StreamWriter(wrtFile)
    Dim bulkWrite As String

    bulkWrite = ""
    Do While strRead.Peek <> -1
        lRead = Split(strRead.ReadLine(), ",")
        If lRead(9) = "5MM+" Then lRead(9) = "5000000"
        If lRead(9) = "1MM+" Then lRead(9) = "1000000"

        lwrite = ""
        For i = LBound(lRead) To UBound(lRead)
            lwrite = lwrite & lRead(i) & ","
        Next
        strWrite.WriteLine(lwrite)
     Loop

    strRead.Close()
    strWrite.Close()
End Sub

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

故人爱我别走 2024-11-19 21:57:43

您正在分裂和合并,这可能需要一些时间。

为什么不直接阅读文本行呢?然后用适当的值替换任何出现的“5MM+”和“1MM+”,然后写入该行。

 Do While ...
    s = strRead.ReadLine();
    s = s.Replace("5MM+", "5000000")
    s = s.Replace("1MM+", "1000000")
    strWrite(s);
 Loop

You are splitting and the combining, which can take some time.

Why not just read the line of text. Then replace any occurance of "5MM+" and "1MM+" with the approiate value and then write the line.

 Do While ...
    s = strRead.ReadLine();
    s = s.Replace("5MM+", "5000000")
    s = s.Replace("1MM+", "1000000")
    strWrite(s);
 Loop
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文