LINQ SubmitChanges runs out of memory

Posted 2024-10-04 14:17:27

I have a database with about 180,000 records, and I'm trying to attach a PDF file to each of them. Each PDF is about 250 KB. However, after about a minute my program is using roughly a GB of memory and I have to stop it. I tried dropping the reference to each LINQ object once it has been updated, but that doesn't seem to help. How can I get those references released?

Thanks for your help

Private Sub uploadPDFs(ByVal args() As String)
    Dim indexFiles = (From indexFile In dataContext.IndexFiles
                     Where indexFile.PDFContent = Nothing
                     Order By indexFile.PDFFolder).ToList
    Dim currentDirectory As IO.DirectoryInfo
    Dim currentFile As IO.FileInfo
    Dim tempIndexFile As IndexFile

    While indexFiles.Count > 0
        tempIndexFile = indexFiles(0)
        indexFiles = indexFiles.Skip(1).ToList
        currentDirectory = 'I set the directory that I need
        currentFile = 'I get the file that I need
        writePDF(currentDirectory, currentFile, tempIndexFile)
    End While
End Sub

Private Sub writePDF(ByVal directory As IO.DirectoryInfo, ByVal file As IO.FileInfo, ByVal indexFile As IndexFile)
    Dim bytes() As Byte
    bytes = getFileStream(file)
    indexFile.PDFContent = bytes
    dataContext.SubmitChanges()
    counter += 1
    If counter Mod 10 = 0 Then Console.WriteLine("     saved file " & file.Name & " at " & directory.Name)
End Sub


Private Function getFileStream(ByVal fileInfo As IO.FileInfo) As Byte()
    Dim fileStream = fileInfo.OpenRead()
    Dim bytesLength As Long = fileStream.Length
    Dim bytes(bytesLength) As Byte

    fileStream.Read(bytes, 0, bytesLength)
    fileStream.Close()

    Return bytes
End Function

Comments (3)

路弥 2024-10-11 14:17:27

I suggest you perform this in batches, using Take (before the call to ToList) to process a particular number of items at a time. Read (say) 10, set the PDFContent on all of them, call SubmitChanges, and then start again. (I'm not sure offhand whether you should start with a new DataContext at that point, but it might be cleanest to do so.)

As an aside, your code to read the contents of a file is broken in at least a couple of ways - but it would be simpler just to use File.ReadAllBytes in the first place.

Also, your way of handling the list gradually shrinking is really inefficient - after fetching 180,000 records, you're then building a new list with 179,999 records, then another with 179,998 records etc.

哥,最终变帅啦 2024-10-11 14:17:27

Does the DataContext have ObjectTrackingEnabled set to true (the default value)? If so, then it will try to keep a record of essentially all the data it touches, thus preventing the garbage collector from being able to collect any of it.

If so, you should be able to fix the situation by periodically disposing the DataContext and creating a new one, or turning object tracking off.
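
A rough sketch of the dispose-and-recreate route, with placeholders for anything not named in the question (YourDataContext, the ID column, pdfRecordIds as a pre-fetched list of primary keys, and GetPdfPath). Worth noting: with ObjectTrackingEnabled set to False a DataContext becomes read-only and SubmitChanges is no longer available, so for an update job like this one recycling the context is the option that applies.

Dim dc As New YourDataContext()
Dim processed As Integer = 0

For Each id In pdfRecordIds
    Dim record = dc.IndexFiles.Single(Function(f) f.ID = id)
    record.PDFContent = IO.File.ReadAllBytes(GetPdfPath(record))
    dc.SubmitChanges()

    processed += 1
    If processed Mod 100 = 0 Then
        dc.Dispose()               ' releases everything tracked so far
        dc = New YourDataContext()
    End If
Next
dc.Dispose()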

乖乖公主 2024-10-11 14:17:27

OK. To use the smallest amount of memory, we have to update the DataContext in blocks. I've put sample code below; it might have syntax errors since I typed it in Notepad.

Dim DB As New YourDataContext()
Dim BlockSize As Integer = 25

' Materialise the matching records once so that Count and indexing work against a list.
Dim AllItems = DB.Items.Where(Function(i) i.PDFfile.HasValue = False).ToList()

Dim count = 0
Dim tmpDB As New YourDataContext()

While count < AllItems.Count

    ' Re-fetch the record through the throwaway context so only the current block is tracked.
    Dim _item = tmpDB.Items.Single(Function(i) i.recordID = AllItems(count).recordID)
    _item.PDF = GetPDF()

    count += 1

    If count Mod BlockSize = 0 OrElse count = AllItems.Count Then
        tmpDB.SubmitChanges()
        tmpDB = New YourDataContext()
        GC.Collect()
    End If

End While

To further optimise the speed, you can get the recordIDs from AllItems into an array as an anonymous type, and turn DelayLoading on for that PDF field.
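
A sketch of that ID-only fetch, using the same placeholder names as the code above (here it selects just the key rather than an anonymous type). The delay-load part is typically configured via the "Delay Loaded" property on the PDF column in the LINQ to SQL (DBML) designer rather than in code.

' Pull only the keys so the full rows never have to be kept in memory.
Dim recordIds = (From i In DB.Items
                 Where Not i.PDFfile.HasValue
                 Select i.recordID).ToArray()

The loop can then look up each record with recordIds(count) instead of holding the full AllItems list.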
