LINQ SubmitChanges runs out of memory

Posted 2024-10-04 14:17:27

I have a database with about 180,000 records, and I'm trying to attach a PDF file to each of them. Each PDF is about 250 KB. However, after about a minute my program is using roughly a GB of memory and I have to stop it. I tried dropping the reference to each LINQ object once it has been updated, but that doesn't seem to help. How can I get those references released?

Thanks for your help

Private Sub uploadPDFs(ByVal args() As String)
    Dim indexFiles = (From indexFile In dataContext.IndexFiles
                     Where indexFile.PDFContent = Nothing
                     Order By indexFile.PDFFolder).ToList
    Dim currentDirectory As IO.DirectoryInfo
    Dim currentFile As IO.FileInfo
    Dim tempIndexFile As IndexFile

    While indexFiles.Count > 0
        tempIndexFile = indexFiles(0)
        indexFiles = indexFiles.Skip(1).ToList
        currentDirectory = 'I set the directory that I need
        currentFile = 'I get the file that I need
        writePDF(currentDirectory, currentFile, tempIndexFile)
    End While
End Sub

Private Sub writePDF(ByVal directory As IO.DirectoryInfo, ByVal file As IO.FileInfo, ByVal indexFile As IndexFile)
    Dim bytes() As Byte
    bytes = getFileStream(file)
    indexFile.PDFContent = bytes
    dataContext.SubmitChanges()
    counter += 1
    If counter Mod 10 = 0 Then Console.WriteLine("     saved file " & file.Name & " at " & directory.Name)
End Sub


Private Function getFileStream(ByVal fileInfo As IO.FileInfo) As Byte()
    Dim fileStream = fileInfo.OpenRead()
    Dim bytesLength As Long = fileStream.Length
    Dim bytes(bytesLength) As Byte

    fileStream.Read(bytes, 0, bytesLength)
    fileStream.Close()

    Return bytes
End Function

Comments (3)

路弥 2024-10-11 14:17:27

I suggest you perform this in batches, using Take (before the call to ToList) to process a particular number of items at a time. Read (say) 10, set the PDFContent on all of them, call SubmitChanges, and then start again. (I'm not sure offhand whether you should start with a new DataContext at that point, but it might be cleanest to do so.)

As an aside, your code to read the contents of a file is broken in at least a couple of ways - but it would be simpler just to use File.ReadAllBytes in the first place.

Also, your way of handling the list gradually shrinking is really inefficient - after fetching 180,000 records, you're then building a new list with 179,999 records, then another with 179,998 records etc.

哥,最终变帅啦 2024-10-11 14:17:27

Does the DataContext have ObjectTrackingEnabled set to true (the default value)? If so, then it will try to keep a record of essentially all the data it touches, thus preventing the garbage collector from being able to collect any of it.

If so, you should be able to fix the situation by periodically disposing the DataContext and creating a new one, or turning object tracking off.
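
A rough sketch of the dispose-and-recreate route, with placeholders for anything not named in the question (YourDataContext, the ID column, pdfRecordIds as a pre-fetched list of primary keys, and GetPdfPath). Worth noting: with ObjectTrackingEnabled set to False a DataContext becomes read-only and SubmitChanges is no longer available, so for an update job like this one recycling the context is the option that applies.

Dim dc As New YourDataContext()
Dim processed As Integer = 0

For Each id In pdfRecordIds
    Dim record = dc.IndexFiles.Single(Function(f) f.ID = id)
    record.PDFContent = IO.File.ReadAllBytes(GetPdfPath(record))
    dc.SubmitChanges()

    processed += 1
    If processed Mod 100 = 0 Then
        dc.Dispose()               ' releases everything tracked so far
        dc = New YourDataContext()
    End If
Next
dc.Dispose()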

乖乖公主 2024-10-11 14:17:27

OK. To use the smallest amount of memory, we have to update the DataContext in blocks. I've put sample code below; it might have syntax errors since I typed it in Notepad.

Dim DB As New YourDataContext()
Dim BlockSize As Integer = 25

' Materialise the matching records once so that Count and indexing work against a list.
Dim AllItems = DB.Items.Where(Function(i) i.PDFfile.HasValue = False).ToList()

Dim count = 0
Dim tmpDB As New YourDataContext()

While count < AllItems.Count

    ' Re-fetch the record through the throwaway context so only the current block is tracked.
    Dim _item = tmpDB.Items.Single(Function(i) i.recordID = AllItems(count).recordID)
    _item.PDF = GetPDF()

    count += 1

    If count Mod BlockSize = 0 OrElse count = AllItems.Count Then
        tmpDB.SubmitChanges()
        tmpDB = New YourDataContext()
        GC.Collect()
    End If

End While

To further optimise the speed, you can get the recordIDs from AllItems into an array as an anonymous type, and turn DelayLoading on for that PDF field.
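
A sketch of that ID-only fetch, using the same placeholder names as the code above (here it selects just the key rather than an anonymous type). The delay-load part is typically configured via the "Delay Loaded" property on the PDF column in the LINQ to SQL (DBML) designer rather than in code.

' Pull only the keys so the full rows never have to be kept in memory.
Dim recordIds = (From i In DB.Items
                 Where Not i.PDFfile.HasValue
                 Select i.recordID).ToArray()

The loop can then look up each record with recordIds(count) instead of holding the full AllItems list.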
