Tips for working with large strings and memory usage
How can I force a shrink of a DataTable and/or List so that I can free memory efficiently? I am currently removing the processed row from the DataSet every iteration of the loop, but I'm not sure if the memory is being released.
for (int i = m_TotalNumberOfLocalRows - 1; i >= 0; i--)
{
    dr = dt.Rows[i];
    // Do stuff
    dt.Rows.Remove(dr);
}
If this doesn't shrink the DataTable's footprint in memory, I can use a List. I'm only using the DataTable as a store for the DataRows, so I could use any collection or storage mechanism that is low on memory and able to release memory every x iterations.
Thank you.
Edit:
After reading What Are Some Good .NET Profilers? and doing some memory profiling, I've discovered that the main consumers of memory are strings.
We produce a lot of user output during this process, and memory consumption cycles between approximately 170MB and 230MB, peaking at around 300MB. I'm using a StringBuilder with an initial capacity of 20971520 to hold the output/log of what is happening, and after one percent of the total number of records has been processed I set a DevExpress MemoEdit control's Text property to StringBuilder.ToString(). I've found this to be quicker than appending StringBuilder.ToString() to MemoEdit.Text (obviously the logic w.r.t. the StringBuilder differs between appending to and setting MemoEdit.Text).
I've also found that, instead of recreating the StringBuilder(20971520), it's easier on memory and quicker to execute to just call StringBuilder.Remove(0, StringBuilder.Length).
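The reuse described above could look roughly like the following. This is a minimal sketch, not the asker's actual code: the class name `LogBufferSketch`, the helper `FlushBatch`, and the sample log lines are all hypothetical, and a small capacity stands in for the 20971520 used in the question.

```csharp
using System;
using System.Text;

public static class LogBufferSketch
{
    // Hypothetical helper: appends one batch of log lines, returns the text
    // to display, and empties the buffer so the same instance can be reused.
    public static string FlushBatch(StringBuilder buffer, string[] lines)
    {
        foreach (string line in lines)
        {
            buffer.AppendLine(line);
        }

        // ToString() copies the current content into a new string.
        string text = buffer.ToString();

        // Empty the buffer in place instead of allocating a new StringBuilder.
        buffer.Remove(0, buffer.Length);
        return text;
    }

    public static void Main()
    {
        // The question uses an initial capacity of 20971520; 1024 keeps the demo small.
        var buffer = new StringBuilder(1024);

        string first = FlushBatch(buffer, new[] { "record 1 processed", "record 2 processed" });
        string second = FlushBatch(buffer, new[] { "record 3 processed" });

        Console.Write(first);
        Console.Write(second);
        Console.WriteLine("Buffer length after flush: " + buffer.Length);
    }
}
```

Note that every call to ToString() still produces a full copy of the accumulated text, so a multi-megabyte log yields a multi-megabyte string on each refresh regardless of how the builder itself is reused.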
Are there any tips that you could share to improve performance when working with large strings (the log file that is written out is approximately 12.2MB for about 30 000 records)?
Note: I have changed the title of the question and the tags.
Old title: How can I Force a Shrink of a DataTable and/or a List to Release Memory?
Old tags: list datatable c# .net memory
Comments (4)
Unless you are having a problem with memory, don't try to free it manually by calling the garbage collector. The runtime will handle it for you, and 99% of the time it will be more efficient than you would be by trying to guess the optimal moment.
What you have to remember is that when you call GC.Collect(), it runs against all the generations of the garbage collector and "tidies up" all objects that need to be freed. You will most likely be spending processor time on something that doesn't need to be done at that point in time.
If you absolutely have to, the command is
GC.Collect()
http://msdn.microsoft.com/en-us/library/xe0c2357.aspx
http://www.developer.com/net/csharp/article.php/3343191/C-Tip-Forcing-Garbage-Collection-in-NET.htm
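Before forcing anything, it can help to check what the runtime is already doing. The following is a minimal sketch using GC.CollectionCount, GC.MaxGeneration and GC.GetTotalMemory; the class name `GcObservationSketch`, the helper `CountCollectionsDuring`, and the string-allocating workload are hypothetical stand-ins for the real processing loop.

```csharp
using System;

public static class GcObservationSketch
{
    // Returns how many collections of each generation ran while `work` executed.
    public static int[] CountCollectionsDuring(Action work)
    {
        int generations = GC.MaxGeneration + 1;
        int[] before = new int[generations];
        for (int g = 0; g < generations; g++)
        {
            before[g] = GC.CollectionCount(g);
        }

        work();

        int[] delta = new int[generations];
        for (int g = 0; g < generations; g++)
        {
            delta[g] = GC.CollectionCount(g) - before[g];
        }
        return delta;
    }

    public static void Main()
    {
        int[] counts = CountCollectionsDuring(() =>
        {
            // Hypothetical workload: allocate many short-lived strings.
            for (int i = 0; i < 100000; i++)
            {
                string s = new string('x', 100);
                if (s.Length != 100) throw new InvalidOperationException();
            }
        });

        for (int g = 0; g < counts.Length; g++)
        {
            Console.WriteLine("Generation " + g + ": " + counts[g] + " collection(s)");
        }

        // Passing false reports the current estimate without forcing a collection.
        Console.WriteLine("Managed bytes in use: " + GC.GetTotalMemory(false));
    }
}
```

If the counts show the collector already running during the loop, an explicit GC.Collect() is unlikely to add much.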
Try forcing a garbage collector pass by calling GC.Collect().
If you want to make sure that all objects are finalized before your code continues, wait for pending finalizers right after GC.Collect().
EDIT: As people mentioned in the comments below, it is widely considered bad practice to call the garbage collector directly. Nevertheless, this should achieve the goal of freeing the unused memory of your deleted rows.
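A minimal sketch of that sequence follows. The code snippets in this answer did not survive, so GC.WaitForPendingFinalizers() is an assumption about which call was meant; the class name `ForcedCollectionSketch`, the helper `CollectAndMeasure`, and the second GC.Collect() are additions for illustration and are not part of the answer.

```csharp
using System;

public static class ForcedCollectionSketch
{
    // Forces a full collection, waits for finalizers, and returns the
    // managed heap size (in bytes) reported afterwards.
    public static long CollectAndMeasure()
    {
        // Collect all generations.
        GC.Collect();

        // Block until the finalizer thread has drained its queue
        // (assumed to be the call the answer refers to).
        GC.WaitForPendingFinalizers();

        // Optional addition: reclaim objects whose finalizers have just run.
        GC.Collect();

        return GC.GetTotalMemory(false);
    }

    public static void Main()
    {
        long before = GC.GetTotalMemory(false);
        long after = CollectAndMeasure();
        Console.WriteLine("Managed bytes before: " + before);
        Console.WriteLine("Managed bytes after:  " + after);
    }
}
```

Comparing the two numbers only shows what the managed heap reports; the process working set seen in Task Manager may not shrink at the same time.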
Stick to the DataTable and remove unnecessary rows as in your example.
By doing this you can't control memory usage: this is done by the CLR Garbage Collector.
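For reference, a self-contained version of the removal loop from the question might look like this. It is only a sketch: the class name `RowRemovalSketch`, the helper `ProcessAndRemoveAll`, and the "Id"/"Payload" columns are hypothetical, and the real per-row work is reduced to a counter.

```csharp
using System;
using System.Data;

public static class RowRemovalSketch
{
    // Walks the table backwards, "processes" each row, then removes it so the
    // table no longer references the row and the collector can reclaim it.
    public static int ProcessAndRemoveAll(DataTable table)
    {
        int processed = 0;
        for (int i = table.Rows.Count - 1; i >= 0; i--)
        {
            DataRow row = table.Rows[i];

            // Placeholder for the real per-row work ("Do stuff" in the question).
            processed++;

            table.Rows.Remove(row);
        }
        return processed;
    }

    public static void Main()
    {
        var table = new DataTable("Records");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Payload", typeof(string));
        for (int i = 0; i < 5; i++)
        {
            table.Rows.Add(i, "payload " + i);
        }

        int processed = ProcessAndRemoveAll(table);
        Console.WriteLine("Processed: " + processed + ", rows left: " + table.Rows.Count);
    }
}
```

Removing a row only drops the table's reference to it; when the memory is actually reclaimed remains up to the garbage collector, as noted above.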
Do you have an explicit need to manage this directly? The garbage collector manages this for you.