Is it bad practice to depend on the .NET automated garbage collector?
It's possible to create lots of memory-intensive objects and then abandon references to them. For example, I might want to download and operate on some data from a database, and I will do 100 separate download and processing iterations. I could declare a DataTable variable once, and for each query reset it to a new DataTable object using a constructor, abandoning the old DataTable object in memory.
The DataTable class has easy built-in ways to release the memory it uses, including Rows.Clear() and .Dispose(). So I could do this at the end of every iteration before setting the variable to a new DataTable object. Or I could forget about it and just let the CLR garbage collector do this for me. The garbage collector seems to be pretty effective, so the end result should be the same either way. Is it "better" to explicitly dispose of memory-heavy objects when you don't need them (but add code to do this), or just depend on the garbage collector to do all the work for you (you are at the mercy of the GC algorithm, but your code is smaller)?
Upon request, here is code illustrating the recycled DataTable variable example:
// queryList is a list of 100 SELECT queries generated somewhere else.
// Each of them returns a million rows with 10 columns.
List<string> queryList = GetQueries(@"\\someserver\bunch-o-queries.txt");
DataTable workingTable;
using (OdbcConnection con = new OdbcConnection("a connection string")) {
    using (OdbcDataAdapter adpt = new OdbcDataAdapter("", con)) {
        foreach (string sql in queryList) {
            workingTable = new DataTable(); // A new table is created. Previous one is abandoned
            adpt.SelectCommand.CommandText = sql;
            adpt.Fill(workingTable);
            CalcRankingInfo(workingTable);
            PushResultsToAnotherDatabase(workingTable);
            // Here I could call workingTable.Dispose() or workingTable.Rows.Clear()
            // or I could do nothing and hope the garbage collector cleans up my
            // enormous DataTable automatically.
        }
    }
}
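For comparison, one way to make the explicit option concrete is to scope each table to a single iteration with a using block. This is only a minimal sketch, not the asker's code: LoadTable is a hypothetical stand-in for adpt.Fill so that the example runs without a database, and TableLoop and ProcessAll are invented names. Whether this changes memory behaviour in practice is exactly what the answers below debate.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TableLoop
{
    // Hypothetical stand-in for adpt.Fill(workingTable): builds a small
    // in-memory table so the sketch runs without a database.
    public static DataTable LoadTable(string sql)
    {
        var table = new DataTable();
        table.Columns.Add("Query", typeof(string));
        table.Columns.Add("Value", typeof(int));
        for (int i = 0; i < 3; i++)
        {
            table.Rows.Add(sql, i);
        }
        return table;
    }

    // Processes each query with a table scoped to one iteration.
    // Returns the total number of rows seen across all queries.
    public static int ProcessAll(IEnumerable<string> queries)
    {
        int totalRows = 0;
        foreach (string sql in queries)
        {
            // The using block calls Dispose() when the iteration ends,
            // even if the processing code throws.
            using (DataTable workingTable = LoadTable(sql))
            {
                totalRows += workingTable.Rows.Count;
            }
        }
        return totalRows;
    }
}
```

Declaring the variable inside the loop also means no reference to the previous table outlives its iteration.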
Comments (3)
@Justin
It is a horrible practice to rely on the GC to clean up for you. It's unfortunate that you are recommending that. Doing so can very likely lead you down the path of having a memory leak, and yes, there are at least 22 ways you can "leak memory" in .NET. I've worked with a large number of clients diagnosing both managed and unmanaged memory leaks, providing solutions to them, and have presented at multiple .NET user groups on advanced GC internals and how memory management works from the inside of the GC and CLR.
@OP:
You should call Dispose() on the DataTable and explicitly set it to null at the end of the loop. This explicitly tells the GC that you are done with it and there are no more rooted references to it. The DataTable is being placed on the LOH because of its large size. Not doing this can easily fragment your LOH, resulting in an OutOfMemoryException. Remember that the LOH is not compacted by default!
For additional details, please refer to my answer at "What happens if I don't call Dispose on the pen object?"
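The LOH behaviour described above can be observed from code. The following is a rough sketch that assumes the default large-object threshold of 85,000 bytes; LohDemo and its method names are made up for illustration. GC.GetGeneration reports LOH objects as belonging to the highest generation, and GCSettings.LargeObjectHeapCompactionMode (available from .NET Framework 4.5.1 onward) lets an application request a one-off LOH compaction.

```csharp
using System;
using System.Runtime;

public static class LohDemo
{
    // Returns true when the runtime reports the array as living in the
    // highest generation, which is how large-object-heap allocations appear.
    public static bool IsReportedAsLargeObject(byte[] buffer)
    {
        return GC.GetGeneration(buffer) == GC.MaxGeneration;
    }

    // Requests a one-time LOH compaction during the next blocking full GC.
    // Requires .NET Framework 4.5.1 or later (or .NET Core / .NET 5+).
    public static void CompactLargeObjectHeapOnce()
    {
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}
```

Forcing a collection like this is expensive, so it is better suited to diagnosing fragmentation than to running inside a processing loop.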
@Henk - There is a relationship between IDisposable and memory management; IDisposable allows for a semi-explicit release of resources (if implemented correctly). And resources always have some sort of managed and typically unmanaged memory associated with them.
A couple of things to note about Dispose() and IDisposable here:
IDisposable provides for disposal of both Managed and Unmanaged memory. Disposal of Unmanaged memory should be done in the Dispose Method and you should provide a Finalizer for your IDisposable implementation.
The GC does not call Dispose for you.
If you don't call Dispose(), the GC sends the object to the Finalization queue, and ultimately to the f-reachable queue. Finalization makes an object survive 2 collections, which means it will be promoted to Gen1 if it was in Gen0, and to Gen2 if it was in Gen1. In your case, the object is on the LOH, so it survives until a full GC (all generations plus the LOH) is performed twice. Under a "healthy" .NET app, a full collection is performed approximately once in every 100 collections. Since your implementation puts a lot of pressure on the LOH and the GC, full GCs will fire more often. This is undesirable for performance reasons, since full GCs take much more time to complete. There is also a dependency on what kind of GC you're running under and whether you are using LatencyModes (be very careful with this). Even if you're running Background GC (which replaced Concurrent GC in CLR 4.0), the ephemeral collections (Gen0 and Gen1) still block/suspend threads, which means no allocations can be performed during that time. You can use PerfMon to monitor the memory utilization and GC activity of your app. Please note that the GC counters are updated only after a GC has taken place. For additional info on versions of GC, see my response to "Determining which garbage collector is running".
Dispose() immediately releases the resources associated with your object. Yes, GC is non-deterministic, but calling Dispose() does not trigger a GC!
Dispose() lets the GC know that you are done with this object and its memory can be reclaimed at the next collection for the generation where that object lives. If the object lives in Gen2 or on the LOH, that memory will not be reclaimed if either a Gen0 or Gen1 collection takes place!
The Finalizer runs on 1 thread (regardless of the version of GC being used and the number of logical processors on the machine). If you stick a lot in the Finalization and f-reachable queues, you only have 1 thread processing everything ready for Finalization; your performance goes you know where...
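Alongside PerfMon, the collection counts and GC configuration mentioned above can also be read in-process. This is a small sketch using GC.CollectionCount and GCSettings; the GcSnapshot class and its method names are invented for this example.

```csharp
using System;
using System.Runtime;

public static class GcSnapshot
{
    // Builds a one-line summary of collection counts per generation and
    // the current GC configuration.
    public static string Describe()
    {
        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2); // full collections, which also cover the LOH
        return $"gen0={gen0} gen1={gen1} gen2={gen2} " +
               $"server={GCSettings.IsServerGC} latency={GCSettings.LatencyMode}";
    }

    // Number of full (gen2) collections that occur while running the action.
    public static int FullCollectionsDuring(Action action)
    {
        int before = GC.CollectionCount(2);
        action();
        return GC.CollectionCount(2) - before;
    }
}
```

Wrapping the 100-query loop in FullCollectionsDuring would give a rough idea of how often full collections fire with and without explicit disposal.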
For info on how to properly implement IDisposable, please refer to my blog post: "How do you properly implement the IDisposable pattern?"
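For readers who cannot reach that post, the commonly documented shape of the pattern looks roughly like the following. It is a sketch of the standard approach for a sealed class that owns unmanaged memory, not necessarily what the linked article recommends; NativeBuffer and its use of Marshal.AllocHGlobal are illustrative choices.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class NativeBuffer : IDisposable
{
    private IntPtr _handle;   // unmanaged memory owned by this object
    private bool _disposed;

    public NativeBuffer(int size)
    {
        _handle = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed => _disposed;

    public void Dispose()
    {
        ReleaseUnmanaged();
        // Cleanup is already done, so the finalizer does not need to run.
        GC.SuppressFinalize(this);
    }

    // Safety net: runs on the finalizer thread if Dispose() was never called.
    ~NativeBuffer()
    {
        ReleaseUnmanaged();
    }

    private void ReleaseUnmanaged()
    {
        if (_disposed) return;
        Marshal.FreeHGlobal(_handle);
        _handle = IntPtr.Zero;
        _disposed = true;
    }
}
```

The GC.SuppressFinalize call is what spares a disposed object the extra collections that finalization would otherwise cost it.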
Ok, time to clear things up a bit (since my original post was a little muddy).
IDisposable has nothing to do with Memory Management.
IDisposable allows an object to clean up any native resources it might be holding on to. If an object implements IDisposable, you should be sure to either use a using block or call Dispose() when you're finished with it.
As for defining memory-intensive objects and then losing the references to them, that's how the Garbage Collector works. It's a good thing. Let it happen and let the Garbage Collector do its job.
...so, to answer your question, No. It is not a bad practice to depend on the .NET Garbage Collector. Quite the opposite in fact.
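To illustrate the using advice, here is a small sketch showing that a using block disposes its resource even when the code inside throws. TrackedResource and UsingDemo are hypothetical names created for this example only.

```csharp
using System;

public sealed class TrackedResource : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose()
    {
        Disposed = true;
    }
}

public static class UsingDemo
{
    // Runs the work inside a using block and reports whether the resource
    // was disposed afterwards, including when the work throws.
    public static bool DisposedAfter(Action<TrackedResource> work)
    {
        var resource = new TrackedResource();
        try
        {
            using (resource)
            {
                work(resource);
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed only so that the sketch can report the outcome.
        }
        return resource.Disposed;
    }
}
```

The compiler expands a using block into try/finally, which is why the resource is disposed on both the normal and the exceptional path.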
I also agree with Dave's post. You should always dispose and release your database connections, even if the documentation for the framework you are working with says it is not needed.
As a DBA who has worked with MS SQL, Oracle, Sybase/SAP, and MySQL, I have been brought in to work on mysterious locking and memory leaks that were blamed on the database when, in fact, the issue was that the developer did not close and destroy their connection objects when they were done with them. I've even seen apps that leave idle connections open for days, and it can really make things bad when your database is clustered, mirrored, and using AlwaysOn Availability Groups in SQL Server 2012.
When I took my first .NET class, the instructor taught us to only keep database connections open while you are using them. Get in, get your work done, and get out. This change has made several systems I have helped optimize a lot more reliable. It also frees up connection memory in the RDBMS, giving more RAM to buffer I/O.
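The "get in, get your work done, and get out" rule can be captured in a small helper. This is a sketch under stated assumptions: ShortLivedConnection and Run are invented names, and FakeConnection is a stand-in that exists only so the example runs without a database. In real code the factory would return a provider connection such as the OdbcConnection from the question.

```csharp
using System;
using System.Data;

public static class ShortLivedConnection
{
    // Opens a connection, runs the work, and disposes the connection
    // before returning, so nothing stays open between calls.
    public static T Run<T>(Func<IDbConnection> factory, Func<IDbConnection, T> work)
    {
        using (IDbConnection connection = factory())
        {
            connection.Open();
            return work(connection);
        }
    }
}

// Stand-in connection for demonstration only; it tracks how many
// instances are currently open.
public sealed class FakeConnection : IDbConnection
{
    public static int OpenCount;

    public string ConnectionString { get; set; } = "";
    public int ConnectionTimeout => 0;
    public string Database => "";
    public ConnectionState State { get; private set; } = ConnectionState.Closed;

    public void Open()
    {
        State = ConnectionState.Open;
        OpenCount++;
    }

    public void Close()
    {
        if (State == ConnectionState.Open)
        {
            State = ConnectionState.Closed;
            OpenCount--;
        }
    }

    public void Dispose() { Close(); }

    public IDbTransaction BeginTransaction() => throw new NotSupportedException();
    public IDbTransaction BeginTransaction(IsolationLevel il) => throw new NotSupportedException();
    public void ChangeDatabase(string databaseName) => throw new NotSupportedException();
    public IDbCommand CreateCommand() => throw new NotSupportedException();
}
```

Because the connection never escapes the helper, callers cannot accidentally leave it idle and open.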