什么可能导致性能提高？ GC 时间、池化

发布于 2024-10-31 07:59:04 字数 939 浏览 3 评论 0原文

我们的多线程应用程序会执行冗长的计算循环。平均而言，完成一个完整周期大约需要 29 秒。在此期间，.NET 性能计数器 % 时间GC 测量为 8.5%。全部由 Gen 2 系列制成。

为了提高性能，我们为大型对象实现了一个池。我们实现了 100% 的重用率。现在整个周期平均只需 20 秒。 “GC 时间百分比”显示介于 0.3...0.5% 之间。现在 GC 只进行 Gen 0 回收。

假设池化已有效实现，并忽略执行所需的额外时间。我们的性能提高了大约 33%。这与之前 8.5% 的 GC 值有何关系？

我有一些假设，希望能够得到确认、调整和修正：

1）“GC 中的时间”（如果我没读错的话）确实衡量了 2 个时间跨度的关系：

2 个 GC 周期之间的时间和
用于最后一个完整的 GC 周期，该值包含在第一个跨度中。

第二个时间跨度中不包括停止和重新启动阻塞 GC 的工作线程的开销。但怎么会占到总执行时间的 20%呢？

2）频繁阻塞线程进行GC可能会引起线程之间的争用？这只是一个想法。我无法通过 VS 并发分析器确认这一点。

3) 与此相反，可以确认非池化应用程序的页面未命中数（性能计数器：内存 -> 页面错误/秒）明显高于具有低 GC 的应用程序（每秒 25.000 个）速率（每秒 200 个）。我可以想象，这也会带来巨大的进步。但什么可以解释这种行为呢？是不是因为频繁的分配导致使用虚拟内存地址空间中更大的区域，因此更难保留到物理内存中？如何测量才能确认这是这里的原因？

顺便说一句：GCSettings.IsServerGC = false，.NET 4.0，64 位，在 Win7、4GB、Intel i5 上运行。（对这个大问题感到抱歉..；）

原文

Our multithreaded application does a lengthy computational loop. On average it takes about 29 sec for it to finish one full cycle. During that time, the .NET performance counter % time in GC measures 8.5 %. Its all made of Gen 2 collections.

In order to improve performance, we implemented a pool for our large objects. We archieved a 100% reusement rate. The overall cycle now takes only 20 sec on average. The "% time in GC" shows something between 0.3...0.5%. Now the GC does only Gen 0 collections.

Lets assume, the pooling is efficiently implemented and neglect the additional time it takes to execute. Than we got a performance improvement of roughly 33 percent. How does that relate to the former value for GC of 8.5%?

I have some assumptions, which I hope can be confirmed, adjusted and amended:

1) The "time in GC" (if I read it right) does measure the relation of 2 time spans:

Time between 2 GC cycles and
Time used for the last full GC cycle, this value is included into the first span.

What is not included into the second time span, would be the overhead of stopping and restarting the worker threads for the blocking GC. But how could that be as large as 20% of the overall execution time?

2) Frequently blocking the threads for GC may introduce contention between the treads? It is just a thought. I could not confirm that via the VS concurrency profiler.

3) In contrast to that, it could be confirmed that the number of page misses (performance counter: Memory -> Page Faults/sec) is significantly higher for the unpooled application (25.000 per second) than for the application with the low GC rate (200 per second). I could imagine, this would cause the great improvement as well. But what could explain that behaviour? Is it, because frequent allocations are causing a much larger area from the virtual memory address space to be used, which therefore is harder to keep into the physical memory? And how could that be measured to confirm as the reason here?

BTW: GCSettings.IsServerGC = false, .NET 4.0, 64bit, running on Win7, 4GB, Intel i5. (And sorry for the large question.. ;)

分享到QQ

分享到微博