JVM CMS Garbage Collection Issues

Posted on 2024-09-03 02:34:35

I'm seeing the following symptoms on an application's GC log file with the Concurrent Mark-Sweep collector:

4031.248: [CMS-concurrent-preclean-start]
4031.250: [CMS-concurrent-preclean: 0.002/0.002 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
4031.250: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 4036.346: [CMS-concurrent-abortable-preclean: 0.159/5.096 secs] [Times: user=0.00 sys=0.01, real=5.09 secs] 
4036.346: [GC[YG occupancy: 55964 K (118016 K)]4036.347: [Rescan (parallel) , 0.0641200 secs]4036.411: [weak refs processing, 0.0001300 secs]4036.411: [class unloading, 0.0041590 secs]4036.415: [scrub symbol & string tables, 0.0053220 secs] [1 CMS-remark: 16015K(393216K)] 71979K(511232K), 0.0746640 secs] [Times: user=0.08 sys=0.00, real=0.08 secs] 

The abortable preclean phase keeps aborting. I've tried raising CMSMaxAbortablePrecleanTime from the default of 5 seconds to 15 seconds, but that has not helped. The current JVM options are as follows:

-Djava.awt.headless=true
 -Xms512m
 -Xmx512m
 -Xmn128m
 -XX:MaxPermSize=128m
 -XX:+HeapDumpOnOutOfMemoryError
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:BiasedLockingStartupDelay=0
 -XX:+DoEscapeAnalysis
 -XX:+UseBiasedLocking
 -XX:+EliminateLocks
 -XX:+CMSParallelRemarkEnabled
 -verbose:gc
 -XX:+PrintGCTimeStamps
 -XX:+PrintGCDetails
 -XX:+PrintHeapAtGC
 -Xloggc:gc.log
 -XX:+CMSClassUnloadingEnabled
 -XX:+CMSPermGenPrecleaningEnabled
 -XX:CMSInitiatingOccupancyFraction=50
 -XX:ReservedCodeCacheSize=64m
 -Dnetworkaddress.cache.ttl=30
 -Xss128k

It appears the concurrent-abortable-preclean never gets a chance to run. I read through https://blogs.oracle.com/jonthecollector/entry/did_you_know which had a suggestion of enabling CMSScavengeBeforeRemark, but the side effects of pausing did not seem ideal. Could anyone offer up any suggestions?

Also I was wondering if anyone had a good reference for grokking the CMS GC logs, in particular this line:

[1 CMS-remark: 16015K(393216K)] 71979K(511232K), 0.0746640 secs]

It is not clear which memory regions those numbers refer to.
Edit: Found a reference at http://www.sun.com/bigadmin/content/submitted/cms_gc_logs.jsp
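
One way to sanity-check a reading of that line is to compare it with the "YG occupancy" figures printed earlier in the same remark entry. The sketch below assumes the usual interpretation: the first pair is old generation used(capacity) and the second pair is whole-heap used(capacity). The class and method names are made up for illustration, and the log line in main is abbreviated.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: extract the occupancy figures from a CMS remark log entry.
final class CmsRemarkLine {
    private static final Pattern YOUNG =
        Pattern.compile("YG occupancy: (\\d+) K \\((\\d+) K\\)");
    private static final Pattern REMARK =
        Pattern.compile("CMS-remark: (\\d+)K\\((\\d+)K\\)\\] (\\d+)K\\((\\d+)K\\)");

    /** Returns {youngUsedK, youngCapacityK}. */
    static long[] parseYoung(String line) { return groups(YOUNG, line, 2); }

    /** Returns {oldUsedK, oldCapacityK, heapUsedK, heapCapacityK} (assumed meaning). */
    static long[] parseRemark(String line) { return groups(REMARK, line, 4); }

    private static long[] groups(Pattern p, String line, int n) {
        Matcher m = p.matcher(line);
        if (!m.find()) throw new IllegalArgumentException("pattern not found: " + p);
        long[] out = new long[n];
        for (int i = 0; i < n; i++) out[i] = Long.parseLong(m.group(i + 1));
        return out;
    }

    public static void main(String[] args) {
        // Abbreviated copy of the remark entry from the log above.
        String line = "[GC[YG occupancy: 55964 K (118016 K)] ... "
            + "[1 CMS-remark: 16015K(393216K)] 71979K(511232K), 0.0746640 secs]";
        long[] y = parseYoung(line);
        long[] r = parseRemark(line);
        // If the interpretation holds, young + old should equal the heap totals.
        System.out.println("used sums match:     " + (y[0] + r[0] == r[2]));
        System.out.println("capacity sums match: " + (y[1] + r[1] == r[3]));
    }
}
```

With the figures from this log, 55964 K + 16015 K = 71979 K and 118016 K + 393216 K = 511232 K, which is consistent with the old-generation / whole-heap reading.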

Comments (4)

时常饿 2024-09-10 02:34:35

[Times: user=0.00 sys=0.01, real=5.09 secs]

I would try to investigate why the CMS-concurrent-abortable-preclean phase gets almost no user or sys CPU time during those 5 seconds.

My suggestion is to start from a 'clean' set of JVM CMS startup flags such as

-Djava.awt.headless=true
-Xms512m
-Xmx512m
-Xmn128m
-Xss128k
-XX:MaxPermSize=128m
-XX:+UseConcMarkSweepGC
-XX:+HeapDumpOnOutOfMemoryError
-Xloggc:gc.log
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-XX:+PrintHeapAtGC

then check if the problem reproduces and keep tweaking one parameter at a time.

海未深 2024-09-10 02:34:35

As someone has already mentioned, the first step would be to increase the CMSInitiatingOccupancyFraction.

As a second step, I would add the flag -XX:+PrintTenuringDistribution (note the plus sign; -XX:-PrintTenuringDistribution would disable the output) and make sure that objects are not being promoted prematurely from the young generation to the old one. Premature promotion creates old-to-young references, which can lengthen the abortable preclean phase.
If there is such premature promotion, try adjusting the ratio between the eden and survivor spaces.
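
To get a feel for what the eden/survivor ratio means with the -Xmn128m setting from the question, here is a rough sketch of the arithmetic. It assumes -XX:SurvivorRatio is the ratio of eden to one survivor space with a value of 8, and that survivor spaces are rounded down to a 64K boundary; that alignment is a guess on my part, and the class name is purely illustrative.

```java
// Sketch: nominal young generation layout for a given -Xmn and SurvivorRatio.
final class YoungGenLayout {
    /**
     * Returns {edenK, survivorK, usableYoungK}, where usableYoungK is eden plus
     * one survivor space (only one survivor space holds objects at a time).
     * alignK is an assumed alignment for the survivor spaces.
     */
    static long[] layoutK(long youngK, int survivorRatio, long alignK) {
        long survivorK = youngK / (survivorRatio + 2); // eden : s0 : s1 = ratio : 1 : 1
        survivorK -= survivorK % alignK;               // round down (assumption)
        long edenK = youngK - 2 * survivorK;
        return new long[] { edenK, survivorK, edenK + survivorK };
    }

    public static void main(String[] args) {
        long[] l = layoutK(128 * 1024, 8, 64); // -Xmn128m, SurvivorRatio=8 assumed
        System.out.println("eden=" + l[0] + "K survivor=" + l[1]
            + "K usable young=" + l[2] + "K");
    }
}
```

Under these assumptions the usable young size comes out at 118016K, the same capacity the "YG occupancy" entry in the question's log reports. A smaller SurvivorRatio gives larger survivor spaces, so objects can age there longer before being promoted.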

尛丟丟 2024-09-10 02:34:35

There is a good explanation here about this phenomenon:

Quote:

So when the system load is light(which means there will be no
minor gc), precleaning will always time out and full gc will always
fail. cpu is waste.

It won't fail. It'll be less parallel (i.e. less efficient, and would
have a longer pause time, for lesser work).

So all in all: this seems to be normal operation - the thread just waits for a minor GC to happen for 5 seconds, but there is no big issue when this does not happen: the JVM chooses a different (less efficient) strategy to continue with the GC.
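
The waiting shows up in the log line itself. The sketch below pulls the two durations out of the abortable-preclean entry, assuming the common reading of "0.159/5.096 secs" as time spent actually working versus wall-clock time for the phase. The class name is made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: how much of the abortable preclean phase was spent waiting.
final class PrecleanTiming {
    private static final Pattern TIMES = Pattern.compile(
        "CMS-concurrent-abortable-preclean: ([0-9.]+)/([0-9.]+) secs");

    /** Returns {busySecs, wallSecs} under the assumed meaning of the two figures. */
    static double[] parse(String line) {
        Matcher m = TIMES.matcher(line);
        if (!m.find()) throw new IllegalArgumentException("not a preclean line: " + line);
        return new double[] {
            Double.parseDouble(m.group(1)), Double.parseDouble(m.group(2)) };
    }

    /** Fraction of the phase that was not spent working. */
    static double idleFraction(double busySecs, double wallSecs) {
        return 1.0 - busySecs / wallSecs;
    }

    public static void main(String[] args) {
        String line = " CMS: abort preclean due to time 4036.346: "
            + "[CMS-concurrent-abortable-preclean: 0.159/5.096 secs]";
        double[] t = parse(line);
        System.out.printf("idle fraction: %.3f%n", idleFraction(t[0], t[1]));
    }
}
```

For the line in the question, the phase was idle for roughly 97% of its 5 seconds, which fits the picture of a thread waiting for a minor GC that never arrives.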

丘比特射中我 2024-09-10 02:34:35

For the service I'm using I added:

-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=80

This configures the JVM to start a CMS cycle only once the old generation is 80% full, rather than letting its own heuristics decide. It's worth giving it a try.
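
As a rough illustration of what those thresholds mean for the heap in the question, the sketch below converts an occupancy fraction into kilobytes and compares it with the old generation figures from the remark line (16015K used of 393216K). It assumes the fraction applies to old generation capacity; the class name is hypothetical.

```java
// Sketch: translate CMSInitiatingOccupancyFraction into an absolute threshold.
final class CmsOccupancy {
    /** Old generation occupancy (in K) at which a cycle would start. */
    static long thresholdK(long oldCapacityK, int fractionPercent) {
        return oldCapacityK * fractionPercent / 100;
    }

    /** Current occupancy as a percentage of capacity. */
    static double occupancyPercent(long usedK, long capacityK) {
        return 100.0 * usedK / capacityK;
    }

    public static void main(String[] args) {
        long oldCapK = 393216; // from "[1 CMS-remark: 16015K(393216K)]"
        long oldUsedK = 16015;
        System.out.println("threshold at 50%: " + thresholdK(oldCapK, 50) + "K");
        System.out.println("threshold at 80%: " + thresholdK(oldCapK, 80) + "K");
        System.out.printf("current occupancy: %.1f%%%n",
            occupancyPercent(oldUsedK, oldCapK));
    }
}
```

In the question's log the old generation is only about 4% full at remark time, far below even the configured 50%. That suggests the cycles are being started for some other reason, possibly the JVM's own heuristics, which -XX:+UseCMSInitiatingOccupancyOnly is meant to switch off.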
