Java blocking issue: why would the JVM block threads in many different classes/methods?

Posted 2024-09-28 19:14:10

Update: This looks like a memory issue. A 3.8 GB hprof file indicates that the JVM was dumping its heap when this "blocking" occurred. Our operations team saw that the site wasn't responding, took a stack trace, then shut down the instance. I believe they shut down the site before the heap dump finished. The log had no errors, exceptions, or other evidence of problems, probably because the JVM was killed before it could generate an error message.

Original Question
We recently had a situation where the application appeared, to the end user, to hang. We got a stack trace before the application was restarted, and I found some surprising results: of 527 threads, 463 had thread state BLOCKED.

In the Past
In the past, blocked threads usually fit this pattern:
1) There was some obvious bottleneck, e.g. a database record lock or file system lock problem that caused other threads to wait.
2) All blocked threads would block on the same class/method (e.g. the jdbc or file system classes).

Unusual Data
In this case, I see all sorts of classes/methods blocked, including JVM internal classes, jboss classes, log4j, etc., in addition to application classes (including jdbc and lucene calls).

The question
What would cause a JVM to block in log4j.Hierarchy.getLogger or java.lang.reflect.Constructor.newInstance? Obviously some resource is "scarce", but which resource?

thanks

will

Stack Trace Excerpts

"http-0.0.0.0-80-417" daemon prio=6 tid=0x000000000f6f1800 nid=0x1a00 waiting for monitor entry [0x000000002dd5d000]
   java.lang.Thread.State: BLOCKED (on object monitor)
                at sun.reflect.GeneratedConstructorAccessor68.newInstance(Unknown Source)
                at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
                at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
                at java.lang.Class.newInstance0(Class.java:355)
                at java.lang.Class.newInstance(Class.java:308)
                at org.jboss.ejb.Container.createBeanClassInstance(Container.java:630)

"http-0.0.0.0-80-451" daemon prio=6 tid=0x000000000f184800 nid=0x14d4 waiting for monitor entry [0x000000003843d000]
   java.lang.Thread.State: BLOCKED (on object monitor)
                at java.lang.Class.getDeclaredMethods0(Native Method)
                at java.lang.Class.privateGetDeclaredMethods(Class.java:2427)
                at java.lang.Class.getMethod0(Class.java:2670)

"http-0.0.0.0-80-449" daemon prio=6 tid=0x000000000f17d000 nid=0x2240 waiting for monitor entry [0x000000002fa5f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
                at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.register(Http11Protocol.java:638)
                - waiting to lock <0x00000007067515e8> (a org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler)
                at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.createProcessor(Http11Protocol.java:630)


"http-0.0.0.0-80-439" daemon prio=6 tid=0x000000000f701800 nid=0x1ed8 waiting for monitor entry [0x000000002f35b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
                at org.apache.log4j.Hierarchy.getLogger(Hierarchy.java:261)
                at org.apache.log4j.Hierarchy.getLogger(Hierarchy.java:242)
                at org.apache.log4j.LogManager.getLogger(LogManager.java:198)
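
A tally like the "463 of 527 BLOCKED" figure can be reproduced from a saved dump file. The following is only a minimal sketch and not part of the original post: the class ThreadDumpTally and its method names are hypothetical, and it assumes the HotSpot text format shown in the excerpts above (a "java.lang.Thread.State:" line followed by "at ..." frames, with a blank line between threads).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: summarizes a HotSpot text thread dump.
public class ThreadDumpTally {
    private static final String STATE_PREFIX = "java.lang.Thread.State:";

    /** Counts threads per state, keyed by the first word after the prefix (e.g. BLOCKED). */
    static Map<String, Integer> countByState(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith(STATE_PREFIX)) {
                String rest = line.substring(STATE_PREFIX.length()).trim();
                counts.merge(rest.split("\\s+")[0], 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Counts BLOCKED threads by topmost frame (the first "at ..." line after the state line). */
    static Map<String, Integer> blockedByTopFrame(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        boolean awaitingFrame = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith(STATE_PREFIX)) {
                awaitingFrame = line.contains("BLOCKED");
            } else if (awaitingFrame && line.startsWith("at ")) {
                String frame = line.substring(3);
                int paren = frame.indexOf('(');
                // Drop the "(File.java:123)" suffix so identical methods group together.
                counts.merge(paren >= 0 ? frame.substring(0, paren) : frame, 1, Integer::sum);
                awaitingFrame = false;
            } else if (line.isEmpty()) {
                awaitingFrame = false; // blank line ends the current thread's entry
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ThreadDumpTally <thread-dump.txt>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("Threads by state: " + countByState(lines));
        blockedByTopFrame(lines)
                .forEach((frame, n) -> System.out.println(n + " BLOCKED at " + frame));
    }
}
```

If the BLOCKED threads are spread over many unrelated top frames, as in this case, that points away from a single contended application lock and toward something JVM-wide.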


云归处 2024-10-05 19:14:10

These are listed roughly in the order I would try them, depending on the evidence collected:

  • Have you looked at GC behavior? Are you under memory pressure? That could result in newInstance() and a few others above being blocked. Run your VM with -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc and log the output. Are you seeing excessive GC times near the time of failure/lockup?
    • Is the condition repeatable? If so, try with varying heap sizes in the JVM (-Xmx) and see if the behavior changes substantially. If so, look for memory leaks or properly size the heap for your app.
    • If the previous step is hard to do, and you're not getting an OutOfMemoryError when you should, you can adjust the GC tunables; see JDK6.0 XX options, or JDK6.0 GC Tuning Whitepaper. Look specifically at -XX:+UseGCOverheadLimit and -XX:GCTimeLimit=<percent> (a numeric option, not an on/off flag) and related options. (Note that these are not well documented, but may be useful.)
  • Might there be a deadlock? That can't be determined from stack trace excerpts alone. Look for cycles between the monitors that threads are blocked on and the monitors they hold. I believe jconsole can do this for you (yep: under the Threads tab, "Detect Deadlock").
  • Try doing several repeated stacktraces and look for what changes vs. what stays the same...
  • Do the forensics: for each stack entry that says "BLOCKED", look up the specific line of code and figure out whether there is a monitor there or not. If there's an actual monitor acquisition, it should be fairly easy to identify the limiting resource. However, some of your threads may show as blocked without an obvious monitor in the source; these will be trickier.
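
Some of the checks above can also be run from inside the JVM through the standard java.lang.management API. The following is a minimal sketch under stated assumptions, not a definitive tool: the class name BlockedThreadReport and its helper methods are made up for illustration, it inspects the JVM it runs in (so it would have to be called from within the application, or adapted to a remote JMX connection), and the GC figure is cumulative since startup, so it is much coarser than a -verbose:gc log.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: rough GC share, monitor deadlock check, BLOCKED threads per lock.
public class BlockedThreadReport {

    /** Approximate fraction of JVM uptime spent in GC so far, summed over all collectors. */
    static double gcTimeFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis > 0 ? (double) gcMillis / uptimeMillis : 0.0;
    }

    /** Counts BLOCKED threads per contended lock, keyed by "lock (held by owner)". */
    static Map<String, Integer> blockedByLock(ThreadInfo[] infos) {
        Map<String, Integer> counts = new TreeMap<>();
        for (ThreadInfo info : infos) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            String owner = info.getLockOwnerName();
            String key = info.getLockName()
                    + " (held by " + (owner == null ? "unknown owner" : owner) + ")";
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.printf("Approx. share of uptime spent in GC: %.1f%%%n",
                gcTimeFraction() * 100);

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] deadlocked = threads.findMonitorDeadlockedThreads(); // null when none found
        System.out.println(deadlocked == null
                ? "No monitor deadlock detected"
                : deadlocked.length + " threads are in a monitor deadlock");

        // false, false: skip locked-monitor details; the blocking lock is still reported.
        blockedByLock(threads.dumpAllThreads(false, false))
                .forEach((lock, n) -> System.out.println(n + " blocked on " + lock));
    }
}
```

Many threads blocked on one lock with a single owner suggests an ordinary bottleneck. Blocked threads spread over many locks, together with a high GC share, fits the memory-pressure explanation better.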