Getting the Java thread ID and stack trace of a runaway Java thread
On my busiest production installation, I occasionally get a single thread that seems to be stuck in an infinite loop. After much research and debugging I have not managed to identify the culprit, but it seems like it should be possible. Here are the gory details:
Current debugging notes:
1) ps -eL, filtered for process 18975, shows me the Linux thread id (LWP) of the problem child thread: 19269
$ ps -eL | grep 18975
...
PID LWP TTY TIME CMD
18975 18994 ? 00:00:05 java
18975 19268 ? 00:00:00 java
18975 19269 ? 05:16:49 java
18975 19271 ? 00:01:22 java
18975 19273 ? 00:00:00 java
...
2) jstack -l 18975 says there are no deadlocks; jstack -m 18975 does not work.
3) jstack -l 18975 does give me the stack traces for all my threads (~400). Example thread stack (not the problem thread):
"http-342.877.573.944-8080-360" daemon prio=10 tid=0x0000002adaba9c00 nid=0x754c in Object.wait() [0x00000000595bc000..0x00000000595bccb0]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
    at java.lang.Object.wait(Object.java:485)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:416)
    - locked (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:442)
    at java.lang.Thread.run(Thread.java:619)
4) The ps -eL output's thread ID does not match the output from jstack, or at least I cannot see it. (jstack documentation is a bit sparse.)
5) There is no heavy I/O, memory usage, or other corresponding activity to give me a clue.
Platform:
- Java 6
- Tomcat 6
- RHEL 4 (64-bit)
Does anybody know how I can make the connection from the Linux ps output to my problem child Java thread? So close, yet so far...
Comments (5)
It looks like the nid in the jstack output is the Linux LWP id.
Convert the nid to decimal and you have the LWP id. In your case 0x754c is 30028. That LWP is not shown in your ps output, but it was probably one of the LWPs you omitted to save space. Going the other way, your problem LWP 19269 is 0x4b45 in hex, so look for nid=0x4b45 in the jstack output.
Here's a little Perl snippet you can pipe the output of jstack to:
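The Perl snippet itself is not reproduced here. As a stand-in, and not the original answer's code, here is a minimal Python sketch of the same idea: it appends the decimal LWP id after every nid=0x... field so the dump can be compared directly with ps -eL. The function name annotate_nids and the "(lwp=...)" output format are assumptions chosen for illustration.

```python
import re

# Matches the native thread id field of a jstack header line, e.g. "nid=0x754c".
NID_PATTERN = re.compile(r"nid=0x([0-9a-fA-F]+)")


def annotate_nids(line: str) -> str:
    """Append the decimal LWP id after each hexadecimal nid found in the line."""
    return NID_PATTERN.sub(
        lambda m: f"{m.group(0)} (lwp={int(m.group(1), 16)})", line
    )


# Header line taken from the question's example thread.
sample = ('"http-342.877.573.944-8080-360" daemon prio=10 '
          'tid=0x0000002adaba9c00 nid=0x754c in Object.wait()')
print(annotate_nids(sample))
```

To use it as a filter, loop over sys.stdin and write annotate_nids(line) for each line, then pipe jstack -l 18975 into the script and search the result for lwp=19269.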
You can use JConsole to view the thread's stack trace.
If you're using JDK 1.6.0_07 or above, you can also use VisualVM.
Both tools provide a nice view of all the running threads in an application. VisualVM is quite a bit nicer, but hopefully seeing all the threads will help you track down the runaway thread.
Check for threads that are always in the RUNNABLE state. When we had a runaway thread, its stack trace would constantly change, so we were able to tell which methods the loop was calling and track down the loop.
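The same check can be done without a GUI by taking several jstack dumps a few seconds apart and keeping only the threads that are RUNNABLE in all of them. The following is a rough Python sketch under that assumption; the function names and the two tiny sample dumps are made up for illustration. Note that threads blocked in native calls such as socket reads are also reported as RUNNABLE, so the result is a shortlist, not a verdict.

```python
import re

# A thread block starts with a quoted thread name and carries a nid field.
HEADER = re.compile(r'^"(?P<name>[^"]+)".*?nid=(?P<nid>0x[0-9a-fA-F]+)')
STATE = re.compile(r"java\.lang\.Thread\.State: (\w+)")


def runnable_threads(dump: str) -> dict:
    """Map nid -> thread name for every thread reported as RUNNABLE in one dump."""
    result = {}
    current = None  # (nid, name) of the thread block currently being read
    for line in dump.splitlines():
        header = HEADER.match(line)
        if header:
            current = (header.group("nid"), header.group("name"))
        state = STATE.search(line)
        if state and current and state.group(1) == "RUNNABLE":
            result[current[0]] = current[1]
    return result


def always_runnable(dumps: list) -> dict:
    """Return the threads that are RUNNABLE in every dump of the list."""
    per_dump = [runnable_threads(d) for d in dumps]
    if not per_dump:
        return {}
    common = set(per_dump[0])
    for threads in per_dump[1:]:
        common &= set(threads)
    return {nid: per_dump[0][nid] for nid in sorted(common)}


# Hypothetical, heavily shortened dumps taken a few seconds apart.
dump_a = '''"worker-1" daemon prio=10 tid=0x01 nid=0x4b45 runnable [0x1..0x2]
   java.lang.Thread.State: RUNNABLE
"worker-2" daemon prio=10 tid=0x02 nid=0x754c in Object.wait() [0x3..0x4]
   java.lang.Thread.State: WAITING (on object monitor)
'''
dump_b = '''"worker-1" daemon prio=10 tid=0x01 nid=0x4b45 runnable [0x1..0x2]
   java.lang.Thread.State: RUNNABLE
"worker-2" daemon prio=10 tid=0x02 nid=0x754c runnable [0x3..0x4]
   java.lang.Thread.State: RUNNABLE
'''
print(always_runnable([dump_a, dump_b]))
```

Keying on the nid rather than the thread name keeps the result directly comparable with the LWP column of ps -eL.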
Nice, useful answers!
For Linux, use ps -efL; the -L option shows the LWPs.
As a side note, the thread header
"http-342.877.573.944-8080-360" daemon prio=10
reads as
"thread name (as given by the JVM)", daemon flag, Java thread priority (by default, both the daemon flag and the priority are inherited from the thread that created this one)
From memory, if you press CTRL-BREAK on the console (that is the Windows key combination; on Linux, kill -3 on the Java pid requests the same dump), you will get a dump of the current threads and a few of their stack trace frames.
Also from memory (I'm not sure whether this is an IntelliJ IDEA feature or the default in Java), it will tell you which threads are deadlocked and which objects they are waiting on. You should be able to redirect the output to a file and just grep for the word "deadlock" (HotSpot reports it as "Found one Java-level deadlock").
JConsole, VisualVM, or other profilers such as JProfiler will also show you the threads and their stacks; however, if you don't want to use any external tool, I think CTRL-BREAK will give you what you're looking for.
On Sun (Solaris):
Note that prstat by default shows the number of lightweight processes, not the LWPID. To see information for all the lightweight processes of a particular user, use the -L option. Then take the LWPID, convert it to hex, and match it against the nid in the thread dump.
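The decimal-to-hex lookup can be sketched in a few lines of Python. This is only an illustration: lwp_to_nid and find_thread are hypothetical helper names, and the sample dump (including the com.example.Loop frame) is invented. It converts an LWP id such as 19269 to the nid form used in the dump and returns the matching thread block.

```python
import re


def lwp_to_nid(lwp: int) -> str:
    """Format a decimal LWP id the way thread dumps print the nid, e.g. 19269 -> '0x4b45'."""
    return hex(lwp)


def find_thread(dump: str, lwp: int) -> list:
    """Return the lines of the thread block whose header carries the matching nid."""
    pattern = re.compile(r"nid=" + re.escape(lwp_to_nid(lwp)) + r"\b")
    block = []
    for line in dump.splitlines():
        if line.startswith('"'):
            if block:
                break  # the next thread header ends the block we collected
            if pattern.search(line):
                block.append(line)
        elif block and line.strip():
            block.append(line)
    return block


# Hypothetical, shortened dump for demonstration only.
SAMPLE_DUMP = '''"worker-1" daemon prio=10 tid=0x01 nid=0x4b45 runnable [0x1..0x2]
   java.lang.Thread.State: RUNNABLE
    at com.example.Loop.spin(Loop.java:42)

"worker-2" daemon prio=10 tid=0x02 nid=0x754c in Object.wait() [0x3..0x4]
   java.lang.Thread.State: WAITING (on object monitor)
'''

print("\n".join(find_thread(SAMPLE_DUMP, 19269)))
```

The word boundary after the nid avoids matching a longer id that merely starts with the same hex digits.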