A bug in a third party library is causing an infinite loop in a worker thread on a JBoss instance of mine. Do you know of a way to kill this "stuck" thread without restarting the server? We'd like to be able to recover from this until a fix is deployed, preferably without having to restart.
I've seen a few people mention using Thread.interrupt() - if I were to code my own MBean, how would I get a handle to the thread in question in order to interrupt it?
Update: I wasn't able to solve this using any of these methods. I did come across another thread about the same issue that had a link to why Thread.stop() is deprecated. Someone else has asked a similar question with similar results. It seems like more sophisticated containers should provide this kind of health mechanism, but I guess their hands are tied by the JVM.
Comments (4)
I had a similar bug (an infinite loop) in a 3rd party lib. I ended up applying the fix myself (while waiting for the people from the 3rd party lib to fix their mess) and then placed the modified .class in my .war, making sure it is loaded before the bogus .class (the bogus one being inside the bogus 3rd party .jar).
It is not nice but it works; see my question here:
Order of class loading from a .war file
What I mean is this: if you have to wait for the people responsible for the buggy 3rd party lib to fix their stuff, you can potentially be waiting a very long time. We couldn't afford that. We needed a fix ASAP. So we ended up applying a patch/hack to their code.
You could, for example, add a boolean check inside the infinite loop and then force the loop to exit when you want the bogus thread to "die".
Note that I haven't used the deprecated Thread.stop() in ten years, and I really didn't want to use it in the above case.
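The boolean check described above could look roughly like the following. This is only a minimal sketch: PatchedWorker, ABORT and requestAbort() are hypothetical names standing in for the patched third-party class, and the loop body is a placeholder for whatever work the real loop does.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the patched third-party class.
final class PatchedWorker implements Runnable {
    // Kill switch added by the patch; visible across threads.
    private static final AtomicBoolean ABORT = new AtomicBoolean(false);

    // Call this (for example from an admin hook) to make the loop exit.
    static void requestAbort() {
        ABORT.set(true);
    }

    private long iterations;

    @Override
    public void run() {
        // The original loop never terminated; the patch adds the flag check
        // and also honours the thread's interrupted status.
        while (!ABORT.get() && !Thread.currentThread().isInterrupted()) {
            iterations++; // placeholder for the buggy work
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(new PatchedWorker(), "stuck-worker");
        t.start();
        Thread.sleep(100);   // let the loop spin for a moment
        requestAbort();      // flip the switch
        t.join(2000);        // wait for the loop to notice
        System.out.println("alive after abort: " + t.isAlive());
    }
}
```

The flag has to be an AtomicBoolean or a volatile field; with a plain boolean the looping thread is not guaranteed to ever see the update.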
I suppose the most difficult part is identifying the hanging thread. You provide no info about it, but perhaps you can build some rules around the thread's name or its current stack trace.
If you can identify the thread by its name, I would get all threads in the VM by getting my own thread group with Thread.currentThread().getThreadGroup(), then walk up the thread group hierarchy by calling getParent() on the thread group until it returns null. You now have the top-level thread group. You can now fill a preallocated array with all threads using the enumerate(Thread[] list) method on the top-level thread group.
If you need the stack traces anyway to identify the thread, you can also use the static utility method Map<Thread,StackTraceElement[]> Thread.getAllStackTraces() to get all threads. Computing the stack traces is quite expensive, however, so this might not be the best solution if you don't actually need them.
After identifying the thread you must call the stop() method on it. Interrupting it won't help unless the implementation of the running code actually evaluates the thread's interrupted flag and behaves as you expect it to. Note that the stop() method is deprecated and that using it may have many funny side effects. You can find more details in the API documentation.
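The thread group walk described above can be sketched as follows. ThreadFinder, its methods, and the thread name "demo-stuck-worker" are made up for illustration. The demo calls interrupt() rather than stop(), because its worker cooperates with interruption and because Thread.stop() throws UnsupportedOperationException on recent JDKs (20 and later).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of JBoss or the JDK.
final class ThreadFinder {
    // Walk up from the current thread's group until getParent() returns null.
    static ThreadGroup rootGroup() {
        ThreadGroup group = Thread.currentThread().getThreadGroup();
        while (group.getParent() != null) {
            group = group.getParent();
        }
        return group;
    }

    // Collect all live threads reachable from the root group.
    static Thread[] allThreads() {
        ThreadGroup root = rootGroup();
        // activeCount() is only an estimate, so grow the array until
        // enumerate() no longer fills it completely.
        Thread[] buffer = new Thread[Math.max(16, root.activeCount() * 2)];
        int count;
        while ((count = root.enumerate(buffer, true)) == buffer.length) {
            buffer = new Thread[buffer.length * 2];
        }
        return Arrays.copyOf(buffer, count);
    }

    // Return every live thread whose name matches exactly.
    static List<Thread> findByName(String name) {
        List<Thread> matches = new ArrayList<>();
        for (Thread t : allThreads()) {
            if (name.equals(t.getName())) {
                matches.add(t);
            }
        }
        return matches;
    }

    public static void main(String[] args) throws InterruptedException {
        // Demo worker that does respond to interruption.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // interrupted: fall through and exit
            }
        }, "demo-stuck-worker");
        worker.setDaemon(true);
        worker.start();

        List<Thread> matches = findByName("demo-stuck-worker");
        for (Thread t : matches) {
            t.interrupt();
        }
        worker.join(2000);
        System.out.println("found " + matches.size() + ", alive: " + worker.isAlive());
    }
}
```

If the stack traces are needed anyway, Thread.getAllStackTraces().keySet() yields the same set of live threads without the group walk.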
You could use the discouraged myThread.stop() method. But it is very likely that the Thread is still referenced somewhere, so you should use some reflection magic to remove all references to this thread from the components holding it.
How to find the Thread? Use Thread.getThreadGroup() and ThreadGroup.getParent() to go up to the root ThreadGroup, and then use the enumerate() methods to go through all threads.
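To tie this back to the MBean idea from the question, here is a rough sketch of a small MXBean that looks a thread up by name and interrupts it. Everything named here (ThreadKillerDemo, ThreadKillerMXBean, interruptByName, the ObjectName "example:type=ThreadKiller") is an assumption made for illustration, and interrupt() only helps if the looping code checks the interrupted flag, which the asker's update suggests the third-party code did not.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical demo class; names are illustrative only.
final class ThreadKillerDemo {
    // Management interface; the MXBean suffix makes it a valid MXBean.
    public interface ThreadKillerMXBean {
        int interruptByName(String threadName);
    }

    static final class ThreadKiller implements ThreadKillerMXBean {
        // Interrupts every live thread with the given name and
        // returns how many threads were interrupted.
        @Override
        public int interruptByName(String threadName) {
            int count = 0;
            for (Thread t : Thread.getAllStackTraces().keySet()) {
                if (threadName.equals(t.getName())) {
                    t.interrupt();
                    count++;
                }
            }
            return count;
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example:type=ThreadKiller");
        server.registerMBean(new ThreadKiller(), name);

        // Demo worker that spins until it is interrupted.
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                Thread.onSpinWait();
            }
        }, "demo-stuck-worker");
        worker.setDaemon(true);
        worker.start();

        // Invoke the operation through the MBean server, as a JMX console would.
        Object result = server.invoke(name, "interruptByName",
                new Object[] {"demo-stuck-worker"},
                new String[] {String.class.getName()});
        worker.join(2000);
        System.out.println("interrupted " + result + " thread(s), alive: " + worker.isAlive());
    }
}
```

Once registered, the operation should show up in a JMX client such as JConsole under the chosen ObjectName, so it can be triggered without redeploying.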
Try my jkillthread which tries to do something like this.