Debugging utilities for Linux process hang issues?

Posted on 2024-09-06 04:51:37

I have a daemon process that handles configuration management. All other processes must interact with this daemon in order to function. However, when I execute a large action, after a few hours the daemon becomes unresponsive for 2 to 3 hours. After those 2 to 3 hours it works normally again.

What debugging utilities are available for Linux process hang issues?

How can I find out at what point the Linux process hangs?

Comments (3)

夜夜流光相皎洁 2024-09-13 04:51:37

  • strace can show the most recent system calls and their results.
  • lsof can show open files.
  • The system log can be very effective when log messages are written to track progress; it lets you box the problem into smaller areas. Also correlate the log messages with messages from other systems, as this often turns up interesting results.
  • wireshark, if the apps use sockets, makes the wire chatter visible.
  • ps ax and top can show whether your app is in a busy loop (running all the time), sleeping, or blocked in I/O, and how much CPU and memory it is consuming.
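
As a rough illustration of the last point, the state that ps and top report can also be read directly from /proc. The sketch below assumes a Linux system with /proc mounted; the helper names process_snapshot and describe_state are made up for this example, and the hints are rules of thumb rather than a diagnosis.

```python
import os

# Meaning of the first character of the "State:" field in /proc/<pid>/status.
STATE_HINTS = {
    "R": "running or runnable; if it stays here with high CPU, suspect a busy loop",
    "S": "interruptible sleep; waiting on a lock, socket, timer, or similar",
    "D": "uninterruptible sleep; usually blocked in I/O",
    "T": "stopped by a signal",
    "Z": "zombie; exited but not yet reaped",
}

WANTED_FIELDS = ("Name", "State", "Threads", "VmRSS")


def process_snapshot(pid):
    """Return a few fields from /proc/<pid>/status as a dict (Linux only)."""
    fields = {}
    with open(f"/proc/{pid}/status") as status_file:
        for line in status_file:
            key, _, value = line.partition(":")
            if key in WANTED_FIELDS:
                fields[key] = value.strip()
    return fields


def describe_state(state_field):
    """Map a State value such as 'D (disk sleep)' to a short hint."""
    return STATE_HINTS.get(state_field[:1], "other state; see the proc(5) manual page")


if __name__ == "__main__":
    # Demonstration on this script's own PID; pass the daemon's PID instead.
    snapshot = process_snapshot(os.getpid())
    print(snapshot)
    print(describe_state(snapshot["State"]))
```

Sampling this every few seconds while the daemon is unresponsive shows whether it sits in R (busy), S (waiting), or D (stuck in I/O), which narrows down which of the other tools to reach for next.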

Each of these may give a little bit of information which together build up a picture of the issue.

When using gdb, it can be useful to trigger a core dump while the app is blocked. You then have a static snapshot that you can analyze at your leisure with post-mortem debugging. You can have these dumps triggered by a script, and so quickly build up a set of snapshots for testing your theories.
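
One possible way to script such snapshots is the gcore utility that ships with gdb, which dumps a core of a live process without killing it. This is only a sketch: the /tmp/daemon-core prefix, the snapshot count, and the interval are placeholders, and the functions gcore_command and take_snapshots are names invented for this example.

```python
import subprocess
import time


def gcore_command(pid, prefix):
    """Build the gcore invocation; gcore writes the dump to '<prefix>.<pid>'."""
    return ["gcore", "-o", prefix, str(pid)]


def take_snapshots(pid, count=3, interval_seconds=60, prefix="/tmp/daemon-core"):
    """Dump `count` core files of a live process, `interval_seconds` apart.

    Returns the paths of the core files that gcore is expected to write.
    """
    written = []
    for index in range(count):
        numbered_prefix = f"{prefix}-{index}"
        # check=True raises CalledProcessError if gcore fails (e.g. no permission).
        subprocess.run(gcore_command(pid, numbered_prefix), check=True)
        written.append(f"{numbered_prefix}.{pid}")
        if index + 1 < count:
            time.sleep(interval_seconds)
    return written
```

Each resulting file can then be opened with gdb together with the executable, and `thread apply all bt` shows where every thread was at the moment of the dump. Comparing backtraces across several dumps indicates whether the daemon is stuck at one spot or slowly making progress.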

演出会有结束 2024-09-13 04:51:37

One option is to use gdb and its attach command to attach to the running process. You will need to load a file containing the symbols of the executable in question (using the file command).
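
The same two gdb commands can be run non-interactively, which is handy when the hang happens overnight. The sketch below assumes gdb is installed and that the caller is allowed to ptrace the target (typically the same user or root); the function names and the example path are hypothetical.

```python
import subprocess


def gdb_backtrace_command(pid, executable):
    """Build a batch-mode gdb command line that attaches and prints all stacks."""
    return [
        "gdb", "-batch",
        "-ex", f"file {executable}",   # load symbols, as with the file command
        "-ex", f"attach {pid}",        # attach to the running process
        "-ex", "thread apply all bt",  # backtrace of every thread
        "-ex", "detach",               # let the process continue afterwards
    ]


def dump_backtraces(pid, executable):
    """Run gdb once against the process and return what it printed."""
    result = subprocess.run(
        gdb_backtrace_command(pid, executable),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, `dump_backtraces(1234, "/usr/sbin/configd")` (both values are placeholders) would return the stacks of all threads. The process is paused only while gdb is attached.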

农村范ル 2024-09-13 04:51:37

There are a number of different ways to do this:

  1. Listening on a UNIX domain socket, to handle status requests. An external application can then inquire as to whether the application is still ok. If it gets no response within some timeout period, then it can be assumed that the application being queried has deadlocked or is dead.

  2. Periodically touching a file with a preselected path. An external application can look at the timestamp of the file, and if it is stale, it can assume that the application is dead or deadlocked.

  3. You can use the alarm syscall repeatedly, having the signal terminate the process (use sigaction accordingly). Each call to alarm replaces the pending timer, so as long as you keep calling it (i.e. as long as your program is making progress) the process will keep running. Once you stop, the signal will fire.
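
The third approach could look roughly like the following. It is a minimal sketch using Python's signal module rather than raw sigaction; the function name run_with_watchdog and the 30-second default are illustrative choices, and it relies on the default SIGALRM disposition, which terminates the process.

```python
import signal
import time


def run_with_watchdog(step, iterations, timeout_seconds=30):
    """Call step() repeatedly; the kernel kills the process if one call stalls.

    Returns the number of completed iterations.
    """
    # The default disposition for SIGALRM terminates the process.
    signal.signal(signal.SIGALRM, signal.SIG_DFL)
    completed = 0
    for _ in range(iterations):
        # Re-arming replaces any pending alarm, so the countdown restarts here.
        signal.alarm(timeout_seconds)
        step()
        completed += 1
    signal.alarm(0)  # cancel the pending alarm once the work is done
    return completed


if __name__ == "__main__":
    # Each step finishes well inside the timeout, so the alarm never fires.
    print(run_with_watchdog(lambda: time.sleep(0.1), iterations=5, timeout_seconds=2))
```

If a single step() call takes longer than the timeout, SIGALRM is delivered and the process dies, at which point a supervising parent can notice and restart it.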

You can seamlessly restart your process when it dies by using fork and waitpid, as described in this answer. This does not cost any significant resources, since the OS shares the memory pages between parent and child.
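
A restart loop built on fork and waitpid might be sketched as follows. The name run_supervised, the worker callback, and the max_restarts limit are assumptions made for this example and are not taken from the linked answer.

```python
import os
import signal
import time


def run_supervised(worker, max_restarts=3):
    """Fork a child that runs worker(attempt); restart it until it exits cleanly.

    Returns the raw wait statuses of every child that was started.
    """
    statuses = []
    for attempt in range(max_restarts + 1):
        pid = os.fork()
        if pid == 0:
            # Child process: run the work, then leave without returning to the caller.
            try:
                worker(attempt)
                os._exit(0)
            except BaseException:
                os._exit(1)
        # Parent process: block until this child terminates.
        _, status = os.waitpid(pid, 0)
        statuses.append(status)
        if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
            break  # clean exit, no restart needed
    return statuses


def flaky_worker(attempt):
    """Demo worker: dies from SIGALRM on the first attempt, succeeds afterwards."""
    if attempt == 0:
        signal.signal(signal.SIGALRM, signal.SIG_DFL)
        os.kill(os.getpid(), signal.SIGALRM)
        time.sleep(5)  # not reached in practice; the signal ends the process


if __name__ == "__main__":
    for status in run_supervised(flaky_worker):
        if os.WIFSIGNALED(status):
            print("child killed by signal", os.WTERMSIG(status))
        else:
            print("child exited with code", os.WEXITSTATUS(status))
```

The parent does nothing but wait, so it stays small and is unlikely to hang itself. Inspecting the wait status also records whether the child was killed by the watchdog signal or exited on its own.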
