Solaris process hangs on exit
On Solaris 9 and 10, both x86 and Sparc, we have a process that is hanging during exit:
fe0b5994 lwp_park (0, 0, 0)
fe0b206c slow_lock (ff388908, fe080400, 0, 0, 98, fe0abe00) + 58
ff376aa8 __deregister_frame_info_bases (2a518, 1, 0, 2daf0, 0, ff376be4) + 4c
00014858 ???????? (0, ff000000, 0, 0, 0, 0)
00019920 _fini (0, 0, 210fc, fe21cbf0, 5, fe25897c) + 4
fe21cbf0 _exithandle (fee66a4c, 0, 40, 0, 0, fe2bc000) + 70
fe2a0564 exit (0, fdefb47c, 40, fdefb8ff, 2c, 0) + 24
fee66a4c (our code) (4e280, 5ab5c, 5aa60, 2ed0, 81010100, fdefb988) + 244
Our code is compiled on the Solaris 9 machine, using gcc 3.4.6.
The process in question is a single-threaded child of a multi-threaded parent, forked but not exec'ed.
Has anyone seen anything similar?
Do you know if a more recent version of gcc would fix the problem?
Comments (2)
You could try calling _exit() to exit the child process, rather than exit(). exit() is a library function that performs various forms of library cleanup before exiting; for example, it flushes stdio buffers. _exit() is the actual system call that terminates the process. Even in single-threaded programs, you normally use _exit() inside forked children to prevent library cleanup from happening twice.
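As a rough illustration of the _exit() suggestion, here is a minimal sketch of a child that terminates without going through the exit-time handlers (the _exithandle and _fini path visible in the stack trace). The helper name fork_child_fast_exit is hypothetical and not taken from the original code; it only shows the shape of the pattern.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that leaves via _exit() and return
 * the child's exit status to the caller, or -1 on error. */
int fork_child_fast_exit(int status)
{
    int wstatus;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0) {
        /* Child: do the real work here, then terminate directly.
         * _exit() does not run atexit handlers or flush stdio, so it
         * never tries to take library locks inherited from the parent. */
        _exit(status);
    }

    /* Parent: reap the child and report how it ended. */
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Calling fork_child_fast_exit(0) from the parent returns the child's exit code once the child is gone. Note that because _exit() skips the stdio flush, any output the child has buffered must be flushed explicitly before the call.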
This is exactly why you should always exec after fork in a multi-threaded (MT) process: you don't know which locks other threads held in the parent at the moment of the fork, or when you may need one of those locks. Here you need one at exit, but you can't acquire it because the thread that locked it doesn't exist in the child.
A newer version of GCC is somewhat unlikely to help you. Even if it does help, it's only a matter of time before you hit another lock like this.
Either fork before creating the first thread, or exec immediately after fork. These are really the only sensible choices.
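The "exec immediately after fork" option can be sketched as follows. This is only an illustration under assumptions: spawn_and_wait is a hypothetical helper name, and the program path and arguments are supplied by the caller. Between fork() and execv() the child calls nothing that could need a lock held by a thread that no longer exists.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec the given program right away in the
 * child, and return its exit status in the parent (-1 on error). */
int spawn_and_wait(const char *path, char *const argv[])
{
    int wstatus;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0) {
        /* Child: replace the process image immediately. The new image
         * starts with fresh library state and no inherited locks. */
        execv(path, argv);
        /* Only reached if execv failed; leave without library cleanup. */
        _exit(127);
    }

    /* Parent: wait for the child and decode its status. */
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

For example, passing "/bin/sh" with the arguments "sh", "-c", "exit 3" makes the helper return 3. The work the child used to do in-process would move into the exec'ed program.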