Debugging a crash when a library is opened via dlopen on OS X

Posted on 2024-09-13 04:31:58

I have a problem with a C++ application I've developed which uses dlopen to load user-developed libraries. The application has been used by a variety of people on a variety of Linux distros and versions of OS X over the last couple of years, so I'm assuming that my usage of dlopen is OK and so is the code that depends on it (yeah, this is hubris, so I'll report back when it fails). The problem I have now is that a user has developed a library which does not load on my system (OS X 10.6.4). When the application tries to load it, there is a freeze and then a crash. The thread that crashes looks like this in the crash report:

Thread 5 Crashed:
0   com.apple.CoreFoundation        0x00007fff80fa6110 __CFInitialize + 1808
1   dyld                            0x00007fff5fc0d5ce ImageLoaderMachO::doImageInit(ImageLoader::LinkContext const&) + 138
2   dyld                            0x00007fff5fc0d607 ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 27
3   dyld                            0x00007fff5fc0bcec ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) + 236
4   dyld                            0x00007fff5fc0bc9d ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) + 157
5   dyld                            0x00007fff5fc0bc9d ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) + 157
6   dyld                            0x00007fff5fc0bc9d ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) + 157
7   dyld                            0x00007fff5fc0bc9d ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) + 157
8   dyld                            0x00007fff5fc0bc9d ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) + 157
9   dyld                            0x00007fff5fc0bda6 ImageLoader::runInitializers(ImageLoader::LinkContext const&) + 58
10  dyld                            0x00007fff5fc08fbb dlopen + 573
11  libSystem.B.dylib               0x00007fff816492c0 dlopen + 61
12  cast-server-c++                 0x0000000100007819 cast::loadLibrary(std::string const&) + 96 (ComponentCreator.cpp:43)
13  cast-server-c++                 0x00000001000079c7 cast::createComponentCreator(std::string const&) + 24 (ComponentCreator.cpp:87)
14  cast-server-c++                 0x00000001000089c5 cast::CASTComponentFactory::createBase(std::string const&, std::string const&, Ice::Current const&) + 197 (CASTComponentFactory.cpp:27)
15  cast-server-c++                 0x00000001000090e9 cast::CASTComponentFactory::newManagedComponent(std::string const&, std::string const&, bool, Ice::Current const&) + 73 (CASTComponentFactory.cpp:62)
16  libCDL.dylib                    0x00000001009ceb6c cast::interfaces::ComponentFactory::___newManagedComponent(IceInternal::Incoming&, Ice::Current const&) + 218 (CDL.cpp:14904)
17  libCDL.dylib                    0x00000001009cf1d0 cast::interfaces::ComponentFactory::__dispatch(IceInternal::Incoming&, Ice::Current const&) + 572 (CDL.cpp:15057)
18  libIce.3.3.1.dylib              0x00000001000c9078 IceInternal::Incoming::invoke(IceInternal::Handle<IceInternal::ServantManager> const&) + 2004 (Incoming.cpp:484)
19  libIce.3.3.1.dylib              0x0000000100091a5d Ice::ConnectionI::invokeAll(IceInternal::BasicStream&, int, int, unsigned char, IceInternal::Handle<IceInternal::ServantManager> const&, IceInternal::Handle<Ice::ObjectAdapter> const&) + 367 (ConnectionI.cpp:2436)
20  libIce.3.3.1.dylib              0x000000010009bb40 Ice::ConnectionI::message(IceInternal::BasicStream&, IceInternal::Handle<IceInternal::ThreadPool> const&) + 416 (ConnectionI.cpp:1105)
21  libIce.3.3.1.dylib              0x00000001001a9bbc IceInternal::ThreadPool::run() + 3470 (ThreadPool.cpp:523)
22  libIce.3.3.1.dylib              0x00000001001aa4ec IceInternal::ThreadPool::EventHandlerThread::run() + 152 (ThreadPool.cpp:782)
23  libIceUtil.3.3.1.dylib          0x00000001006eb1e9 startHook + 128 (Thread.cpp:375)
24  libSystem.B.dylib               0x00007fff8167c456 _pthread_start + 331
25  libSystem.B.dylib               0x00007fff8167c309 thread_start + 13

(I can post the full log if needed, but it exceeds the body text limit if I include it in my post)

In the terminal where I'm running the executable, the crash produces no output except for the notification that the script running the executable has trapped a signal.

My question is: how do I get more information on what might be causing this crash? I'm also happy if someone can suggest possible solutions, but to start with I'd at least like to know how to generate more information about what is actually wrong when the crash happens.

If I run otool on the library which is initially being opened by dlopen, everything looks fine (no missing links, symbols etc.). My main suspicion is that the particular combination of libraries which the loaded library is linked against is somehow causing this crash. Other libraries which use different subsets of these linked-against libraries can be loaded. For the record, the libraries include X11, ZeroC's Ice, Player/Stage and OpenCV (with the latter two compiled manually with dependencies installed using MacPorts). It seems to be the inclusion of OpenCV which causes the problem, as other libraries which link to all of these except OpenCV can be loaded with no problems. These are my suspicions, but I currently lack the know-how to investigate further.

Thanks! Nick

UPDATE: Thanks to Kaelin's answer (the DYLD_PRINT_* options, which I wasn't previously aware of) I was able to at least confirm that nothing completely obvious was happening. Using the debug information I was able to narrow the problem down to one particular library which was causing the crash. It turned out that this library (libdc1394, linked into my app via libhighgui in OpenCV) wasn't correctly linked against CoreServices, and this was causing the crash. For some reason the error was then hidden by other things, causing the ultimate crash. For info on the libdc1394 problem, look here. Unfortunately I couldn't make a clean fix that I can report here, so I just managed to get a version of the app running that doesn't link to the dodgy library (by turning off libdc1394 in the OpenCV compilation).

Comments (2)

§普罗旺斯的薰衣草 2024-09-20 04:31:58

After some further problems and some further Googling I ultimately found the real cause of my problem.

One cannot call dlopen on a library linked with CoreFoundation from a (sub) thread if CoreFoundation wasn't initialized in the first place. __CFInitialize is called, apparently checks whether the current thread is the main thread and, if it is not, crashes with a SIGTRAP.

http://openradar.appspot.com/7209349
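
If that description is right, one possible workaround is to make sure the dlopen call itself happens on the main thread. The following is only a minimal sketch of that idea, not code from the application in the question: `MainThreadQueue`, `post`, `run` and `stop` are hypothetical names, and it assumes a C++11 compiler and that the main thread is free to sit in a loop servicing requests.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical helper: worker threads hand jobs to the main thread,
// which executes them one by one inside run().
class MainThreadQueue {
public:
    // Called from any thread. Queues the job and returns a future
    // that becomes ready once the main thread has executed it.
    std::future<void*> post(std::function<void*()> job) {
        std::packaged_task<void*()> task(std::move(job));
        std::future<void*> result = task.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
        return result;
    }

    // Called from the main thread. Runs queued jobs until stop()
    // has been called and the queue is empty.
    void run() {
        for (;;) {
            std::packaged_task<void*()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopped_ || !tasks_.empty(); });
                if (tasks_.empty()) {
                    return;  // stop() was called and nothing is left to do
                }
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // executes on the thread that called run()
        }
    }

    // Called from any thread to let run() return once the queue drains.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopped_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::packaged_task<void*()>> tasks_;
    bool stopped_ = false;
};
```

A worker thread would then wrap its load in a job, for example `queue.post([path] { return dlopen(path.c_str(), RTLD_NOW); }).get()`, while the main thread sits in `queue.run()`. Whether this is enough in a given application depends on what else touches CoreFoundation first, so treat it as a starting point rather than a confirmed fix.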

放手` 2024-09-20 04:31:58

dyld is running the initializers in the shared library (think static initializers in C++), and one of them is causing CoreFoundation framework's __CFInitialize function to be run. [Is it possible this is the first thing touching CoreFoundation?] And for whatever reason, __CFInitialize is not happy. This could be some kind of missing dependency. Or it could be the heap is corrupted. Or it could be a latent bug of some kind in CoreFoundation framework.
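
To make the "static initializers" comparison concrete, here is a tiny sketch of the kind of code that runs at load time. The names `LoadTimeHook`, `g_hook` and `g_initCount` are made up for illustration; nothing here is taken from the libraries in the question.

```cpp
#include <cstdio>

// Counts how many times the initializer below has run.
static int g_initCount = 0;

// A type whose constructor has a visible side effect.
struct LoadTimeHook {
    LoadTimeHook() {
        ++g_initCount;
        std::puts("initializer ran");
    }
};

// A global object: its constructor runs when the image containing it is
// initialized, i.e. before main() for an executable, or during dlopen()
// for a shared library, on whichever thread called dlopen().
static LoadTimeHook g_hook;
```

If this object lived in a shared library, its constructor would show up under frames like `ImageLoaderMachO::doInitialization` in a backtrace such as the one in the question.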

I would suggest ruling out the first two possibilities by a) running with all of the DYLD_PRINT_* environment variables set [see man dyld] and b) running under MallocDebug. If neither of those sheds any light, you're probably left with writing a radar for the CoreFoundation folks to look at.
