How to diagnose an access violation on application exit
I have an application that I'm trying to debug a crash in. However, it is difficult to detect the problem for a few reasons:
- The crash happens at shutdown, meaning the offending code isn't on the stack
- The crash only happens in release builds, meaning symbols aren't available
By crash, I mean the following exception:
0xC0000005: Access violation reading location 0x00000000.
What strategy would you use to diagnose this problem?
What I have done so far is remove as much code as possible from my program, until I was left with the bare minimum that still causes the crash. The crash seems to happen in code that is statically linked into the project, so that doesn't help, either.
Comments (3)
You can generate symbol files even for a release build. Do that, run your program, attach the debugger, close the program, and look at the cause of the crash in the debugger.
You seem to have something reading a null pointer - never good.
I'm not sure what platform you are on. Under Linux, you could consider using valgrind.
What is different about your release builds from your debug builds, apart from the presence or absence of the debug information?
Can you build the statically linked code with debugging information in it? Can you obtain a debug build of the statically linked code?
The strategy I would use is exactly what you've done. Remove as much code as possible until the problem disappears, then add that last bit back in and debug it.
However, it may not be your code that's at fault. One thing to watch out for - we found this problem on AIX and, even though you're running Windows, it may be similar.
We had a third party library which dynamically loaded another shared library which, in its initialization routine, set up an atexit function to be called when the process exits.
However, as our application loads and unloads these shared libraries, by the time the process exited, the shared library's atexit function was no longer in memory and we dumped core.
This shows up as an access violation after returning from main() so, if that's what's happening to you, it's almost certainly the same sort of thing. The C RTL startup code will walk the atexit list and call each of its functions, no matter what you've done with them.
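To see the mechanism in isolation, here is a minimal sketch using the standard `std::atexit`. The handler names and `register_handlers` are made up for illustration; they stand in for whatever cleanup code your libraries register.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical handlers standing in for cleanup code registered by libraries.
static void first_registered() { std::puts("atexit: first_registered"); }
static void second_registered() { std::puts("atexit: second_registered"); }

// Registers both handlers with the C runtime.
// Returns 0 if both registrations succeeded, nonzero otherwise.
int register_handlers() {
    int rc = std::atexit(first_registered);
    rc |= std::atexit(second_registered);
    return rc;
}
```

If you call `register_handlers()` from `main()`, neither handler runs until `main()` has returned, and they then run in reverse order of registration (`second_registered` first). Any crash inside one of them will therefore show up with `main()` no longer on the stack.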
Of course, if it's crashing before main() exits, then this is a moot point.
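To make the load/unload scenario described above more concrete, here is a toy model of an exit-handler table. It is not the real C runtime's atexit list, and `ExitTable`, the module names, and the handlers are all hypothetical. It only illustrates why a handler has to be dropped from the list before the code behind it goes away.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: this is NOT the real C runtime's atexit list.
struct ExitEntry {
    std::string module;  // hypothetical name of the library owning the handler
    void (*handler)();   // function to call at process exit
};

class ExitTable {
public:
    void add(const std::string& module, void (*handler)()) {
        entries_.push_back(ExitEntry{module, handler});
    }

    // What a library would need to do before being unloaded:
    // drop every handler whose code is about to disappear.
    void remove_module(const std::string& module) {
        entries_.erase(
            std::remove_if(entries_.begin(), entries_.end(),
                           [&module](const ExitEntry& e) { return e.module == module; }),
            entries_.end());
    }

    // Like the runtime, call whatever is still listed, newest first,
    // with no check that the handler's code is still mapped.
    void run_all() {
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it) {
            it->handler();
        }
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::vector<ExitEntry> entries_;
};

// Records which hypothetical handlers ran, in order.
std::vector<std::string> g_ran;
void app_cleanup() { g_ran.push_back("app_cleanup"); }
void plugin_cleanup() { g_ran.push_back("plugin_cleanup"); }

// Simulates: app and plugin register handlers, the plugin is unloaded
// (and unregisters itself), then the process exits.
std::vector<std::string> simulate_exit_after_unload() {
    g_ran.clear();
    ExitTable table;
    table.add("app", app_cleanup);
    table.add("plugin", plugin_cleanup);
    // In the real scenario, skipping this step leaves a pointer into unloaded code.
    table.remove_module("plugin");
    table.run_all();
    return g_ran;
}
```

In this model only `app_cleanup` runs. In the situation we hit, the equivalent of `remove_module` never happened, so the runtime jumped to an address that no longer held the library's code.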
One thing you could consider (and we've actually done this on one occasion after a cost/benefit analysis of tracking down and fixing a particularly thorny bug): send out the debug release as your product. If it's not crashing, that may be a quick fix to get the product out there while you work on a more acceptable solution at your leisure.