符号如何影响调用堆栈遍历?
我正在尝试使用 Windbg 分析故障转储,并且根据加载的符号,我会得到不稳定的故障转储。我的简单理解是,这些符号只是帮助指向堆栈所指的内容,但堆栈本身并没有改变。这显然是错误的,但现在我不知道我到底在看什么。
这是一个加载了所有符号的调用堆栈:
0:000> kn
# ChildEBP RetAddr
00 0012e120 7d61f60f ntdll!ZwGetContextThread+0x12
01 0012e130 000f0005 ntdll!RtlFreeHeap+0x711
WARNING: Frame IP not in any known module. Following frames may be wrong.
02 0012e1d0 6d5b5b20 0xf0005
03 0012e314 6d5b407f dbghelp!Win32LiveSystemProvider::OpenMapping+0x228
04 0012e464 0012e488 dbghelp!GenAllocateModuleObject+0x1ad
05 0012e4e4 6d5b588e 0x12e488
06 0012e69c 7d4d132f dbghelp!Win32LiveSystemProvider::GetOsCsdString+0x4d
07 0012e6b8 6d5b5fd2 kernel32!ReadProcessMemory+0x1b
08 0012e6e0 6d5b604e dbghelp!Win32LiveSystemProvider::ReadVirtual+0x3d
09 0012e700 6d5b2f3d dbghelp!Win32LiveSystemProvider::ReadAllVirtual+0x1d
0a 0012e728 6d5b304f dbghelp!WriteMemoryFromProcess+0x35
0b 0012e7ac 6d5b345b dbghelp!WriteThreadList+0xc1
0c 0012e7cc 6d5b367b dbghelp!WriteDumpData+0x83
0d 0012e90c 6d5b3778 dbghelp!MiniDumpProvideDump+0x174
*** WARNING: Unable to verify checksum for ERRHNDLR.dll
0e 0012e96c 0091235d dbghelp!MiniDumpWriteDump+0xc8
*** WARNING: Unable to verify timestamp for msvcr90.dll
0f 0012e9fc 7857dcaa ERRHNDLR!ExceptionTranslator+0x25d [c:\redacted\errorhandler.cpp @ 230]
10 0012ea48 7857d4f5 msvcr90!_CallSETranslator+0xa5
11 0012ea7c 7857d8c0 msvcr90!__CxxExceptionFilter+0x217
12 0012eadc 7857d9dd msvcr90!__CxxExceptionFilter+0x5e2
13 0012eb10 7857db94 msvcr90!__InternalCxxFrameHandler+0xdb
*** WARNING: Unable to verify checksum for PROGRAM.exe
14 0012eb84 004f1c9e msvcr90!__CxxFrameHandler3+0x26
15 0012eba8 004f1c9e PROGRAM!__sse2_available_init+0x1269c
16 0012ec0c 00130000 PROGRAM!__sse2_available_init+0x1269c
17 00000000 00000000 0x130000
我可以看出发生了一些不好的事情,但它似乎是在应用程序启动后就发生的,但事实并非如此。
这是相同的调用堆栈,但没有加载 msvcr90 的符号
0:000> kn
# ChildEBP RetAddr
00 0012e120 7d61f60f ntdll!ZwGetContextThread+0x12
01 0012e130 000f0005 ntdll!RtlFreeHeap+0x711
WARNING: Frame IP not in any known module. Following frames may be wrong.
02 0012e1d0 6d5b5b20 0xf0005
03 0012e314 6d5b407f dbghelp!Win32LiveSystemProvider::OpenMapping+0x228
04 0012e464 0012e488 dbghelp!GenAllocateModuleObject+0x1ad
05 0012e4e4 6d5b588e 0x12e488
06 0012e69c 7d4d132f dbghelp!Win32LiveSystemProvider::GetOsCsdString+0x4d
07 0012e6b8 6d5b5fd2 kernel32!ReadProcessMemory+0x1b
08 0012e6e0 6d5b604e dbghelp!Win32LiveSystemProvider::ReadVirtual+0x3d
09 0012e700 6d5b2f3d dbghelp!Win32LiveSystemProvider::ReadAllVirtual+0x1d
0a 0012e728 6d5b304f dbghelp!WriteMemoryFromProcess+0x35
0b 0012e7ac 6d5b345b dbghelp!WriteThreadList+0xc1
0c 0012e7cc 6d5b367b dbghelp!WriteDumpData+0x83
0d 0012e90c 6d5b3778 dbghelp!MiniDumpProvideDump+0x174
*** WARNING: Unable to verify checksum for ERRHNDLR.dll
0e 0012e96c 0091235d dbghelp!MiniDumpWriteDump+0xc8
*** WARNING: Unable to verify timestamp for msvcr90.dll
*** ERROR: Module load completed but symbols could not be loaded for msvcr90.dll
0f 0012e9fc 7857dcaa ERRHNDLR!ExceptionTranslator+0x25d [c:redacted\errorhandler.cpp @ 230]
10 0012ea48 7857d4f5 msvcr90+0x5dcaa
11 0012ea7c 7857d8c0 msvcr90+0x5d4f5
12 0012eadc 7857d9dd msvcr90+0x5d8c0
13 0012eb10 7857db94 msvcr90+0x5d9dd
14 0012eb4c 7d61ec4a msvcr90+0x5db94
15 0012eb70 7d61ec1b ntdll!ExecuteHandler2+0x26
16 0012ec18 7d61ea56 ntdll!ExecuteHandler+0x24
17 0012ec18 026fe31a ntdll!KiUserExceptionDispatcher+0xe
*** WARNING: Unable to verify checksum for Storage.dll
18 0012ef4c 026fddd0 Storage!CList<Property *,Property *>::AddTail+0xa [c:\program files (x86)\microsoft visual studio 9.0\vc\atlmfc\include\afxtempl.h @ 1003]
*** WARNING: Unable to verify checksum for Storage2.dll
19 0012ef54 0274f5ec Storage!PropertyList::Add+0x10 [c:\redacted\propertylist.cpp @ 236]
1a 0012ef5c 0012f280 Storage2!Thing::Process+0x12c [c:\redacted\thing.cpp @ 345]
1b 0012ef60 0fe8be80 0x12f280
*** WARNING: Unable to verify checksum for PROGRAM.exe
1c 0012f368 0043d9a1 0xfe8be80
1d 0012f3b0 004f1c9e PROGRAM!View::SelectObject+0x151 [c:\redacted\view.cpp @ 2724]
1e 0012f3d4 004ea73b PROGRAM!__sse2_available_init+0x1269c
*** WARNING: Unable to verify checksum for DLL1.dll
1f 0012f450 02847893 PROGRAM!__sse2_available_init+0xb139
*** WARNING: Unable to verify checksum for DLL2.dll
20 0012f4ac 02c06398 DLL1!_RawDllMainProxy+0x1ed5
21 0012f534 02c06b86 DLL2!__sse2_available_init+0x40eb
22 0012f5a8 02c03fdd DLL2!__sse2_available_init+0x48d9
23 0012f5e0 02c052f4 DLL2!__sse2_available_init+0x1d30
24 0012f664 0283c231 DLL2!__sse2_available_init+0x3047
25 0012f6b4 028475aa DLL1!Logic::Send+0x121 [c:\redacted\logic.cpp @ 438]
26 0012f750 7d94757c DLL1!_RawDllMainProxy+0x1bec
27 0012f7a4 00000000 user32!UserCallWinProcCheckWow+0x128
嘿,这实际上可能很有用!当我使用它来调试故障转储时,它也更接近 Visual Studio 中显示的内容。但是VS的调用堆栈在“Storage2!Thing::Process”下面完全不同,这表明不相关的函数以某种方式存在于调用堆栈中,这就是我尝试windbg的原因。
那么,我错过了什么?为什么卸载符号会揭示可能更有用的调用堆栈?
I'm trying to analyze a crash dump with windbg, and I'm getting inconstant crash dumps depending on what symbols are loaded. My simple understanding is that the symbols only help point to what the stack is referring to, but the stack itself is unchanged. That's obviously wrong, but now I don't know what the heck I'm looking at.
Heres a call stack with all symbols loaded:
0:000> kn
# ChildEBP RetAddr
00 0012e120 7d61f60f ntdll!ZwGetContextThread+0x12
01 0012e130 000f0005 ntdll!RtlFreeHeap+0x711
WARNING: Frame IP not in any known module. Following frames may be wrong.
02 0012e1d0 6d5b5b20 0xf0005
03 0012e314 6d5b407f dbghelp!Win32LiveSystemProvider::OpenMapping+0x228
04 0012e464 0012e488 dbghelp!GenAllocateModuleObject+0x1ad
05 0012e4e4 6d5b588e 0x12e488
06 0012e69c 7d4d132f dbghelp!Win32LiveSystemProvider::GetOsCsdString+0x4d
07 0012e6b8 6d5b5fd2 kernel32!ReadProcessMemory+0x1b
08 0012e6e0 6d5b604e dbghelp!Win32LiveSystemProvider::ReadVirtual+0x3d
09 0012e700 6d5b2f3d dbghelp!Win32LiveSystemProvider::ReadAllVirtual+0x1d
0a 0012e728 6d5b304f dbghelp!WriteMemoryFromProcess+0x35
0b 0012e7ac 6d5b345b dbghelp!WriteThreadList+0xc1
0c 0012e7cc 6d5b367b dbghelp!WriteDumpData+0x83
0d 0012e90c 6d5b3778 dbghelp!MiniDumpProvideDump+0x174
*** WARNING: Unable to verify checksum for ERRHNDLR.dll
0e 0012e96c 0091235d dbghelp!MiniDumpWriteDump+0xc8
*** WARNING: Unable to verify timestamp for msvcr90.dll
0f 0012e9fc 7857dcaa ERRHNDLR!ExceptionTranslator+0x25d [c:\redacted\errorhandler.cpp @ 230]
10 0012ea48 7857d4f5 msvcr90!_CallSETranslator+0xa5
11 0012ea7c 7857d8c0 msvcr90!__CxxExceptionFilter+0x217
12 0012eadc 7857d9dd msvcr90!__CxxExceptionFilter+0x5e2
13 0012eb10 7857db94 msvcr90!__InternalCxxFrameHandler+0xdb
*** WARNING: Unable to verify checksum for PROGRAM.exe
14 0012eb84 004f1c9e msvcr90!__CxxFrameHandler3+0x26
15 0012eba8 004f1c9e PROGRAM!__sse2_available_init+0x1269c
16 0012ec0c 00130000 PROGRAM!__sse2_available_init+0x1269c
17 00000000 00000000 0x130000
I can tell that something bad happened, but it appears to have happened as soon as the app started, which isn't the case.
Heres the same call stack but without the symbols for msvcr90 loaded
0:000> kn
# ChildEBP RetAddr
00 0012e120 7d61f60f ntdll!ZwGetContextThread+0x12
01 0012e130 000f0005 ntdll!RtlFreeHeap+0x711
WARNING: Frame IP not in any known module. Following frames may be wrong.
02 0012e1d0 6d5b5b20 0xf0005
03 0012e314 6d5b407f dbghelp!Win32LiveSystemProvider::OpenMapping+0x228
04 0012e464 0012e488 dbghelp!GenAllocateModuleObject+0x1ad
05 0012e4e4 6d5b588e 0x12e488
06 0012e69c 7d4d132f dbghelp!Win32LiveSystemProvider::GetOsCsdString+0x4d
07 0012e6b8 6d5b5fd2 kernel32!ReadProcessMemory+0x1b
08 0012e6e0 6d5b604e dbghelp!Win32LiveSystemProvider::ReadVirtual+0x3d
09 0012e700 6d5b2f3d dbghelp!Win32LiveSystemProvider::ReadAllVirtual+0x1d
0a 0012e728 6d5b304f dbghelp!WriteMemoryFromProcess+0x35
0b 0012e7ac 6d5b345b dbghelp!WriteThreadList+0xc1
0c 0012e7cc 6d5b367b dbghelp!WriteDumpData+0x83
0d 0012e90c 6d5b3778 dbghelp!MiniDumpProvideDump+0x174
*** WARNING: Unable to verify checksum for ERRHNDLR.dll
0e 0012e96c 0091235d dbghelp!MiniDumpWriteDump+0xc8
*** WARNING: Unable to verify timestamp for msvcr90.dll
*** ERROR: Module load completed but symbols could not be loaded for msvcr90.dll
0f 0012e9fc 7857dcaa ERRHNDLR!ExceptionTranslator+0x25d [c:redacted\errorhandler.cpp @ 230]
10 0012ea48 7857d4f5 msvcr90+0x5dcaa
11 0012ea7c 7857d8c0 msvcr90+0x5d4f5
12 0012eadc 7857d9dd msvcr90+0x5d8c0
13 0012eb10 7857db94 msvcr90+0x5d9dd
14 0012eb4c 7d61ec4a msvcr90+0x5db94
15 0012eb70 7d61ec1b ntdll!ExecuteHandler2+0x26
16 0012ec18 7d61ea56 ntdll!ExecuteHandler+0x24
17 0012ec18 026fe31a ntdll!KiUserExceptionDispatcher+0xe
*** WARNING: Unable to verify checksum for Storage.dll
18 0012ef4c 026fddd0 Storage!CList<Property *,Property *>::AddTail+0xa [c:\program files (x86)\microsoft visual studio 9.0\vc\atlmfc\include\afxtempl.h @ 1003]
*** WARNING: Unable to verify checksum for Storage2.dll
19 0012ef54 0274f5ec Storage!PropertyList::Add+0x10 [c:\redacted\propertylist.cpp @ 236]
1a 0012ef5c 0012f280 Storage2!Thing::Process+0x12c [c:\redacted\thing.cpp @ 345]
1b 0012ef60 0fe8be80 0x12f280
*** WARNING: Unable to verify checksum for PROGRAM.exe
1c 0012f368 0043d9a1 0xfe8be80
1d 0012f3b0 004f1c9e PROGRAM!View::SelectObject+0x151 [c:\redacted\view.cpp @ 2724]
1e 0012f3d4 004ea73b PROGRAM!__sse2_available_init+0x1269c
*** WARNING: Unable to verify checksum for DLL1.dll
1f 0012f450 02847893 PROGRAM!__sse2_available_init+0xb139
*** WARNING: Unable to verify checksum for DLL2.dll
20 0012f4ac 02c06398 DLL1!_RawDllMainProxy+0x1ed5
21 0012f534 02c06b86 DLL2!__sse2_available_init+0x40eb
22 0012f5a8 02c03fdd DLL2!__sse2_available_init+0x48d9
23 0012f5e0 02c052f4 DLL2!__sse2_available_init+0x1d30
24 0012f664 0283c231 DLL2!__sse2_available_init+0x3047
25 0012f6b4 028475aa DLL1!Logic::Send+0x121 [c:\redacted\logic.cpp @ 438]
26 0012f750 7d94757c DLL1!_RawDllMainProxy+0x1bec
27 0012f7a4 00000000 user32!UserCallWinProcCheckWow+0x128
Hey, that may actually be useful! It's also closer to what is displayed in Visual Studio when I use it to debug the crash dump. But VS's call stack is completely different below "Storage2!Thing::Process", suggesting that unrelated functions are in the call stack somehow, which is why I'm trying windbg.
So, what am I missing? Why should unloading symbols reveal a potentially more useful call stack?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
这是一个很长的答案,但简而言之:在 x86 PDB 上包含 FPO 信息,它允许调试器可靠地展开调用堆栈。这在 FPO 帧的情况下是必需的,其中 EBP 不用作帧指针。在没有 PDB 的情况下,调试器假定每个帧都是 EBP 帧,并且将简单地遍历 EBP 链,直到到达末尾(即不可读的 EBP 值)。
有关 FPO 和 EBP 框架的更多详细信息,这里有一篇很好的文章:
http://www.nynaeve.net /?p=91
现在,解决您的问题。您显示的第一个调用堆栈是绝对正确的。某些模块引发了异常,因此操作系统开始展开调用帧以寻找异常处理程序。不幸的是,没有人处理该错误,因此默认的异常处理程序运行,这导致应用程序崩溃。由于有问题的代码的调用堆栈已展开,因此除了堆栈上操作系统提供的组件之外,您看不到任何内容。
在第二种情况下,没有符号,因此操作系统将每个调用帧视为 EBP。在这种情况下,您很“幸运”,捡到了一个垃圾 EBP,它开始展开旧的调用堆栈。虽然在这种情况下它指出了正确的事情,但这是一种转移注意力的行为,可能会导致您使用无效数据开始分析并浪费大量时间(已经做到了!)。
在发生异常的情况下,.excr 命令始终是正确的做法。这是可行的,因为在展开调用帧寻找异常处理程序之前,操作系统会存储异常发生时处理器的寄存器状态。 .excr 命令使用该状态让您及时回到检测到不良状态的那一刻,而不是在操作系统尝试处理它之后。
-斯科特
It's a long answer, but in short: On the x86 PDBs contain FPO information, which allows the debugger to reliably unwind a call stack. This is required in the case of FPO frames, where EBP is not used as a frame pointer. In the absence of PDBs, the debugger assumes that every frame is an EBP frame and will simply walk the EBP chain until it reaches the end (i.e. an unreadable EBP value).
For more details on FPO and EBP frames, there's a good article here:
http://www.nynaeve.net/?p=91
Now, to get to your issue. The first call stack that you showed is absolutely correct. Some module threw an exception, so the O/S began unwinding call frames looking for an exception handler. Unfortunately, no one handled the error so the default exception handler ran, which proceeded to crash the application. Because the call stack of the offending code was unwound, you don't see anything but the O/S supplied components on the stack.
In the second case, you have no symbols and so the O/S treats every call frame as if it's EBP. In this case, you got "lucky" and picked up a garbage EBP that started to unwind an old call stack. While it pointed off to the right thing in this case, this is the sort of red herring that can cause you to start your analysis with invalid data and waste a LOT of time (been there, done that!).
The .excr command is always the correct thing to do in the case of an exception. This works because the O/S stores the register state of the processor at the time of the exception before unwinding call frames looking for an exception handler. The .excr command uses that state to bring you back in time to the moment where the bad state was detected, instead of after the fact while the O/S was trying to handle it.
-scott