How to debug: w3wp.exe process terminated due to stack overflow (works on one machine but not on another)
The problem
I have an ASP.NET 4.0 application that crashes with a stack overflow on one computer, but not another. It runs fine on my development environment. When I move the site to the production server, it throws a stack overflow exception (seen in event log) and the w3wp.exe worker process dies and is replaced with another.
What I've tried so far
For reference, I used the debug diagnostic tool to try to determine what piece of code is causing the overflow, but I'm not sure how to interpret the output of it. The output is included below.
How might an ASP.NET website cause a stack overflow on one machine but not on another?
Experienced leads are appreciated. I'll post the resulting solution below the answer that leads me to it.
Debug Output
Application: w3wp.exe Framework Version: v4.0.30319 Description: The process was terminated due to stack overflow.
In w3wp__PID__5112__Date__02_18_2011__Time_09_07_31PM__671__First Chance Stack Overflow.dmp the assembly instruction at nlssorting!SortGetSortKey+25 in C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319\nlssorting.dll from Microsoft Corporation has caused a stack overflow exception (0xC00000FD) when trying to write to memory location 0x01d12fc0 on thread 16
Please follow up with the vendor Microsoft Corporation for C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319\nlssorting.dll
Information:DebugDiag determined that this dump file (w3wp__PID__5112__Date__02_18_2011__Time_09_07_31PM__671__First Chance Stack Overflow.dmp) is a crash dump and did not perform any hang analysis. If you wish to enable combined crash and hang analysis for crash dumps, edit the IISAnalysis.asp script (located in the DebugDiag\Scripts folder) and set the g_DoCombinedAnalysis constant to True.
Entry point clr!ThreadpoolMgr::intermediateThreadProc
Create time 2/18/2011 9:07:10 PM
Function Arg 1 Arg 2 Arg 3 Source
nlssorting!SortGetSortKey+25 01115a98 00000001 0651a88c
clr!SortVersioning::SortDllGetSortKey+3b 01115a98 08000001 0651a88c
clr!COMNlsInfo::InternalGetGlobalizedHashCode+f0 01115a98 05e90268 0651a88c
mscorlib_ni+2becff 08000001 0000000f 0651a884
mscorlib_ni+255c10 00000001 09ed57bc 01d14348
mscorlib_ni+255bc4 79b29e90 01d14350 79b39ab0
mscorlib_ni+2a9eb8 01d14364 79b39a53 000dbb78
mscorlib_ni+2b9ab0 000dbb78 09ed57bc 01ff39f4
mscorlib_ni+2b9a53 01d14398 01d1439c 00000011
mscorlib_ni+2b9948 0651a884 01d143ec 7a97bf5d
System_ni+15bd65 6785b114 00000000 09ed5748
System_ni+15bf5d 1c5ab292 1b3c01dc 05ebc494
System_Web_ni+6fb165
***These lines below are repeated many times in the log, so I just posted one block of them
1c5a928c 00000000 0627e880 000192ba
1c5a9dce 00000000 0627e7c4 00000000
1c5a93ce 1b3c01dc 05ebc494 1b3c01dc
1c5a92e2
.....(repeated sequence from above)
System_Web_ni+16779c 1b338528 00000003 0629b7a0
System_Web_ni+1677fb 00000000 00000017 0629ac3c
System_Web_ni+167843 00000000 00000003 0629ab78
System_Web_ni+167843 00000000 00000005 0629963c
System_Web_ni+167843 00000000 00000001 0627e290
System_Web_ni+167843 00000000 0627e290 1a813508
System_Web_ni+167843 01d4f21c 79141c49 79141c5c
System_Web_ni+1651c0 00000001 0627e290 00000000
System_Web_ni+16478d 00000001 01ea7730 01ea76dc
System_Web_ni+1646af 0627e290 01d4f4c0 672c43f2
System_Web_ni+164646 00000000 06273aa8 0627e290
System_Web_ni+1643f2 672d1b65 06273aa8 00000000
1c5a41b5 00000000 01d4f520 06273aa8
System_Web_ni+18610c 01d4f55c 0df2a42c 06273f14
System_Web_ni+19c0fe 01d4fa08 0df2a42c 06273e5c
System_Web_ni+152ccd 06273aa8 05e9f214 06273aa8
System_Web_ni+19a8e2 05e973b4 062736cc 01d4f65c
System_Web_ni+19a62d 06a21c6c 79145d80 01d4f7fc
System_Web_ni+199c2d 00000002 672695e8 00000000
System_Web_ni+7b65cc 01d4fa28 00000002 01c52c0c
clr!COMToCLRDispatchHelper+28 679165b0 672695e8 09ee2038
clr!BaseWrapper<Stub *,FunctionBase<Stub *,&DoNothing<Stub *>,&StubRelease<Stub>,2>,0,&CompareDefault<Stub *>,2>::~BaseWrapper<Stub *,FunctionBase<Stub *,&DoNothing<Stub *>,&StubRelease<Stub>,2>,0,&CompareDefault<Stub *>,2>+fa 672695e8 09ee2038 00000001
clr!COMToCLRWorkerBody+b4 000dbb78 01d4f9f8 1a78ffe0
clr!COMToCLRWorkerDebuggerWrapper+34 000dbb78 01d4f9f8 1a78ffe0
clr!COMToCLRWorker+614 000dbb78 01d4f9f8 06a21c6c
1dda1aa 00000001 01b6c7a8 00000000
webengine4!HttpCompletion::ProcessRequestInManagedCode+1cd 01b6c7a8 69f1aa72 01d4fd6c
webengine4!HttpCompletion::ProcessCompletion+4a 01b6c7a8 00000000 00000000
webengine4!CorThreadPoolWorkitemCallback+1c 01b6c7a8 0636a718 0000ffff
clr!UnManagedPerAppDomainTPCount::DispatchWorkItem+195 01d4fe1f 01d4fe1e 0636a488
clr!ThreadpoolMgr::NewWorkerThreadStart+20b 00000000 0636a430 00000000
clr!ThreadpoolMgr::WorkerThreadStart+3d1 00000000 00000000 00000000
clr!ThreadpoolMgr::intermediateThreadProc+4b 000c3470 00000000 00000000
kernel32!BaseThreadStart+34 792b0b2b 000c3470 00000000
This question is a bit old, but I just found a nice way of getting the stack trace of my application just before overflowing, and I would like to share it with other googlers out there:
When your ASP.NET app crashes, a set of debugging files is dumped in a "crash folder" inside this main folder:

C:\ProgramData\Microsoft\Windows\WER\ReportQueue
These files can be analysed using WinDbg, which you can download as part of the Debugging Tools for Windows (included in the Windows SDK).
After installing it on the same machine where your app crashed, click File > Open Crash Dump and select the largest .tmp file in your "crash folder" (mine was 180 MB). Something like:

AppCrash_w3wp.exe_3d6ded0d29abf2144c567e08f6b23316ff3a7_cab_849897b9\WER688D.tmp
Then, run the following commands in the command window that just opened:
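(The command list itself did not survive in this copy of the answer; as a sketch, the usual sequence for a .NET 4.0 dump is to load SOS and print the managed call stack:)

$$ Point the symbol path at the Microsoft public symbol server and reload.
.symfix; .reload
$$ Load the SOS extension that matches the CLR version in the dump.
.loadby sos clr
$$ Print the managed call stack of the current (faulting) thread.
!clrstack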
Finally, the generated output will contain your app's stack trace just before overflowing, and you can easily track down what caused the overflow. In my case it was a buggy logging method:
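(The original snippet was lost from this copy; as a hypothetical reconstruction with invented names, a logging method that logs its own failures recurses until the stack overflows:)

using System;

public static class Logger
{
    public static void Log(string message)
    {
        try
        {
            System.IO.File.AppendAllText(@"C:\logs\app.log", message + Environment.NewLine);
        }
        catch (Exception ex)
        {
            // Bug: on any I/O failure this calls itself, recursing forever.
            Log("Logging failed: " + ex.Message);
        }
    }
}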
Thanks to Paul White and his blog post: Debugging Faulting Application w3wp.exe Crashes
A default stack limit for w3wp.exe is a joke. I always raise it with

editbin /stack:9000000 w3wp.exe

and that should be sufficient. Get rid of your stack overflow first, and then debug whatever you want.
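(A side note not in the original answer: editbin and dumpbin ship with Visual Studio, so run them from a developer command prompt. The IIS path below is the default install location; back up the binary first, since Windows servicing can replace w3wp.exe and silently undo the change.)

rem Show the current stack reserve, then raise it to roughly 9 MB.
dumpbin /headers C:\Windows\System32\inetsrv\w3wp.exe | findstr /i /c:"stack reserve"
editbin /stack:9000000 C:\Windows\System32\inetsrv\w3wp.exe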
Get a crash dump, run it against Microsoft's Debug Diagnostic Tool and show us the result.
Also take a look at http://support.microsoft.com/kb/919789/en-us, which explains all the necessary steps in detail.
Two things I would try before analysing any memory dumps.
One possibility for your application behaving differently in production vs development could be preprocessor directives like #if DEBUG in the code. When you deploy to production, the release build will have different code segments than your debug build.

Another option would be that your application is throwing an unrelated exception in production, and the error handling code somehow ends up in an infinite function-calling loop. You may want to look for an infinite loop that has a function call to itself, or another function that calls this function back; this ends up in an infinite function-calling loop because of the infinite for or while loop. I apologize for going overboard with the word 'infinite'.
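(As a hypothetical sketch of the first point, with invented names: a guard compiled only into Debug builds can hide a recursion that the Release build then runs into.)

using System.Collections.Generic;

public class Node
{
    public bool Visited;
    public List<Node> Children = new List<Node>();
}

public static class Walker
{
    public static void Process(Node node)
    {
#if DEBUG
        if (node.Visited) return;   // cycle guard exists only in Debug builds
#endif
        node.Visited = true;
        foreach (var child in node.Children)
            Process(child);         // on cyclic data, Release recurses until overflow
    }
}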
It's also happened to me before when I accidentally created a property whose getter returned the property itself. Like:
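(The original code sample was dropped from this copy; the mistake looks like this minimal sketch, names invented:)

public class Settings
{
    private string _connectionString;

    public string ConnectionString
    {
        // Bug: returns the property instead of the backing field,
        // so every read calls the getter again until the stack overflows.
        get { return ConnectionString; }   // should be: return _connectionString;
    }
}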
Also, if possible you could do special stuff with the exception in the Application_Error function of your global.asax. Use Server.GetLastError() to get the exception and log/display the stack trace. You may want to do the same for any InnerException, the InnerException of that InnerException, and so on. You may already be doing the above-mentioned things, but I wanted to mention them just in case.
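(A minimal sketch of such a handler for the code-behind of global.asax; the trace target is just a placeholder:)

using System;
using System.Diagnostics;
using System.Web;

public class Global : HttpApplication
{
    protected void Application_Error(object sender, EventArgs e)
    {
        // Walk the InnerException chain so nested failures are captured too.
        for (Exception ex = Server.GetLastError(); ex != null; ex = ex.InnerException)
        {
            Trace.TraceError(ex.ToString());   // ToString() includes the stack trace
        }
    }
}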
Also, from your trace it looks like the error is happening in GetSortKey. Is that a function in your code? If so, then your infinite self-calling may start there. Hope this helps.