.net 应用程序挂起。奇怪的故障转储
生产服务器有一个 24x7 运行的应用程序,有时它开始消耗接近 50% 的 CPU。我无法在本地重现它,但我用 adplus 做了内存转储。 服务器有Windows 2008 Server 64位,.NET 3.5。该应用程序侦听 MSMQ 并有一个线程池来执行操作,当我进行内存转储时,它应该什么也不运行,但它仍然消耗 CPU。
下面是转储。你知道会发生什么吗?奇怪的是慢线程还说GC已禁用!那是什么?
奇怪的是,我在同一台服务器上有一个 ASP.NET 应用程序,有时会发生同样的事情。
0:000> !threads
ThreadCount: 23
UnstartedThread: 0
BackgroundThread: 6
PendingThread: 0
DeadThread: 5
Hosted Runtime: no
PreEmptive Lock
ID OSID ThreadOBJ State GC GC Alloc Context Domain Count APT Exception
0 1 1ec0 0000000000249c10 a020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
2 2 84c 0000000000253470 b220 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA (Finalizer)
3 3 10b8 0000000019d0f900 80a220 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA (Threadpool Completion Port)
5 4 2184 0000000019d4a550 880b220 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA (Threadpool Completion Port)
6 6 14a0 0000000019d602a0 180b220 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA (Threadpool Worker)
8 7 490 0000000019d7db70 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
9 8 2164 0000000019dae680 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
10 9 ac0 0000000019db4740 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
11 a 24cc 0000000019db51f0 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
12 b 1fb4 0000000019db5ca0 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
13 c 2408 0000000019db1520 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
14 d 1b44 0000000019db1fd0 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
15 e 1280 0000000019db2a80 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
16 f 206c 0000000019db8780 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
17 10 1ff8 0000000019db9230 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
18 11 2548 0000000019dc2120 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
20 18 2588 0000000019dc49d0 180b220 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA (Threadpool Worker)
21 1a 20a0 0000000019dc5570 b220 Disabled 0000000001653d88:0000000001655978 00000000002410b0 1 MTA
XXXX 26 0 0000000019dc89c0 9820 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 Ukn
XXXX 2c 0 0000000019dc5b40 9820 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 Ukn
XXXX 2b 0 0000000019dc3290 9820 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 Ukn
XXXX 5 0 0000000019dc8f90 9820 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 Ukn
XXXX 27 0 0000000019dc7850 9820 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 Ukn
0:000> !runaway
User Mode Time
Thread Time
**21:20a0 0 days 0:27:21.718**
6:14a0 0 days 0:00:01.921
18:2548 0 days 0:00:01.015
2:84c 0 days 0:00:00.890
(18 more....)
0:000> ~21s
*** WARNING: Unable to verify checksum for System.ni.dll
mscorwks!JIT_WriteBarrier_Fast+0x3:
000007fe`f88de5c3 4881fa607f4701 cmp rdx,1477F60h
0:021> !clrstack
OS Thread Id: 0x20a0 (21)
*** WARNING: Unable to verify checksum for mscorlib.ni.dll
Child-SP RetAddr Call Site
000000001b05f090 000007fef651b1ba System.Net.TimerThread+TimerNode..ctor(Callback, System.Object, Int32, System.Object)
000000001b05f0d0 000007fef69a0c3f System.Net.TimerThread+TimerQueue.CreateTimer(Callback, System.Object)
000000001b05f180 000007fef651b73e System.Net.ConnectionPool.CleanupCallbackWrapper(Timer, Int32, System.Object)
000000001b05f1d0 000007fef651b560 System.Net.TimerThread+TimerNode.Fire()
000000001b05f2a0 000007fef64e6388 System.Net.TimerThread+TimerQueue.Fire(Int32 ByRef)
000000001b05f330 000007fef271175b System.Net.TimerThread.ThreadProc()
000000001b05f400 000007fef27a95fd System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
000000001b05f450 000007fef88e1552 System.Threading.ThreadHelper.ThreadStart()
0:021> kb
RetAddr : Args to Child : Call Site
000007fe`f6a1b87e : 00000000 (Edited) : mscorwks!JIT_WriteBarrier_Fast+0x3
000007fe`f651b1ba : 00000000 (Edited) : System_ni+0x70b87e
000007fe`f69a0c3f : 00000000 (Edited) : System_ni+0x20b1ba
000007fe`f651b73e : 000007fe (Edited) : System_ni+0x690c3f
000007fe`f651b560 : 00000000 (Edited) : System_ni+0x20b73e
000007fe`f64e6388 : 00000000 (Edited) : System_ni+0x20b560
000007fe`f271175b : 00000000 (Edited) : System_ni+0x1d6388
000007fe`f27a95fd : 00000000 (Edited) : mscorlib_ni+0x2f175b
000007fe`f88e1552 : 00000000 (Edited) : mscorlib_ni+0x3895fd
000007fe`f877e5e3 : 00000000 (Edited) : mscorwks!CallDescrWorker+0x82
000007fe`f878c83f : 00000000 (Edited) : mscorwks!CallDescrWorkerWithHandler+0xd3
000007fe`f887ae8d : 00000000 (Edited) : mscorwks!MethodDesc::CallDescr+0x24f
000007fe`f8839374 : 00000000 (Edited) : mscorwks!ThreadNative::KickOffThread_Worker+0x191
000007fe`f8732045 : 00000000 (Edited) : mscorwks!TypeHandle::GetParent+0x5c
000007fe`f8846139 : 00000000 (Edited) : mscorwks!SVR::gc_heap::make_heap_segment+0x155
000007fe`f8700e15 : 00000000 (Edited) : mscorwks!ZapStubPrecode::GetType+0x39
000007fe`f8700ae7 : 00000000 (Edited) : mscorwks!ThreadNative::KickOffThread+0x401
000007fe`f88614fc : 00000002 (Edited) : mscorwks!ThreadNative::KickOffThread+0xd3
00000000`76bdbe3d : 00000000 (Edited) : mscorwks!Thread::intermediateThreadProc+0x78
00000000`76d16a51 : 00000000 (Edited) : kernel32!BaseThreadInitThunk+0xd
A production server has an application running 24x7, and sometimes it starts consuming near 50% of CPU. I couldn't reproduce it locally, but I did a Memory Dump with adplus.
The server has Windows 2008 Server 64bits, .NET 3.5. The application listens to a MSMQ and has a thread pool to execute operations, when I did the memory dump it was supposed to be running nothing, but it was still consuming CPU.
Below is the Dump. Do you know what can be happening?? It's weird that the slow thread also says GC Disabled! What's that?
It's weird I've an ASP.NET app in the same server and sometimes happens the same thing.
0:000> !threads
ThreadCount: 23
UnstartedThread: 0
BackgroundThread: 6
PendingThread: 0
DeadThread: 5
Hosted Runtime: no
PreEmptive Lock
ID OSID ThreadOBJ State GC GC Alloc Context Domain Count APT Exception
0 1 1ec0 0000000000249c10 a020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
2 2 84c 0000000000253470 b220 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA (Finalizer)
3 3 10b8 0000000019d0f900 80a220 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA (Threadpool Completion Port)
5 4 2184 0000000019d4a550 880b220 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA (Threadpool Completion Port)
6 6 14a0 0000000019d602a0 180b220 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA (Threadpool Worker)
8 7 490 0000000019d7db70 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
9 8 2164 0000000019dae680 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
10 9 ac0 0000000019db4740 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
11 a 24cc 0000000019db51f0 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
12 b 1fb4 0000000019db5ca0 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
13 c 2408 0000000019db1520 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
14 d 1b44 0000000019db1fd0 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
15 e 1280 0000000019db2a80 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
16 f 206c 0000000019db8780 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
17 10 1ff8 0000000019db9230 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
18 11 2548 0000000019dc2120 200b020 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA
20 18 2588 0000000019dc49d0 180b220 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 MTA (Threadpool Worker)
21 1a 20a0 0000000019dc5570 b220 Disabled 0000000001653d88:0000000001655978 00000000002410b0 1 MTA
XXXX 26 0 0000000019dc89c0 9820 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 Ukn
XXXX 2c 0 0000000019dc5b40 9820 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 Ukn
XXXX 2b 0 0000000019dc3290 9820 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 Ukn
XXXX 5 0 0000000019dc8f90 9820 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 Ukn
XXXX 27 0 0000000019dc7850 9820 Enabled 0000000000000000:0000000000000000 00000000002410b0 0 Ukn
0:000> !runaway
User Mode Time
Thread Time
**21:20a0 0 days 0:27:21.718**
6:14a0 0 days 0:00:01.921
18:2548 0 days 0:00:01.015
2:84c 0 days 0:00:00.890
(18 more....)
0:000> ~21s
*** WARNING: Unable to verify checksum for System.ni.dll
mscorwks!JIT_WriteBarrier_Fast+0x3:
000007fe`f88de5c3 4881fa607f4701 cmp rdx,1477F60h
0:021> !clrstack
OS Thread Id: 0x20a0 (21)
*** WARNING: Unable to verify checksum for mscorlib.ni.dll
Child-SP RetAddr Call Site
000000001b05f090 000007fef651b1ba System.Net.TimerThread+TimerNode..ctor(Callback, System.Object, Int32, System.Object)
000000001b05f0d0 000007fef69a0c3f System.Net.TimerThread+TimerQueue.CreateTimer(Callback, System.Object)
000000001b05f180 000007fef651b73e System.Net.ConnectionPool.CleanupCallbackWrapper(Timer, Int32, System.Object)
000000001b05f1d0 000007fef651b560 System.Net.TimerThread+TimerNode.Fire()
000000001b05f2a0 000007fef64e6388 System.Net.TimerThread+TimerQueue.Fire(Int32 ByRef)
000000001b05f330 000007fef271175b System.Net.TimerThread.ThreadProc()
000000001b05f400 000007fef27a95fd System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
000000001b05f450 000007fef88e1552 System.Threading.ThreadHelper.ThreadStart()
0:021> kb
RetAddr : Args to Child : Call Site
000007fe`f6a1b87e : 00000000 (Edited) : mscorwks!JIT_WriteBarrier_Fast+0x3
000007fe`f651b1ba : 00000000 (Edited) : System_ni+0x70b87e
000007fe`f69a0c3f : 00000000 (Edited) : System_ni+0x20b1ba
000007fe`f651b73e : 000007fe (Edited) : System_ni+0x690c3f
000007fe`f651b560 : 00000000 (Edited) : System_ni+0x20b73e
000007fe`f64e6388 : 00000000 (Edited) : System_ni+0x20b560
000007fe`f271175b : 00000000 (Edited) : System_ni+0x1d6388
000007fe`f27a95fd : 00000000 (Edited) : mscorlib_ni+0x2f175b
000007fe`f88e1552 : 00000000 (Edited) : mscorlib_ni+0x3895fd
000007fe`f877e5e3 : 00000000 (Edited) : mscorwks!CallDescrWorker+0x82
000007fe`f878c83f : 00000000 (Edited) : mscorwks!CallDescrWorkerWithHandler+0xd3
000007fe`f887ae8d : 00000000 (Edited) : mscorwks!MethodDesc::CallDescr+0x24f
000007fe`f8839374 : 00000000 (Edited) : mscorwks!ThreadNative::KickOffThread_Worker+0x191
000007fe`f8732045 : 00000000 (Edited) : mscorwks!TypeHandle::GetParent+0x5c
000007fe`f8846139 : 00000000 (Edited) : mscorwks!SVR::gc_heap::make_heap_segment+0x155
000007fe`f8700e15 : 00000000 (Edited) : mscorwks!ZapStubPrecode::GetType+0x39
000007fe`f8700ae7 : 00000000 (Edited) : mscorwks!ThreadNative::KickOffThread+0x401
000007fe`f88614fc : 00000002 (Edited) : mscorwks!ThreadNative::KickOffThread+0xd3
00000000`76bdbe3d : 00000000 (Edited) : mscorwks!Thread::intermediateThreadProc+0x78
00000000`76d16a51 : 00000000 (Edited) : kernel32!BaseThreadInitThunk+0xd
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
到目前为止,我不是这方面的专家,但更多信息可能有用:
根据这个,GC线程是在CLR启动时创建的,至少对于服务器GC来说是这样,所以
没有足够的线程来运行 GC 甚至是不可能的;-)
“GC”列中的“已禁用”
线程 21 只是意味着它决定了
不被最终的 GC 抢占
手术。发生这种情况时
线程上的代码决定它是
进行关键操作
不应受到 GC 的干扰(例如
加载和组装,从而融合)。
从“kb”命令输出我会
猜测你实际上正在使用
服务器GC(堆栈框架
“mscorwks!SVR::gc_heap::make_heap_segment”;
工作站GC会有一些东西
与类/命名空间“WKS”)。
这并不意外,因为它应该如此
成为“服务器上的默认值”
操作系统。您应该使用“!eeversion”命令来确认这一点。
此外,您应该了解您有多少个核心,因为如果服务器 GC 运行,
它将使用尽可能多的线程(每个逻辑/物理核心一个)。
计时器可能会经常触发,或者比前一个计时器完成的速度更快。您可以使用“!ThreadPool”命令获取线程池线程使用情况的概述。
另外,您可能想检查方法和局部变量的实际参数 (!clrstack -a) 和/或堆栈上的当前对象 (!dso)。也许这可以带来更多启发。
作为疯狂的猜测,一些谷歌搜索“System.Net.ConnectionPool.CleanupCallbackWrapper”会产生以下链接,也许这可能是您的问题?
I'm by far no expert here, but some more information that might be useful:
According to this, GC threads are created on CLR startup, at least for server GC, so
not having enough threads for a GC run is possibly not even possible ;-)
The "Disabled" in the "GC" column of
thread 21 just means that it decided
to not be preemted by an eventual GC
operation. This happends when the
code on thread decides that it is
doing a critical operation that
should not be disturbed by a GC (like
loading and assembly, hence fusion).
From the "kb" command output I would
guess that you are actually using the
server GC (stackframe
"mscorwks!SVR::gc_heap::make_heap_segment";
workstation GC would have something
with the class/namespace "WKS").
This is not unexpected as it should
be the default on a "server
operating" system. You should make sure about this using the "!eeversion" command.
Additionally you should find out how many cores you have, because if the server GC runs,
it will use as many threads (one per logical/physical core).
Could it be, that the timer is firing pretty often, or faster than the previous one is finished. You can get an overview of threadpool thread usage using the "!ThreadPool" command.
Also, you might want to check the actual arguments to the methods and locals (!clrstack -a) and/or the current objects on the stack (!dso). Maybe that can shed some more light.
As wild guess, some googling for "System.Net.ConnectionPool.CleanupCallbackWrapper" yield the following links, maybe that could be your problem?