Unbelievable number of logical threads; WinDbg doesn't see them?

Posted 2024-11-07 22:17:27

I've got a process that is showing ~4,294,965,900 "current logical threads" (according to the performance counters) and ~400 physical threads.

I've created a memory dump using ADPlus (-hang), but WinDbg (!threads) only shows me the physical threads.

How do I find out where all these logical threads are coming from?

Comments (4)

Spring初心 2024-11-14 22:17:27

That looks like a suspiciously high number to me.

The number -1396 represented as an unsigned 32-bit integer is 4,294,965,900, and 1396 looks more reasonable.

A bug somewhere, perhaps?
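
The wrap-around is easy to check. The following is a minimal sketch, assuming the counter is stored as a signed 32-bit value and displayed as unsigned (the source does not confirm how the counter is stored); the class and method names are made up for illustration.

```csharp
using System;

public static class CounterWrap
{
    // Reinterpret the bits of a signed 32-bit value as unsigned (no range check).
    public static uint AsUnsigned(int value) => unchecked((uint)value);

    // Reinterpret the bits of an unsigned 32-bit value as signed.
    public static int AsSigned(uint value) => unchecked((int)value);

    public static void Main()
    {
        Console.WriteLine(AsUnsigned(-1396));      // 4294965900
        Console.WriteLine(AsSigned(4294965900u));  // -1396
    }
}
```

If that assumption holds, the counter has most likely been decremented about 1,396 more times than it was incremented.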

她说她爱他 2024-11-14 22:17:27

How do I find out where all these logical threads are coming from?

They aren't coming from anywhere, because they don't exist. You simply can't have 4 billion threads of any kind unless you're running on a 64-bit machine with, say, a couple of petabytes of RAM at the very least.

Every thread, whether it is a "physical" OS thread or one provided by some framework, needs at the very least some kind of identifier. If that's a 32-bit number, then just storing these identifiers takes up nearly 16 GB of RAM (and, of course, leaves you only about 1,400 unused identifiers). If the identifiers are 64 bits wide, you need 32 GB of RAM. On top of that, every thread needs some stack space; a common default is 1 MB, which brings us up to roughly 4 petabytes of memory.
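
To double-check those figures, here is a rough back-of-the-envelope calculation. It is only a sketch: the thread count is the value reported in the question, the 1 MB stack is the default assumed above, and the class and method names are invented for this example.

```csharp
using System;

public static class ThreadMemoryEstimate
{
    public const long ReportedThreads = 4_294_965_900L; // value from the performance counter
    public const long GiB = 1L << 30;
    public const long PiB = 1L << 50;

    // Memory needed just to store one identifier per thread, in GiB.
    public static double IdStorageGiB(int bytesPerId) =>
        (double)ReportedThreads * bytesPerId / GiB;

    // Memory needed for one stack per thread, in PiB.
    public static double StackPiB(long stackBytes) =>
        (double)ReportedThreads * stackBytes / PiB;

    // How many 32-bit identifier values would be left over.
    public static long UnusedIds32() => (1L << 32) - ReportedThreads;

    public static void Main()
    {
        Console.WriteLine("32-bit IDs: " + IdStorageGiB(4) + " GiB");    // just under 16
        Console.WriteLine("64-bit IDs: " + IdStorageGiB(8) + " GiB");    // just under 32
        Console.WriteLine("1 MB stacks: " + StackPiB(1L << 20) + " PiB"); // just under 4
        Console.WriteLine("Unused 32-bit IDs: " + UnusedIds32());
    }
}
```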

It is a bug. The threads don't exist, and the performance counters are reporting a garbage value to you for some reason or other.

For example, it could be a negative error code which, when converted to an unsigned integer, becomes this huge number.

Or it could be some other error condition.

_蜘蛛 2024-11-14 22:17:27

Since your process is running managed code, chances are the logical thread count refers to CLR threads. .NET maps CLR logical (managed) threads onto physical (OS) threads. To investigate this further, you can use the !threads command (from the SOS extension) in WinDbg. Here is example output from that command:


0:028> !threads
ThreadCount:      25
UnstartedThread:  0
BackgroundThread: 22
PendingThread:    0
DeadThread:       3
Hosted Runtime:   yes
                                   PreEmptive   GC Alloc                Lock
       ID  OSID ThreadOBJ    State GC           Context       Domain   Count APT Exception
   0    1  12b0 007b69d0      4220 Enabled  120337b4:12034a3c 007afef8     0 STA
   6    2  1f70 007c2688      b220 Enabled  11ed2a84:11ed4a3c 007afef8     0 MTA (Finalizer)
   7    3  2340 007c8ac8      1220 Enabled  00000000:00000000 007afef8     0 Ukn
  11    4  1c4c 0aaf3380      7220 Enabled  00000000:00000000 007afef8     0 STA
  13    8  2414 0d4932f0       220 Enabled  00000000:00000000 007afef8     0 Ukn
   3    a  2780 0d4d08e8   1009220 Enabled  00000000:00000000 007afef8     0 MTA (Threadpool Worker)
  15    7   970 0d4d0df0   1009220 Enabled  11ed4ad8:11ed6a3c 007afef8     0 MTA (Threadpool Worker)
  19    9  2510 0d4d12f8   200b220 Enabled  00000000:00000000 007afef8     0 MTA
  20    b   80c 0d4d1800   200b220 Enabled  00000000:00000000 007afef8     0 MTA
  21    c  2490 0d4d1d08   200b220 Enabled  00000000:00000000 007afef8     0 MTA
  23    d  2724 0d4d2210   1009220 Enabled  00000000:00000000 007afef8     0 MTA (Threadpool Worker)
  24    e  2200 0d4d2718   1009220 Enabled  00000000:00000000 007afef8     0 MTA (Threadpool Worker)
  26    f  1f3c 0d4d2c20   1009220 Enabled  00000000:00000000 007afef8     0 MTA (Threadpool Worker)
  25   10  200c 0d4d3128   1009220 Enabled  00000000:00000000 007afef8     0 MTA (Threadpool Worker)
  27   11  2708 0d4d3630   1009220 Enabled  00000000:00000000 007afef8     0 MTA (Threadpool Worker)
  17    6  21b4 0d4d3b38   1009220 Enabled  00000000:00000000 007afef8     0 MTA (Threadpool Worker)
  18    5  2148 0d4d4548       220 Enabled  00000000:00000000 007afef8     0 MTA
XXXX   16       0d4d6378     19820 Enabled  00000000:00000000 007afef8     0 MTA
XXXX   15       0d4d5e70     19820 Enabled  00000000:00000000 007afef8     0 MTA
  30   14  112c 0d4d5968   200b220 Enabled  00000000:00000000 007afef8     0 MTA
  32   13  2734 0d4d5460      b220 Enabled  00000000:00000000 007afef8     0 MTA
  33   12  11ec 0d4d4a50   100a220 Enabled  00000000:00000000 007afef8     0 MTA (Threadpool Worker)
  34   17  166c 0d4d6880   8009220 Enabled  00000000:00000000 007afef8     0 MTA (Threadpool Completion Port)
  35   18  24f4 0d4d6d88   8009220 Enabled  00000000:00000000 007afef8     0 MTA (Threadpool Completion Port)
XXXX   19       0d4d7798     19820 Enabled  00000000:00000000 007afef8     0 Ukn

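Note that the top of the output prints summary statistics. If you find an excessively large number of dead threads, that might indicate a resource leak. Check out one example of this type of resource leak.

In the !threads output, the leftmost column is the debugger's thread number (the same as displayed by the ~ command), the second column is the CLR (managed) thread ID, and the third column is the OS thread ID. The rows marked XXXX have no debugger thread number and no OSID; in this sample there are three of them, matching the DeadThread count in the summary.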
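
To see the two kinds of ID from inside a process, a small sketch like the following may help. It assumes Windows for the OS thread ID (it P/Invokes the Win32 GetCurrentThreadId function); the class name and the ManagedId helper are made up for this example.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

public static class ThreadIds
{
    // Win32 API: returns the OS thread ID of the calling thread (Windows only).
    [DllImport("kernel32.dll")]
    private static extern uint GetCurrentThreadId();

    // The CLR's own identifier for the current managed thread.
    public static int ManagedId() => Thread.CurrentThread.ManagedThreadId;

    public static void Main()
    {
        Console.WriteLine("Managed (CLR) thread ID: " + ManagedId());

        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            // !threads prints OSID in hexadecimal, so format it the same way.
            Console.WriteLine("OS thread ID (hex): " + GetCurrentThreadId().ToString("x"));
        }
    }
}
```

The managed ID should correspond to the ID column and the hexadecimal value to the OSID column of the listing.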

财迷小姐 2024-11-14 22:17:27

I had the same problem this week, with the same symptoms. It was real. Yes, my server is impressive: 128 GB of RAM and 24 cores.

The problem here was indeed logical threads. The CLR doesn't create a real thread if it can avoid it. I had a Timer with periodic reactivation, like timer.Change(10000, 10000), and inside the timer callback my code hung on a network call, which let the CLR know that this 'physical thread' could be reused. Ten seconds later the timer fired again and a new logical thread was created, and so on. On top of that, the rest of my code uses Tasks heavily, and those pull in logical threads as well. Combine it all and you get a ripple effect of billions of logical threads within a week or two.

My fix was easy: make the timer non-recurring and schedule the next trigger only after the previous callback has finished, using
timer.Change(10000, Timeout.Infinite), and make the timer callback cancel the I/O after some reasonable timeout.
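
A minimal sketch of that pattern, assuming System.Threading.Timer, could look like the following. The NonOverlappingPoller class, its parameters, and the way the timeout is passed to the work as a CancellationToken are illustrative choices, not the poster's actual code.

```csharp
using System;
using System.Threading;

public sealed class NonOverlappingPoller : IDisposable
{
    private readonly Timer _timer;
    private readonly Action<CancellationToken> _work;
    private readonly int _periodMs;
    private readonly int _ioTimeoutMs;
    private int _disposed; // 0 = live, 1 = disposed

    public NonOverlappingPoller(Action<CancellationToken> work, int periodMs, int ioTimeoutMs)
    {
        _work = work;
        _periodMs = periodMs;
        _ioTimeoutMs = ioTimeoutMs;
        // Create the timer stopped, then arm it as a one-shot (period = Timeout.Infinite).
        _timer = new Timer(_ => OnTick(), null, Timeout.Infinite, Timeout.Infinite);
        _timer.Change(periodMs, Timeout.Infinite);
    }

    private void OnTick()
    {
        try
        {
            // The token is cancelled after ioTimeoutMs; the work must observe it
            // (for example by passing it to its network calls) for this to help.
            using (var cts = new CancellationTokenSource(_ioTimeoutMs))
            {
                _work(cts.Token);
            }
        }
        finally
        {
            // Re-arm only after this callback has finished, so callbacks never overlap.
            if (Volatile.Read(ref _disposed) == 0)
            {
                try { _timer.Change(_periodMs, Timeout.Infinite); }
                catch (ObjectDisposedException) { /* disposed concurrently; stop */ }
            }
        }
    }

    public void Dispose()
    {
        if (Interlocked.Exchange(ref _disposed, 1) == 0) _timer.Dispose();
    }

    public static void Main()
    {
        int ticks = 0;
        using (var poller = new NonOverlappingPoller(
            token => { Interlocked.Increment(ref ticks); Thread.Sleep(50); }, 100, 5000))
        {
            Thread.Sleep(1000); // let the poller run for about a second
        }
        Console.WriteLine("ticks: " + ticks);
    }
}
```

Because the next due time is set only in the finally block, a slow callback delays the next tick instead of piling up concurrent callbacks. Note that an exception thrown by the work still propagates out of the timer callback, so real code would want to catch and log it.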
