What exactly happens when I spawn a new thread from .NET?

Posted on 2024-11-14 12:34:58

I want to understand what precisely is happening behind the scene when I spawn a new thread in .NET, something like here:

Thread t = new Thread(DoWork); //I am not interested in DoWork per se
t.Start();

1. What thread-related objects are created in CLR and Windows kernel?
2. Why are those objects needed?
3. How much managed/unmanaged memory (heap and stack) is allocated on x86, x64 Windows?

UPDATE
I am looking for objects such as the managed thread object, which I assume is t, and perhaps some additional managed objects, as well as the kernel thread object, the user-mode thread environment block, and the like.

Many thanks!

Comments (3)

猫性小仙女 2024-11-21 12:34:58

Win32 and Kernel memory allocated

I'm not exactly sure how the .NET part works, but if the runtime does decide to create a real thread with the OS, it would eventually call the Win32 API CreateThread in kernel32.dll, most likely from the runtime's native code (clr.dll, or mscorwks.dll before .NET 4) rather than from mscorlib.ni.dll.

By default, new threads get 1 MB of virtual address space reserved for the stack, which is committed as needed. This can be controlled with the maxStackSize parameter of the Thread constructor. The main thread's stack size comes from a field in the header of the executable file itself.
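
As a rough illustration of the maxStackSize parameter, here is a minimal sketch that starts a thread with a smaller stack reservation. The names StackSizeDemo and RunWithStack are hypothetical, the 256 KB figure is arbitrary, and the runtime or OS may round or clamp the requested size.

```csharp
using System;
using System.Threading;

static class StackSizeDemo
{
    // Hypothetical helper: runs a trivial computation on a thread whose
    // stack reservation is requested through the maxStackSize argument.
    public static int RunWithStack(int maxStackSizeBytes)
    {
        int result = 0;
        // Thread(ThreadStart, int) is the constructor overload that takes maxStackSize.
        var t = new Thread(() => { result = 6 * 7; }, maxStackSizeBytes);
        t.Start();
        t.Join(); // wait for the thread so 'result' is safe to read
        return result;
    }

    static void Main()
    {
        // Ask for 256 KB instead of the default reservation (an arbitrary example value).
        Console.WriteLine(RunWithStack(256 * 1024));
    }
}
```

A smaller reservation only matters when address space is the constraint (for example, many threads in a 32-bit process); a thread with a stack that is too small for its call depth will overflow.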

In the process's address space, a TEB (thread environment block) will be allocated. Incidentally, the FS register on x86 (GS on x64) points to it, which is how things like thread local storage and structured exception handling (SEH) find their per-thread data. There are probably other things allocated by Win32 that are not documented.

In creating the Win32 thread, the Win32 server process (csrss.exe) is contacted. You can see that csrss has handles open to all Win32 processes and threads in Process Explorer for some kind of bookkeeping.

DLLs loaded in the process will be notified of the new thread and may allocate their own memory for tracking the thread.

The kernel will create an ETHREAD object (which embeds a KTHREAD as its first member) from kernel non-paged pool to track the thread's state. A kernel stack will also be allocated (12 KB by default on x86); it can be paged out, unless the thread is in a kernel-mode wait state.

Why so many things need to allocate memory for a thread

Threads are the smallest preemptively scheduled unit that the OS provides and there is a lot of context connected to them. Many different components need to provide separate context for each thread because system services need to be able to deal with multiple threads doing different things all at the same time.

Some services require you to declare new threads to them explicitly but most are expected to work with new threads automatically. Sometimes this means allocating space right when the thread is started. As the thread engages other services, the amount of memory used to track the thread can increase as those services set up their own context for the thread.

How much memory is allocated

It's hard to say how much memory is allocated for a thread since it is spread across several address spaces and heaps. It will vary between Windows versions, installed components and what is loaded into the process currently.

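The largest cost is generally accepted to be the 1 MB of address space reserved by default for each new thread's stack, but even at that size a single process can hold many hundreds of threads without running out of address space (a 32-bit process with the default 2 GB of user address space has room for at most about 2,000 such stacks, and fewer in practice).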

If the design uses many more OS threads than there are CPUs in the system, it should be reviewed. Work queues with a thread pool, or lightweight threads with user-mode scheduling (fibers or another library's implementation), should be able to handle multithreading without requiring an excessive number of OS threads, making the memory cost of the threads unimportant.
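
To make the work-queue suggestion concrete, here is a minimal sketch that pushes many small work items through the shared .NET thread pool instead of creating one OS thread per item. The names PoolDemo and RunOnPool are hypothetical, and the work item is a stand-in for real work.

```csharp
using System;
using System.Threading;

static class PoolDemo
{
    // Hypothetical helper: queues 'itemCount' small work items to the shared
    // thread pool and returns how many of them ran to completion.
    public static int RunOnPool(int itemCount)
    {
        int completed = 0;
        using (var done = new CountdownEvent(itemCount))
        {
            for (int i = 0; i < itemCount; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Interlocked.Increment(ref completed); // stand-in for real work
                    done.Signal();                        // mark this item finished
                });
            }
            done.Wait(); // block until every queued item has signalled
        }
        return completed;
    }

    static void Main()
    {
        // 1000 work items are served by a small, reused set of pool threads.
        Console.WriteLine(RunOnPool(1000));
    }
}
```

The pool decides how many OS threads to keep alive, so the per-thread costs described above are paid for a handful of threads rather than once per work item.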

孤城病女 2024-11-21 12:34:58

So this is a really complicated question that does not really have a great answer of "x".

  1. The CLR is not required to map a single CLR thread to a single OS fiber. So... this is hard to answer. I think the current version of .NET (4.0) attempts to use a 1-to-1 relationship between CLR threads and OS fibers when possible on all OSes. For previous versions of .NET (more like <= 1.1), I'm not sure this was the case on all OSes. The scheduler handles most of these objects and they won't be part of any .NET object graph. This scheduler is part of the CLR and not part of the Thread object. If you dig into the IL, you'll see many internal calls for the actual execution.
  2. I assume the question is "Why are those objects needed?" If so, it's because the OS host has to actually have the fiber on which to execute the code for that thread. Using the ThreadPool can greatly reduce the cost of creating them each time.
  3. Sorry... it depends. A lot of it is unmanaged as well, which means the OS host could choose to handle this differently depending on load and system version.

"The logical abstraction of a thread of control is captured by an instance of the System.Threading.Thread object in the class library." http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-335.pdf

So the ECMA standard really doesn't say anything about the topic. But luckily we have...

"Because the CLR thread object is per-fiber, any information hanging off of it is also per-fiber. Thread.ManagedThreadId returns a stable ID that flows around with the CLR thread. It is not dependent on the identity of the physical OS thread, which means using it implies no form of affinity. Different fibers running on the same thread return different IDs. " From Joe Duffy http://www.bluebytesoftware.com/blog/2006/11/10/FibersAndTheCLR.aspx

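The point that Thread.ManagedThreadId belongs to the managed Thread object can be observed with a minimal sketch like the one below. The names ManagedIdDemo, IdSeenInsideMatchesObject and IdOfNewThread are hypothetical; the sketch only shows that the ID read inside the running thread matches the ID on the Thread object and differs between threads, and it says nothing about the underlying OS thread.

```csharp
using System;
using System.Threading;

static class ManagedIdDemo
{
    // Hypothetical helper: checks that the ID seen from inside the running
    // thread equals the ID exposed by the managed Thread object.
    public static bool IdSeenInsideMatchesObject()
    {
        int seenInside = 0;
        var t = new Thread(() => { seenInside = Thread.CurrentThread.ManagedThreadId; });
        t.Start();
        t.Join();
        return seenInside == t.ManagedThreadId;
    }

    // Hypothetical helper: returns the ManagedThreadId of a freshly started thread.
    public static int IdOfNewThread()
    {
        int id = 0;
        var t = new Thread(() => { id = Thread.CurrentThread.ManagedThreadId; });
        t.Start();
        t.Join();
        return id;
    }

    static void Main()
    {
        int callerId = Thread.CurrentThread.ManagedThreadId;
        Console.WriteLine(IdSeenInsideMatchesObject());
        Console.WriteLine(callerId != IdOfNewThread());
    }
}
```
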
删除会话 2024-11-21 12:34:58

Look here; there is a mapping between managed (i.e. CLR) primitives and unmanaged (i.e. NT kernel) ones that may answer most of your questions.
