What exactly happens when I spawn a new thread in .NET?
I want to understand what precisely is happening behind the scene when I spawn a new thread in .NET, something like here:
Thread t = new Thread(DoWork); //I am not interested in DoWork per se
t.Start();
1. What thread-related objects are created in CLR and Windows kernel?
2. Why are those objects needed?
3. How much managed/unmanaged memory (heap and stack) is allocated on x86, x64 Windows?
UPDATE
I am looking for objects such as the managed thread object, which I assume is t, and perhaps some additional managed objects, as well as the kernel thread object, the user-mode thread environment block and the like.
Many thanks!
Comments (3)
Win32 and Kernel memory allocated
I'm not exactly sure how the .NET part works, but if the runtime does decide to create a real thread with the OS, it would eventually call the Win32 API CreateThread in kernel32.dll, probably from mscorlib.ni.dll.
By default, new threads get 1 MB of virtual address space for the stack, which is committed as needed. This can be controlled with the maxStackSize parameter. The main thread's stack size comes from a parameter in the executable file itself.
In the process's address space, a TEB (thread environment block) will be allocated. Incidentally, the FS register on x86 points to this for things like thread local storage and structured exception handling (SEH). There are probably other things allocated by Win32 that are not documented.
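As a rough illustration of the maxStackSize parameter mentioned above, the sketch below starts a thread with a smaller stack reservation through the Thread(ThreadStart, int) constructor. The class name StackSizeDemo, the helper RunWithStack and the 256 KB figure are made up for this example; the runtime may round or clamp the requested size.

```csharp
using System;
using System.Threading;

public static class StackSizeDemo
{
    // Hypothetical helper: runs `work` on a dedicated thread whose stack
    // reservation is requested as maxStackSize bytes, waits for the thread
    // to finish, and returns the value it computed.
    public static int RunWithStack(Func<int> work, int maxStackSize)
    {
        int result = 0;
        var t = new Thread(() => result = work(), maxStackSize);
        t.Start();
        t.Join(); // Join also makes the write to `result` visible here
        return result;
    }

    public static void Main()
    {
        // Ask for 256 KB instead of the default reservation (an arbitrary choice).
        int answer = RunWithStack(() => 6 * 7, 256 * 1024);
        Console.WriteLine(answer); // prints 42
    }
}
```

A smaller reservation only reduces the address space set aside per thread; deep recursion on such a thread will overflow sooner.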
When the Win32 thread is created, the Win32 server process (csrss.exe) is contacted. In Process Explorer you can see that csrss holds open handles to all Win32 processes and threads, for some kind of bookkeeping.
DLLs loaded in the process will be notified of the new thread and may allocate their own memory for tracking the thread.
The kernel will create an ETHREAD object (derived from KTHREAD) from kernel non-paged pool to track the thread's state. There will also be a kernel stack allocated (12 KB by default on x86), which can be paged out (unless the thread is in a kernel mode wait state).
Why so many things need to allocate memory for a thread
Threads are the smallest preemptively scheduled unit that the OS provides and there is a lot of context connected to them. Many different components need to provide separate context for each thread because system services need to be able to deal with multiple threads doing different things all at the same time.
Some services require you to declare new threads to them explicitly but most are expected to work with new threads automatically. Sometimes this means allocating space right when the thread is started. As the thread engages other services, the amount of memory used to track the thread can increase as those services set up their own context for the thread.
How much memory is allocated
It's hard to say how much memory is allocated for a thread since it is spread across several address spaces and heaps. It will vary between Windows versions, installed components and what is loaded into the process currently.
The largest cost is generally accepted to be the 1 MB of address space reserved by default for each new thread's stack, but even at that size many hundreds of threads can be created in a single process without running out of address space.
If the design is using many more OS threads than the number of CPUs in the system, it should be reviewed. Work queues backed by a thread pool, or lightweight threads with user mode scheduling (fibers or another library's implementation), should be able to handle multithreading without requiring an excessive number of OS threads, which makes the memory cost of threads unimportant.
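The work-queue approach described above might look roughly like the following sketch, which pushes many small work items through the shared ThreadPool instead of creating one OS thread per item. The class PoolDemo and the method SumSquares are invented for illustration, and the workload is deliberately trivial.

```csharp
using System;
using System.Threading;

public static class PoolDemo
{
    // Hypothetical example: computes 1^2 + 2^2 + ... + n^2 by queuing one
    // work item per term onto the shared thread pool.
    public static int SumSquares(int n)
    {
        int total = 0;
        using (var done = new CountdownEvent(n))
        {
            for (int i = 1; i <= n; i++)
            {
                int value = i; // per-iteration copy so each closure sees its own value
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Interlocked.Add(ref total, value * value);
                    done.Signal();
                });
            }
            done.Wait(); // block until every queued item has signalled
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumSquares(100)); // prints 338350
    }
}
```

The hundred items here are served by however many threads the pool decides to keep, so the per-thread stack and kernel bookkeeping costs are paid only a few times.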
So this is a really complicated question that does not really have a great answer of "x".
Thread object. If you dig into the IL, you'll see many internal calls for actual execution.
ThreadPool usage can greatly reduce this cost of creating them each time.
"The logical abstraction of a thread of control is captured by an instance of the System.Threading.Thread object in the class library." http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-335.pdf
So the ECMA standard really doesn't say anything about the topic. But luckily we have...
"Because the CLR thread object is per-fiber, any information hanging off of it is also per-fiber. Thread.ManagedThreadId returns a stable ID that flows around with the CLR thread. It is not dependent on the identity of the physical OS thread, which means using it implies no form of affinity. Different fibers running on the same thread return different IDs. " From Joe Duffy http://www.bluebytesoftware.com/blog/2006/11/10/FibersAndTheCLR.aspx
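To make the managed-ID point in the quote a little more concrete, here is a small sketch that reads Thread.ManagedThreadId from inside a newly started thread and compares it with the ID of the calling thread. The class ThreadIdDemo and the helper IdOfNewThread are names made up for this example; it shows only the managed ID and says nothing about the underlying OS thread.

```csharp
using System;
using System.Threading;

public static class ThreadIdDemo
{
    // Hypothetical helper: returns the ManagedThreadId as observed from
    // inside a freshly started thread.
    public static int IdOfNewThread()
    {
        int id = 0;
        var t = new Thread(() => id = Thread.CurrentThread.ManagedThreadId);
        t.Start();
        t.Join(); // wait so that `id` has been written before returning
        return id;
    }

    public static void Main()
    {
        int callerId = Thread.CurrentThread.ManagedThreadId;
        int workerId = IdOfNewThread();
        // Two managed threads alive at the same time carry different managed IDs.
        Console.WriteLine(callerId != workerId); // prints True
    }
}
```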
Look here; there is a mapping between managed (i.e. CLR) primitives and unmanaged (i.e. NT kernel) ones that may answer most of your questions.