How do closures work behind the scenes? (C#)
I feel I have a pretty decent understanding of closures, how to use them, and when they can be useful. But what I don't understand is how they actually work behind the scenes in memory. Some example code:
    public Action Counter()
    {
        int count = 0;
        Action counter = () =>
        {
            count++;
        };
        return counter;
    }
Normally, if count were not captured by the closure, its lifetime would be scoped to the Counter() method, and once the method completed it would go away with the rest of the stack allocation for Counter(). What happens, though, when it is captured? Does the whole stack allocation for this call of Counter() stick around? Is count copied to the heap? Or is it never actually allocated on the stack, because the compiler recognizes it as captured and so always places it on the heap?
For this particular question, I'm primarily interested in how this works in C#, but would not be opposed to comparisons against other languages that support closures.
Comments (4)
Your third guess is correct. The compiler will generate code like this:
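The generated code did not survive in this copy of the answer. As a minimal sketch of the kind of lowering being described, the method from the question could be rewritten by hand roughly as below. The names `CounterOwner`, `Locals` and `Anonymous` are hypothetical (the real compiler emits unspeakable names along the lines of `<>c__DisplayClass0_0`), and the `ReadCount` helper is an addition for illustration only, so that the hoisted field can be observed.

```csharp
using System;

public class CounterOwner
{
    // Hand-written stand-in for the compiler-generated closure class (hypothetical name).
    private sealed class Locals
    {
        public int count;              // the captured local, hoisted to a field

        public void Anonymous()        // the lambda body, now an instance method
        {
            this.count++;
        }
    }

    public Action Counter()
    {
        Locals locals = new Locals();  // a fresh heap object on every call
        locals.count = 0;
        return new Action(locals.Anonymous);
    }

    // Illustration-only helper: reads the hoisted field through the delegate's target.
    public static int ReadCount(Action counter)
    {
        return ((Locals)counter.Target).count;
    }
}
```

In this sketch `count` never lives on the stack at all: every read and write goes through the `Locals` instance, which stays alive for as long as the returned delegate does.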
Make sense?
Also, you asked for comparisons. VB and JScript both create closures in pretty much the same way.
The compiler (as opposed to the runtime) creates another class/type. The function containing your closure and any variables you closed over/hoisted/captured are rewritten throughout your code as members of that class. A closure in .NET is implemented as one instance of this hidden class.
That means your count variable is a member of a different class entirely, and the lifetime of that instance works like that of any other CLR object: it is not eligible for garbage collection until it is no longer rooted. So as long as you hold a callable reference to the method, it is not going anywhere.
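A small sketch of the observable consequence, assuming a hypothetical `ClosureLifetimeDemo` class: two lambdas created in the same call see the same `count`, while a second call gets its own independent copy.

```csharp
using System;

public static class ClosureLifetimeDemo
{
    // Both lambdas capture the same local, so they share one piece of state
    // that outlives this method call.
    public static (Action increment, Func<int> read) MakeCounter()
    {
        int count = 0;
        Action increment = () => count++;
        Func<int> read = () => count;
        return (increment, read);
    }
}
```

Calling `increment` twice and then `read` on the same pair yields 2, while a pair from a separate `MakeCounter()` call still reads 0. The captured state is kept alive by the delegates themselves, not by the stack frame that created them.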
Thanks @HenkHolterman. Since Eric had already explained it, I added the link only to show the actual class the compiler generates for a closure. I would add that the display classes created by the C# compiler can lead to memory leaks. For example, suppose a function has an int variable captured by one lambda expression, and another local variable, holding a reference to a large byte array, captured by a different lambda in the same scope. The compiler creates a single display class instance that holds both variables, the int and the byte array reference. As a result, the byte array cannot be garbage collected for as long as either lambda is still referenced, even if only the lambda that uses the int is kept.
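A minimal sketch of that situation, assuming a hypothetical `SharedClosureDemo` class. It relies on the behavior of current C# compilers, which place all captured locals of one scope in a single display class, and it uses `Delegate.Target` to show that both lambdas point at the same object.

```csharp
using System;

public static class SharedClosureDemo
{
    // Returns a small counter lambda; a second lambda capturing the large
    // buffer is handed back through the out parameter.
    public static Func<int> MakeCounter(out Action touchBuffer)
    {
        int count = 0;
        byte[] buffer = new byte[1024 * 1024];

        Func<int> next = () => ++count;        // captures only count
        touchBuffer = () => buffer[0] = 1;     // captures only buffer
        return next;
    }
}
```

If `next.Target` and `touchBuffer.Target` are the same object, then keeping only `next` alive also keeps `buffer` reachable. One way around this is to create the two lambdas in separate scopes or methods, so that each gets its own display class.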
Eric Lippert's answer really hits the point. However it would be nice to build a picture of how stack frames and captures work in general. To do this it helps to look at a slightly more complex example.
Here is the capturing code:
And here is what I think the equivalent would be (if we are lucky Eric Lippert will comment on whether this is actually correct or not):
The point is that the local class substitutes for the entire stack frame and is initialized accordingly each time the Counter method is invoked. Typically the stack frame includes a reference to 'this', plus method arguments, plus local variables. (The stack frame is also in effect extended when a control block is entered.)
Consequently, we do not have just one object corresponding to the captured context; instead, we have one object per captured stack frame.
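The two listings referred to above are missing from this copy. As a rough illustration of the idea rather than a reconstruction of them, here is a sketch with a hypothetical `Scorekeeper` class whose lambda captures `this`, a parameter, and a local, next to a hand-written `Frame` class (also a hypothetical name) that plays the role of the captured stack frame.

```csharp
using System;

public class Scorekeeper
{
    private int swish = 7;

    // Lambda version: captures 'this' (for swish), the parameter, and a local.
    public Func<int> Counter(int start)
    {
        int count = 0;
        return () => count += start + swish;
    }

    // Hand-written stand-in for the captured frame.
    private sealed class Frame
    {
        public Scorekeeper self;   // plays the role of 'this'
        public int start;          // the method argument
        public int count;          // the local variable

        public int Anonymous()
        {
            count += start + self.swish;
            return count;
        }
    }

    // Equivalent by hand: one Frame object per invocation.
    public Func<int> CounterLowered(int start)
    {
        Frame frame = new Frame { self = this, start = start, count = 0 };
        return frame.Anonymous;
    }
}
```

Each call to `CounterLowered` allocates a new `Frame`, so delegates from different calls never share `count`, which matches the "one object per captured stack frame" description.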
Based on this, we can use the following mental model: stack frames are kept on the heap (instead of on the stack), while the stack itself just contains pointers to the stack frames that are on the heap. Lambda methods contain a pointer to the stack frame. This is done using managed memory, so the frame sticks around on the heap until it is no longer needed.
Obviously the compiler can implement this by only using the heap when the heap object is required to support a lambda closure.
What I like about this model is that it provides an integrated picture for 'yield return'. We can think of an iterator method (using yield return) as if its stack frame were created on the heap and the referencing pointer stored in a local variable in the caller, for use during the iteration.
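A short sketch of that picture, assuming a hypothetical `IteratorFrameDemo` class: the local `current` survives between `MoveNext` calls because the compiler turns the iterator method into a heap-allocated state machine object whose fields hold the locals.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorFrameDemo
{
    // The local 'current' must survive across yield points, so it cannot
    // live in an ordinary stack frame.
    public static IEnumerable<int> CountFrom(int start)
    {
        int current = start;
        while (true)
        {
            yield return current;
            current++;
        }
    }
}
```

Two enumerators obtained from separate `CountFrom` calls advance independently, each holding what the mental model above would call its own heap-resident frame.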