How would one implement a compiler that recognizes iterators?
I have been using iterators for a while and I love them.
But although I have thought hard about it, I could not figure out how a compiler that recognizes iterators would be implemented. I have also researched it, but could not find any resource explaining the situation in a compiler-design context.
To elaborate, most of the articles about iterators imply there is some sort of 'magic' implementing the desired behaviour. They suggest the compiler maintains a state machine in order to track where execution is (where the last 'yield return' was seen). I am especially interested in this property of iterators, which enables lazy evaluation.
By the way, I know what state machines are, have already taken a compiler design course, and have studied the Dragon Book. But apparently, I cannot relate what I have studied to the 'magic' of csc.
Any knowledge or different perspectives are appreciated.
Comments (2)
It's simpler than it seems. The compiler can decompose the iterator function into individual chunks; chunks are divided by yield statements. The state machine just needs to keep track of which chunk we're currently in, and upon the next invocation of the iterator, jumps directly to this chunk. We also need to keep track of all local variables (of course).
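The chunk idea above can be sketched by hand. The following is only an illustration, not the code csc actually emits: the names IteratorSketch, Numbers and NumbersStateMachine are made up for this example, and the real generated class has extra states and also implements IEnumerable. It shows a three-chunk iterator next to a hand-written class whose state field selects the chunk and whose x field replaces the local variable.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class IteratorSketch
{
    // What the programmer writes: three chunks separated by yield statements.
    public static IEnumerable<int> Numbers()
    {
        int x = 1;
        yield return x;   // end of chunk 0
        x += 10;
        yield return x;   // end of chunk 1
        x += 100;
        yield return x;   // end of chunk 2
    }
}

// Hand-written approximation of the lowering (hypothetical names,
// simplified compared to what a real compiler generates).
public sealed class NumbersStateMachine : IEnumerator<int>
{
    private int _state;   // which chunk runs on the next MoveNext call
    private int _x;       // the local variable x, hoisted into a field

    public int Current { get; private set; }
    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:                 // chunk 0: code before the first yield
                _x = 1;
                Current = _x;
                _state = 1;
                return true;
            case 1:                 // chunk 1: code between yields 1 and 2
                _x += 10;
                Current = _x;
                _state = 2;
                return true;
            case 2:                 // chunk 2: code between yields 2 and 3
                _x += 100;
                Current = _x;
                _state = 3;
                return true;
            default:                // past the last chunk: sequence is over
                return false;
        }
    }

    public void Reset() => throw new NotSupportedException();
    public void Dispose() { }
}
```

Nothing runs until MoveNext is called, and each call executes exactly one chunk, which is where the lazy evaluation comes from.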
Then, we need to consider a few special cases, in particular loops containing yields. Fortunately, IL (but not C# itself) allows goto to jump into loops and resume them.
Notice that there are some very complicated edge cases, e.g. C# doesn't allow yield in finally blocks because it would be very difficult (impossible?) to leave the function upon yield, and later resume the function, perform clean-up, re-throw any exception and preserve the stack trace.
Eric Lippert has posted an in-depth description of the process. (Read the articles he has linked to, as well!)
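For the loop case mentioned above, here is a rough hand-written sketch, assuming a simple counting iterator. The names LoopSketch, CountTo and CountToStateMachine are hypothetical. Since C# source cannot goto into the middle of a loop, this version flattens the for loop into its initializer, increment and condition, so that resuming in state 1 effectively re-enters the loop right after the yield. At the IL level the same effect can be had with a plain branch.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class LoopSketch
{
    // What the programmer writes: a yield inside a loop.
    public static IEnumerable<int> CountTo(int n)
    {
        for (int i = 1; i <= n; i++)
            yield return i;
    }
}

// Hand-written approximation (hypothetical names, simplified).
public sealed class CountToStateMachine : IEnumerator<int>
{
    private readonly int _n;  // the parameter n, captured in a field
    private int _i;           // the loop variable i, hoisted into a field
    private int _state;       // 0 = not started, 1 = suspended in loop, 2 = done

    public CountToStateMachine(int n) { _n = n; }

    public int Current { get; private set; }
    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:          // first call: run the loop initializer
                _i = 1;
                break;
            case 1:          // resuming just after the yield in the loop body
                _i++;        // the loop increment
                break;
            default:         // already finished
                return false;
        }

        // The loop condition, shared by both entry points.
        if (_i <= _n)
        {
            Current = _i;    // the value passed to yield return
            _state = 1;      // next call resumes inside the loop
            return true;
        }

        _state = 2;          // loop exited: the iterator is done
        return false;
    }

    public void Reset() => throw new NotSupportedException();
    public void Dispose() { }
}
```

The design choice here is to keep one state per yield rather than one per loop iteration: the loop variable lives in a field, so a single resume point is enough no matter how many times the loop runs.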
One thing I would try is to write a short example in C#, compile it, and then use Reflector on it. I think that "yield return" is just syntactic sugar, so you should be able to see how the compiler handles it in the output of the disassembler.
But, well, I don't really know much about these things, so maybe I'm completely wrong.