实施“生成器”支持自定义语言

发布于 2024-08-29 04:04:26 字数 399 浏览 9 评论 0原文

我对语言设计有一点迷恋,目前我正在尝试自己喜欢的语言。 (http://rogeralsing.com/2010/04/14/playing-with - Plastic/

真正让我心碎的一件事是“生成器”和“yield”关键字。 我知道 C# 使用 AST 转换将枚举器方法转换为状态机。

但它在其他语言中是如何工作的呢? 有什么方法可以在不进行 AST 转换的语言中获得生成器支持吗? 例如,像 Python 或 Ruby 这样的语言是否会通过 AST 转换来解决这个问题?

(问题是生成器如何在不同语言的底层实现,而不是如何用其中一种语言编写生成器)

I've got a bit of fettish for language design and I'm currently playing around with my own hobby language. (http://rogeralsing.com/2010/04/14/playing-with-plastic/)

One thing that really makes my mind bleed is "generators" and the "yield" keyword.
I know C# uses AST transformation to transform enumerator methods into statemachines.

But how does it work in other languages?
Is there any way to get generator support in a language w/o AST transformation?
e.g. Does languages like Python or Ruby resort to AST transformations to solve this to?

(The question is how generators are implemented under the hood in different languages, not how to write a generator in one of them)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

风月客 2024-09-05 04:04:26

生成器基本上是半协程,有一些恼人的限制。因此,显然,您可以使用半协程(当然还有完整协程)来实现它们。

如果您没有协程,则可以使用任何其他通用控制流结构。有很多控制流结构是“通用的”,每个控制流结构(包括所有其他通用控制流结构),包括协程,因此生成器可以(或多或少) )只是简单地转化为那个通用结构。

其中最著名的可能是GOTO。只需使用GOTO,您就可以构建任何其他控制流结构:IF-THEN-ELSEWHILE、< code>FOR、REPEAT-UNTILFOREACH、异常、线程、子程序调用、方法调用、函数调用等等,当然还有协程和发电机。

几乎所有的CPU都支持GOTO(尽管在CPU中,他们通常称之为jmp)。事实上,在许多 CPU 中,GOTO唯一的控制流结构,尽管现在至少支持子例程调用 (call),也许通常还内置了一些原始形式的异常处理和/或并发原语(比较和交换)。

另一个众所周知的控制流原语是延续。延续基本上是 GOTO 的一种更结构化、更易于管理且不那么邪恶的变体,在函数式语言中尤其流行。但也有一些低级语言将其控制流基于延续,例如 Parrot 虚拟机使用延续来控制流,我相信在某些研究实验室中甚至有一些基于延续的 CPU。

C 有一种“蹩脚”形式的延续(setjmplongjmp),它们比“真正的”延续功能更弱,也更不容易使用,但它们足够强大来实现生成器(事实上,可以用来实现完整的延续)。

在 Unix 平台上,setcontext 可以用作 setjmp/longjmp 的更强大、更高级别的替代方案。

另一个众所周知的控制流构造是例外,但可能不会因为低级底层构建而想到,其他控制流构造。有一篇论文表明异常可以比延续更强大,因此使得异常本质上等同于 GOTO ,因此具有普遍的强大功能。事实上,异常有时被用作通用控制流结构:Microsoft Volta 项目将 .NET 字节码编译为 JavaScript,使用 JavaScript 异常来实现 .NET 线程和生成器。

不通用,但可能强大到足以实现生成器,这只是简单的尾调用优化。 (不过,我可能是错的。不幸的是,我没有证据。)我认为您可以将生成器转换为一组相互尾递归函数。我知道状态机可以使用尾部调用来实现,所以我很确定生成器也可以,因为毕竟 C# 将生成器实现为状态机。 (我认为这与惰性求值结合起来效果特别好。)

最后但并非最不重要的一点是,在具有具体化调用堆栈的语言中(例如大多数 Smalltalks),您可以构建几乎任何类型的您想要的控制流结构。 (事实上​​,具体化的调用堆栈基本上是相当于功能高级延续的过程低级。)

那么,生成器的其他实现是什么样的?

Lua 本身没有生成器,但它有完整的非对称协程。主要的 C 实现使用 setjmp/longjmp 来实现它们。

Ruby 本身也没有生成器,但它有Enumerator,可以用作生成器。 枚举器不是语言的一部分,它们是一个库功能。 MRI 使用延续实现Enumerator,而延续又使用setjmp/longjmp 实现。 YARV 使用Fiber(Ruby 拼写“协程”的方式)实现Enumerator,并且那些使用setjmp实现代码>/<代码>longjmp。我相信 JRuby 目前使用线程实现枚举器,但一旦 JVM 获得了一些更好的控制流结构,他们就希望切换到更好的东西。

Python 的生成器实际上或多或少是成熟的协程。 CPython 使用 setjmp/longjmp 实现它们。

Generators are basically semi-coroutines with some annoying limitations. So, obviously, you can implement them using semi-coroutines (and full coroutines, of course).

If you don't have coroutines, you can use any of the other universal control flow constructs. There are a lot of control flow constructs that are "universal" in the sense that every control flow construct (including all the other universal control flow constructs), including coroutines and thus generators can be (more or less) trivially transformed into only that universal construct.

The most well-known of those is probably GOTO. With just GOTO, you can build any other control flow construct: IF-THEN-ELSE, WHILE, FOR, REPEAT-UNTIL, FOREACH, exceptions, threads, subroutine calls, method calls, function calls and so on, and of course also coroutines and generators.

Almost all CPUs support GOTO (although in a CPU, they usually call it jmp). In fact, in many CPUs, GOTO is the only control flow construct, although today native support for at least subroutine calls (call) and maybe some primitive form of exception handling and/or concurrency primitive (compare-and-swap) are usually also built in.

Another well-known control flow primitive are continuations. Continuations are basically a more structured, better manageable and less evil variant of GOTO, especially popular in functional languages. But there also some low-level languages that base their control flow on continuations, for example the Parrot Virtual Machine uses continuations for control flow and I believe there are even some continuation-based CPUs in some research lab somewhere.

C has a sort-of "crappy" form of continuations (setjmp and longjmp), that are much less powerful and less easy to use than "real" continuations, but they are plenty powerful enough to implement generators (and in fact, can be used to implement full continuations).

On a Unix platform, setcontext can be used as a more powerful and higher level alternative to setjmp/longjmp.

Another control flow construct that is well-known, but doesn't probably spring to mind as a low-level substrate build other control flow constructs on top of, are exceptions. There is a paper that shows that exceptions can be more powerful than continuations, thus making exceptions essentially equivalent to GOTO and thus universally powerful. And, in fact, exceptions are sometimes used as universal control flow constructs: the Microsoft Volta project, which compiled .NET bytecode to JavaScript, used JavaScript exceptions to implement .NET threads and generators.

Not universal, but probably powerful enough to implement generators is just plain tail call optimization. (I might be wrong, though. I unfortunately don't have a proof.) I think you can transform a generator into a set of mutually tail-recursive functions. I know that state machines can be implemented using tail calls, so I'm pretty sure generators can, too, since, after all, C# implements generators as state machines. (I think this works especially well together with lazy evaluation.)

Last but not least, in a language with a reified call stack (like most Smalltalks for example), you can build pretty much any kind of control flow constructs you want. (In fact, a reified call stack is basically the procedural low-level equivalent to the functional high-level continuation.)

So, what do other implementations of generators look like?

Lua doesn't have generators per se, but it has full asymmetric coroutines. The main C implementation uses setjmp/longjmp to implement them.

Ruby also doesn't have generators per se, but it has Enumerators, that can be used as generators. Enumerators are not part of the language, they are a library feature. MRI implements Enumerators using continuations, which in turn are implemented using setjmp/longjmp. YARV implements Enumerators using Fibers (which is how Ruby spells "coroutines"), and those are implemented using setjmp/longjmp. I believe JRuby currently implements Enumerators using threads, but they want to switch to something better as soon as the JVM gains some better control flow constructs.

Python has generators that are actually more or less full-blown coroutines. CPython implements them using setjmp/longjmp.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文