Erlang 的递归函数不只是 goto 吗?
只是为了把它直接记在我的脑海里。考虑一下 Erlang 代码的这个示例:
test() ->
receive
{From, whatever} ->
%% do something
test();
{From, somethingelse} ->
%% do something else
test();
end.
test() 调用不是只是一个 goto 吗?
我问这个问题是因为在 C 语言中我们了解到,如果执行函数调用,返回位置总是放在堆栈上。我无法想象 Erlang 中一定是这种情况,因为这会导致堆栈溢出。
我们有两种不同的调用函数的方式: goto 和 gosub。 goto 只是将程序流程引导到其他地方,而 gosub 会记住您来自哪里,以便您可以返回。
有了这种思维方式,我可以更容易地了解 Erlang 的递归,因为如果我只是将: test() 作为 goto 读取,那么就完全没有问题了。
因此我的问题是: :Erlang 不是只是使用 goto 而不是记住堆栈上的返回地址吗?
编辑:
只是为了澄清我的观点:
我知道 goto 可以在某些语言中用于跳转。但假设你不做 someFunction() 也可以这样做: goto someFunction() 在第一个示例中,流程返回,在第二个示例中,流程仅在 someFunction 中继续,并且永远不会返回。
因此,我们通过跳转到方法起点来限制正常的 GOTO 行为。
如果你这样看,那么 Erlang 递归函数调用看起来就像一个 goto。
(我认为 goto 是一个函数调用,无法返回您来自的位置)。这正是 Erlang 示例中发生的情况。
Just to get it straight in my head. Consider this example bit of Erlang code:
test() ->
receive
{From, whatever} ->
%% do something
test();
{From, somethingelse} ->
%% do something else
test();
end.
Isn't the test() call, just a goto?
I ask this because in C we learned, if you do a function call, the return location is always put on the stack. I can't imagine this must be the case in Erlang here since this would result in a stackoverflow.
We had 2 different ways of calling functions:
goto and gosub.
goto just steered the program flow somewhere else, and gosub remembered where you came from so you could return.
Given this way of thinking, I can look at Erlang's recursion easier, since if I just read: test() as a goto, then there is no problem at all.
hence my question: isn't :Erlang just using a goto instead of remembering the return address on a stack?
EDIT:
Just to clarify my point:
I know goto's can be used in some languages to jump all over the place. But just supose instead of doing someFunction() you can also do: goto someFunction()
in the first example the flow returns, in the second example the flow just continues in someFunction and never returns.
So we limit the normal GOTO behaviour by just being able to jump to method starting points.
If you see it like this, than the Erlang recursive function call looks like a goto.
(a goto in my opinion is a function call without the ability to return where you came from). Which is exactly what is happening in the Erlang example.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
由于所执行的内务处理,尾递归调用更像是“返回并立即调用另一个函数”而不是 goto。
解决您的最新问题:记录返回点只是调用函数时执行的一小部分内务处理。返回点存储在堆栈帧中,其余部分必须分配和初始化(在正常调用中),包括局部变量和参数。使用尾递归,不需要分配新的帧,也不需要存储返回点(之前的值可以正常工作),但需要执行其余的内务处理。
函数返回时还需要执行一些内务处理,其中包括丢弃局部变量和参数(作为堆栈帧的一部分)并返回到调用点。在尾递归调用期间,当前函数的局部变量在调用新函数之前被丢弃,但不会发生返回。
就像线程允许比进程更轻量级的上下文切换一样,尾调用允许更轻量级的函数调用,因为可以跳过一些内务处理。
Perl 中的“
goto &NAME
”语句更接近符合你的想法,但不完全是,因为它抛弃了当地人。为新调用的函数保留参数。还有一个简单的区别:尾部调用只能跳转到函数入口点,而 goto 可以跳转到大多数地方(某些语言限制 goto 的目标,例如 C,其中 goto 不能跳转到函数之外)。
A tail recursive call is more of a "return and immediately call this other function" than a goto because of the housekeeping that's performed.
Addressing your newest points: recording the return point is just one bit of housekeeping that's performed when a function is called. The return point is stored in the stack frame, the rest of which must be allocated and initialized (in a normal call), including the local variables and parameters. With tail recursion, a new frame doesn't need to be allocated and the return point doesn't need to be stored (the previous value works fine), but the rest of the housekeeping needs to be performed.
There's also housekeeping that needs to be performed when a function returns, which includes discarding locals and parameters (as part of the stack frame) and returning to the call point. During tail recursive call, the locals for the current function are discarded before invoking the new function, but no return happens.
Rather like how threads allow for lighter-weight context switching than processes, tail calls allow for lighter-weight function invocation, since some of the housekeeping can be skipped.
The "
goto &NAME
" statement in Perl is closer to what you're thinking of, but not quite, as it discards locals. Parameters are kept around for the newly invoked function.One more, simple difference: a tail call can only jump to a function entry point, while a goto can jump most anywhere (some languages restrict the target of a goto, such as C, where goto can't jump outside a function).
你是对的,Erlang编译器会检测到这是一个尾递归调用,并且它不会继续在堆栈上移动,而是重用当前函数的堆栈空间。
此外,它还能够检测循环尾递归,例如
也将被优化。
这样做的“不幸”副作用是,当您跟踪函数调用时,您将无法看到尾递归函数的每次调用,而是看到入口点和出口点。
You are correct, the Erlang compiler will detect that it is a tail recursive call, and instead of moving on on the stack, it reuses the current function's stack space.
Furthermore it is also able to detect circular tail-recursion, e.g.
will also be optimized.
The "unfortunate" side-effect of this is that when you are tracing function calls, you will not be able to see each invocation of a tail recursive function, but the entry and exit point.
这里你有两个问题。
首先,不,在这种情况下,您不会面临堆栈溢出的任何危险,因为对 test() 的这些调用都是 尾递归。
其次,不,函数调用不是 goto,它们是函数调用。 :) 让 goto 出现问题的是它绕过了代码中的任何结构。你可以跳出语句、跳入语句、绕过赋值……各种古怪的行为。函数调用不存在这个问题,因为它们有明显的控制流程。
You've got two questions here.
First, no, you're not in any danger of overrunning the stack in this case because these calls to test() are both tail-recursive.
Second, no, function calls are not goto, they're function calls. :) The thing that makes goto problematic is that it bypasses any structure in your code. You can jump out of statements, jump into statements, bypass assignments...all kinds of screwiness. Function calls don't have this problem because they have an obvious flow of control.
我认为这里的区别在于“真正的”goto 和在某些情况下看起来像 goto 的东西之间的区别。在某些特殊情况下,编译器可以检测到在调用另一个函数之前可以自由地清理当前函数的堆栈。这是当该调用是函数中的最后一次调用时。当然,不同之处在于,与任何其他调用一样,您可以将参数传递给新函数。
正如其他人指出的那样,这种优化不仅限于递归调用,还限于所有最后的调用。这用于 FSM 编程的“经典”方式。
I think the difference here is between a "real" goto and what can in some cases seem like a goto. In some special cases the compiler can detect that it is free to cleanup the stack of the current function before calling another function. This is when the call is the last call in a function. The difference is, of course, that as in any other call you can pass arguments to the new function.
As others have pointed out this optimisation is not restricted to recursive calls but to all last calls. This is used in the "classic" way of programming FSMs.
这是一个
goto
,就像if
是goto
和while
是goto
一样。它是使用(道德上等同于)goto
来实现的,但它并没有直接向程序员暴露goto
的全部潜力。It's a
goto
in the same why thatif
isgoto
andwhile
isgoto
. It is implemented using (the moral equivalent of)goto
, but it does not expose the full shoot-self-in-foot potential ofgoto
directly to the programmer.事实上,根据 Guy Steele 的说法,这些递归函数是终极 GOTO。
In fact, these recursive functions are the ultimate GOTO according to Guy Steele.
这是一个更通用的答案,它取代了我之前基于调用堆栈的答案。由于之前的答案已被接受,我不会替换文本。
序言
有些架构没有所谓的“函数”,但确实有类似的东西(消息传递可能称它们为“方法”或“消息处理程序”;基于事件的架构有“事件处理程序”或简称“处理程序”) ”)。对于一般情况,我将使用术语“代码块”和“调用”,尽管(严格来说)“代码块”可以包含不完全是函数的东西。您可以将“调用”的适当变形形式替换为“调用”或“调用”,就像我在某些地方所做的那样。描述调用的体系结构的特征有时称为“样式”,如“连续传递样式”(CPS),尽管这以前不是官方术语。为了防止事情过于抽象,我们将检查调用堆栈、连续传递、消息传递(类似于 OOP)和事件处理调用样式。我应该指定这些样式所使用的模型,但为了节省空间,我将它们省略了。
调用功能
,或者,C 代表延续、协调和上下文,这对我来说已经足够了
Hohpe 标识了调用堆栈风格的三个很好的头韵调用特征:Continuation、Coordination、Context(全部大写以将它们与单词的其他用法区分开)。
这三个特征不一定是独立的;调用风格决定了它们的相互关系。例如,在调用堆栈样式下,协调与延续相关联。 Continuation 和 Context 一般是相连的,因为 Continuation 中涉及到返回值。
Hohpe 的列表不一定详尽,但足以区分尾部调用和 goto。警告:我可能会偏离主题,例如根据 Hohpe 的功能探索调用空间,但我会尽力克制自己。
调用功能任务
每个调用功能都涉及调用代码块时要完成的任务。对于Continuation,被调用的代码块自然地通过调用代码链相关联。当调用代码块时,通过在链的末尾放置对调用代码的引用(“调用引用”)来扩展当前调用链(或“调用链”)(下面更具体地描述此过程) 。考虑到调用还涉及将名称绑定到代码块和参数,我们发现即使是非束缚和纪律的语言也可以有同样的乐趣。
尾部调用
或,答案
或,其余基本上是不必要的
尾部调用都是为了优化延续,它是一个识别何时可以跳过主要延续任务(记录调用引用)的问题。其他功能任务是独立的。 “goto”代表优化延续和上下文的任务。这就是为什么尾部调用不是简单的“goto”的原因。接下来将充实尾部调用在各种调用风格中的样子。
特定调用风格中的尾部调用
不同的风格以不同的结构排列调用链,由于缺乏更好的词,我将其称为“缠结”。我们已经摆脱了意大利面条式的代码,这不是很好吗?
事件处理风格怎么样?
对于事件处理,调用没有响应,处理程序也不会等待,因此“调用链”(如上面所使用的)不是一个有用的概念。您拥有事件的优先级队列(由通道拥有)和订阅(侦听器-处理程序对的列表),而不是混乱。在某些事件驱动架构中,通道是侦听器的属性;每个听众都拥有一个频道,因此频道就成为听众的代名词。调用意味着在通道上触发一个事件,该事件会调用所有订阅的侦听器处理程序;参数作为事件的属性传递。依赖于另一种样式的响应的代码将成为事件处理下的单独处理程序,并具有关联的事件。尾部调用是一个处理程序,它在另一个通道上触发事件,之后不执行任何其他操作。尾调用优化将涉及从第二个通道到第一个通道重新订阅事件侦听器,或者可能让在第一个通道上触发事件的处理程序而不是在第二个通道上触发(这是由程序员而不是编译器进行的优化) /口译员)。这是之前的优化的样子,从未优化的版本开始。
优化版本:
请注意,尾部调用在事件处理下更加棘手(站不住脚?),因为它们必须考虑订阅。如果爱丽丝后来取消订阅 BBC 新闻的“就职典礼”,则也需要取消订阅 CNN 的就职典礼。此外,系统必须确保它不会不恰当地多次为侦听器调用处理程序。在上面的优化示例中,如果 CNN 上的另一个“就职典礼”处理程序在 BBC 新闻上触发“就职典礼”怎么办? Alice 的“聚会”活动将被解雇两次,这可能会给她带来工作上的麻烦。一种解决方案是让 *Bob 在步骤 4 中取消订阅 BBC 新闻中的所有听众“就职典礼”,但随后您会引入另一个错误,其中 Alice 将错过不通过 CNN 提供的就职典礼活动。也许她想同时庆祝美国和英国的就职典礼。出现这些问题是因为我在模型中没有做出区分,可能是基于订阅类型。例如,也许有一种特殊的一次性订阅(例如 System-V 信号处理程序)或某些处理程序自行取消订阅,尾调用优化仅适用于这些情况。
接下来怎么办?
您可以继续更全面地指定调用功能任务。从那里,您可以找出可以进行哪些优化以及何时可以使用它们。也许可以识别其他调用特征。您还可以想到更多调用样式的示例。您还可以探索调用功能之间的依赖关系。例如,同步和异步调用涉及显式耦合或解耦延续和协调。它永远不会结束。
明白这一切了吗?我自己还在努力消化。
参考文献:
Here's a more general answer, which supercedes my earlier answer based on call-stacks. Since the earlier answer has been accepted, I won't replace the text.
Prologue
Some architectures don't have things they call "functions" that are "called", but do have something analogous (messaging may call them "methods" or "message handlers"; event based architectures have "event handlers" or simply "handlers"). I'll be using the terms "code block" and "invocation" for the general case, though (strictly speaking) "code block" can include things that aren't quite functions. You can substitute the appropriately inflected form of "call" for "invocation" or "invoke", as I might in a few places. The features of an architecture that describe invocation are sometimes called "styles", as in "continuation passing style" (CPS), though this isn't previously an official term. To keep things from being too abstract, we'll examine call stack, continuation passing, messaging (à la OOP) and event handling invocation styles. I should specify the models I'm using for these styles, but I'm leaving them out in the interest of space.
Invocation Features
or, C Is For Continuation, Coordination and Context, That's Good Enough For Me
Hohpe identifies three nicely alliterative invocation features of the call-stack style: Continuation, Coordination, Context (all capitalized to distinguish them from other uses of the words).
The three features aren't necessarily independent; invocation style determines their interrelationships. For instance, Coordination is tied to Continuation under the call-stack style. Continuation and Context are connected in general, since return values are involved in Continuation.
Hohpe's list isn't necessarily exhaustive, but it will suffice to distinguish tail-calls from gotos. Warning: I might go off on tangents, such as exploring invocation space based on Hohpe's features, but I'll try to contain myself.
Invocation Feature Tasks
Each invocation feature involves tasks to be completed when invoking a code block. For Continuation, invoked code blocks are naturally related by a chain of invoking code. When a code block is invoked, the current invocation chain (or "call chain") is extended by placing a reference (an "invocation reference") to the invoking code at the end of the chain (this process is described more concretely below). Taking into account invocation also involves binding names to code blocks and parameters, we see even non-bondage-and-discipline languages can have the same fun.
Tail Calls
or, The Answer
or, The Rest Is Basically Unnecessary
Tail calling is all about optimizing Continuation, and it's a matter of recognizing when the main Continuation task (recording an invocation reference) can be skipped. The other feature tasks stand on their own. A "goto" represents optimizing away tasks for Continuation and Context. That's pretty much why a tail call isn't a simple "goto". What follows will flesh out what tail calls look like in various invocation styles.
Tail Calls In Specific Invocation Styles
Different styles arrange invocation chains in different structures, which I'll call a "tangle", for lack of a better word. Isn't it nice that we've gotten away from spaghetti code?
What About Event Handling Style?
With event handling, invocations don't have responses and handlers don't wait, so "invocation chains" (as used above) isn't a useful concept. Instead of a tangle, you have priority queues of events, which are owned by channels, and subscriptions, which are lists of listener-handler pairs. In some event driven architectures, channels are properties of listeners; every listener owns exactly one channel, so channels become synonymous with listeners. Invoking means firing an event on a channel, which invokes all subscribed listener-handlers; parameters are passed as properties of the event. Code that would depend on a response in another style becomes a separate handler under event handling, with an associated event. A tail call would be a handler that fires the event on another channel and does nothing else afterwards. Tail call optimization would involve re-subscribing listeners for the event from the second channel to the first, or possibly having the handler that fired the event on the first channel instead fire on the second channel (an optimization made by the programmer, not the compiler/interpreter). Here's what the former optimization looks like, starting with the un-optimized version.
And the optimized version:
Note tail calls are trickier (untenable?) under event handling because they have to take into account subscriptions. If Alice were later to unsubscribe from "inauguration" on BBC News, the subscription to inauguration on CNN would also need to be canceled. Additionally, the system must ensure it doesn't inappropriately invoke a handler multiple times for a listener. In the above optimized example, what if there's another handler for "inauguration" on CNN that fires "inauguration" on BBC News? Alice's "party" event will be fired twice, which may get her in trouble at work. One solution is to have *Bob unsubscribe all listeners from "inauguration" on BBC News in step 4, but then you introduce another bug wherein Alice will miss inauguration events that don't come via CNN. Maybe she wants to celebrate both the U.S. and British inaugurations. These problems arise because there are distinctions I'm not making in the model, possibly based on types of subscriptions. For instance, maybe there's a special kind of one-shot subscription (like System-V signal handlers) or some handlers unsubscribe themselves, and tail call optimization is only applied in these cases.
What's next?
You could go on to more fully specify invocation feature tasks. From there, you could figure out what optimizations are possible, and when they can be used. Perhaps other invocation features could be identified. You could also think of more examples of invocation styles. You could also explore the dependencies between invocation features. For instance, synchronous and asynchronous invocation involve explicitly coupling or uncoupling Continuation and Coordination. It never ends.
Get all that? I'm still trying to digest it myself.
References:
在这种情况下,可以进行尾部调用优化,因为我们不需要做更多的工作或使用局部变量。所以编译器会将其转换为循环。
In this case it is possible to do tail-call optimization, since we don't need to do more work or make use of local variables. So the compiler will convert this into a loop.
这不是 Erlang 中发生的事情,你可以返回到你原来的地方。
这些调用是尾递归的,这意味着它“有点”是 goto。在尝试理解或编写任何代码之前,请确保您了解尾递归是什么。如果您是 Erlang 新手,那么阅读 Joe Armstrong 的书可能不是一个坏主意。
从概念上讲,如果您使用 test() 调用自己,则使用您传递的任何参数(本例中没有)调用函数的开头,但不会将任何内容添加到堆栈中。因此,所有变量都被丢弃,函数重新开始,但您没有将新的返回指针推入堆栈。因此,它就像 goto 和传统命令式语言风格函数调用(如 C 或 Java 中的函数调用)之间的混合体。但是从调用函数的第一次调用开始,堆栈上仍然有一个条目。因此,当您最终通过返回一个值而不是执行另一个 test() 退出时,该返回位置将从堆栈中弹出,并在调用函数中恢复执行。
That is not what's happening in Erlang, you CAN return to where you came from.
The calls are tail-recursive, which means that it is "sort of" a goto. Make sure you understand what tail-recursion is before you attempt to understand or write any code. Reading Joe Armstrong's book probably isn't a bad idea if you are new to Erlang.
Conceptually, in the case where you call yourself using test() then a call is made to the start of the function using whatever parameters you pass (none in this example) but nothing more is added to the stack. So all your variables are thrown away and the function starts fresh, but you didn't push a new return pointer onto the stack. So it's like a hybrid between a goto and a traditional imperative language style function call like you'd have in C or Java. But there IS still one entry on the stack from the very first call from the calling function. So when you eventually exit by returning a value rather the doing another test() then that return location is popped from the stack and execution resumes in your calling function.