在 JVM 中运行时在 Scala 中使用递归

发布于 2024-11-01 14:55:45 字数 769 浏览 7 评论 0原文

通过在本站和网络上的其他地方搜索，发现 JVM 不支持尾调用优化。因此，这是否意味着如果要在 JVM 上运行，则不应编写如下所示的可能在非常大的输入列表上运行的尾递归 Scala 代码？

// Get the nth element in a list    
def nth[T](n : Int, list : List[T]) : T = list match {
            case Nil => throw new IllegalArgumentException
            case _ if n == 0 => throw new IllegalArgumentException
            case _ :: tail if n == 1 => list.head
            case _ :: tail  => nth(n - 1, tail)
}

Martin Odersky 的 Scala by Examples 包含以下段落，这似乎表明在某些情况或其他环境中递归是合适的：

原则上，尾调用总是可以重用调用方的栈帧功能。然而，一些运行时环境（例如Java VM）缺乏使尾调用的堆栈帧重用更加高效的原语。一、生产品质因此，Scala 实现只需要重用 di- 的堆栈帧直接尾递归函数，其最后一个动作是调用自身。其他尾部调用可能也可以进行优化，但不应该跨实现依赖这一点。

谁能解释一下这一段中间两句是什么意思？

谢谢你！

原文

From searching elsewhere on this site and the web, tail call optimization is not supported by the JVM. Does that therefore mean that tail recursive Scala code such as the following, which may run on very large input lists, should not be written if it is to run on the JVM?

// Get the nth element in a list    
def nth[T](n : Int, list : List[T]) : T = list match {
            case Nil => throw new IllegalArgumentException
            case _ if n == 0 => throw new IllegalArgumentException
            case _ :: tail if n == 1 => list.head
            case _ :: tail  => nth(n - 1, tail)
}

Martin Odersky's Scala by Example contains the following paragragh which seems to suggests that there are circumstances or other environments where recursion is appropriate:

In principle, tail calls can always re-use the stack frame of the calling
function. However, some run-time environments (such as the Java VM) lack the
primitives to make stack frame re-use for tail calls efficient. A production quality
Scala implementation is therefore only required to re-use the stack frame of a di-
rectly tail-recursive function whose last action is a call to itself. Other tail calls might
be optimized also, but one should not rely on this across implementations.

Can anyone explain what this middle two sentences of this paragraph mean?

Thank you!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

明媚如初 2024-11-08 14:55:45

由于直接尾递归相当于 while 循环，因此您的示例将在 JVM 上高效运行，因为 Scala 编译器只需使用跳转即可将其编译为底层循环。然而，一般 TCO 在 JVM 上不受支持，尽管可以使用 tailcall() 方法来支持使用编译器生成的蹦床进行尾部调用。

为了确保编译器能够正确地将尾递归函数优化为循环，可以使用 scala.annotation.tailrec 注解，如果编译器无法进行所需的优化，这将导致编译器错误：（

import scala.annotation.tailrec

@tailrec def nth[T](n : Int, list : List[T]) : Option[T] = list match {
            case Nil => None
            case _ if n == 0 => None
            case _ :: tail if n == 1 => list.headOption
            case _ :: tail  => nth(n - 1, tail)
}

螺丝 IllegalArgmentException！）

Since direct tail recursion is equivalent to a while loop, your example will run efficiently on the JVM because the Scala compiler can compile this to a loop under the hood, simply using a jump. General TCO however is not supported on the JVM, although there is available the tailcall() method which supports tail calls using compiler-generated trampolines.

To ensure that the compiler can correctly optimize a tail-recursive function to a loop, you can use the scala.annotation.tailrec annotation, which will cause a compiler error if the compiler cannot make the desired optimization:

import scala.annotation.tailrec

@tailrec def nth[T](n : Int, list : List[T]) : Option[T] = list match {
            case Nil => None
            case _ if n == 0 => None
            case _ :: tail if n == 1 => list.headOption
            case _ :: tail  => nth(n - 1, tail)
}

(screw IllegalArgmentException!)

回复收藏 0 原文

冰雪之触 2024-11-08 14:55:45

<块引用>
原则上，尾调用总是可以重用调用方的栈帧
功能。然而，一些运行时环境（例如Java VM）缺乏
使尾调用的堆栈帧重用更加高效的原语。一、生产品质
因此，Scala 实现只需要重用 di 的堆栈帧
直接尾递归函数，其最后一个动作是调用自身。其他尾部调用可能
也可以进行优化，但不应该跨实现依赖这一点。
谁能解释一下这一段中间两句话的意思吗？

尾递归是尾调用的一种特殊情况。直接尾递归是尾递归的一个特例。仅保证直接尾递归得到优化。其他也可能被优化，但这基本上只是编译器优化。作为一种语言特性，Scala 只保证直接尾递归消除。

那么，有什么区别呢？

好吧，尾部调用只是子例程中的最后一个调用：

def a = {
  b
  c
}

在这种情况下，对 c 的调用是尾部调用，对 b 的调用则不是。

尾递归是指尾调用调用之前已经调用过的子例程：

def a = {
  b
}

def b = {
  a
}

这是尾递归：a调用b（尾调用），它又再次调用a。（与下面描述的直接尾递归相反，这有时称为间接尾递归。）

但是，这两个示例都不会被 Scala 优化。或者，更准确地说：Scala 实现允许优化它们，但不要求这样做。这与例如Scheme 不同，在Scheme 中，语言规范保证上述所有情况都将占用O(1) 堆栈空间。

Scala 语言规范仅保证直接尾递归得到优化，即当子例程直接调用自身时，无需中间的其他调用：

def a = {
  b
  a
}

在这种情况下，对 a 的调用是尾调用（因为它是子例程中的最后一个调用），它是尾递归（因为它再次调用自身），最重要的是它是直接尾递归，因为a直接调用自身而无需先经过另一个调用。

请注意，有许多微妙的事情可能会导致方法不直接尾递归。例如，如果a被重载，那么递归实际上可能会经历不同的重载，因此将不再是直接的。

实际上，这意味着两件事：

您不能对尾递归方法执行提取方法重构，至少不能包括尾调用，因为这会将直接尾递归方法（将得到优化）变成间接尾递归方法-递归方法（不会得到优化）。
您只能使用直接尾递归。可以使用间接尾递归非常优雅地表达尾递归下降解析器或状态机。

主要原因是，当您的底层执行引擎缺乏强大的控制流操作功能（例如 GOTO、延续、一流的可变堆栈或适当的尾部调用）时，您需要实现自己的堆栈除此之外，使用蹦床、进行全局 CPS 变换或类似的讨厌的东西，以便提供通用的正确尾部调用。所有这些都会严重影响性能或与同一平台上其他代码的互操作性。

或者，正如 Clojure 的创建者 Rich Hickey 在面临同样的问题时所说的那样：“性能、Java 互操作、尾部调用。选两个。” Clojure 和 Scala 都选择在尾部调用上做出妥协，只提供尾部递归，而不提供完整的尾部调用。

长话短说：是的，您发布的具体示例将被优化，因为它是直接尾递归。您可以通过在方法上添加 @tailrec 注释来测试这一点。无论该方法是否得到优化，注释都不会改变，但它确实会改变，但是如果该方法无法优化，您将收到编译错误。

由于上述微妙之处，通常最好在需要优化的方法上添加 @tailrec 注释，既可以得到编译错误，也可以作为提示维护您的代码的其他开发人员。

In principle, tail calls can always re-use the stack frame of the calling
function. However, some runtime environments (such as the Java VM) lack the
primitives to make stack frame re-use for tail calls efficient. A production quality
Scala implementation is therefore only required to re-use the stack frame of a di
rectly tail-recursive function whose last action is a call to itself. Other tail calls might
be optimized also, but one should not rely on this across implementations.
Can anyone explain what this middle two sentences of this paragraph mean?

Tail recursion is a special case of a tail call. Direct tail recursion is a special case of tail recursion. Only direct tail recursion is guaranteed to be optimized. Others may be optimized, too, but that's basically just a compiler optimization. As a language feature, Scala only guarantees direct tail recursion elimination.

So, what's the difference?

Well, a tail call is simply the last call in a subroutine:

def a = {
  b
  c
}

In this case, the call to c is a tail call, the call to b is not.

Tail recursion is when a tail call calls a subroutine that was already called before:

def a = {
  b
}

def b = {
  a
}

This is tail recursion: a calls b (a tail call), which in turn calls a again. (In contrast to the direct tail recursion described below, this is sometimes called indirect tail recursion.)

However, none of the two examples will get optimized by Scala. Or, more precisely: a Scala implementation is allowed to optimize them, but it is not required to do so. This is in contrast to, e.g. Scheme, where the language specification guarantees that all of the above cases will take O(1) stack space.

The Scala Language Specification only guarantees that direct tail recursion is optimized, i.e. when a subroutine directly calls itself with no other calls in between:

def a = {
  b
  a
}

In this case, the call to a is a tail call (because it is the last call in the subroutine), it is tail recursion (because it calls itself again) and most importantly it is direct tail recursion, because a directly calls itself without going through another call first.

Note that there are many subtle things that may lead to a method not being directly tail-recursive. For example, if a is overloaded, then the recursion may actually go through different overloads, and thus would no longer be direct.

In practice, this means two things:

you cannot perform an Extract Method Refactoring on a tail-recursive method, at least not including the tail call, because this would turn a directly tail-recursive method (which will get optimized) into an indirectly tail-recursive method (which will not get optimized).
You can only use direct tail recursion. A tail-recursive descent parser, or a state machine, which can be very elegantly expressed using indirect tail recursion, are out.

The main reason for this is that when your underlying execution engine lacks powerful control flow manipulation features such as GOTO, continuations, first-class mutable stack or proper tail calls, then you need to either implement your own stack on top of it, use trampolines, make a global CPS transform or something similarly nasty, in order to provide generalized proper tail calls. All of these have either severe impact on performance or interoperability with other code on the same platform.

Or, as Rich Hickey, the creator of Clojure, said when he was facing the same problem: "Performance, Java interop, tail calls. Pick two." Both Clojure and Scala chose to compromise on tail calls and provide only tail recursion and not full tail calls.

To cut a long story short: yes, the specific example you posted will be optimized, since it is direct tail recursion. You can test this by putting an @tailrec annotation on the method. The annotation does not change whether or not the method gets optimized, it does however guarantee that you will get a compile error if the method can not be optimized.

Due to the above-mentioned subtleties, it is generally a good idea to put an @tailrec annotation on methods that you need to be optimized, both in order to get a compile error, but also as a hint to other developers maintaining your code.

回复收藏 0 原文