您最喜欢的语言如何处理深度递归?

发布于 2024-07-08 12:37:32 字数 827 浏览 13 评论 0原文

我最近开始学习 Python,我很惊讶地发现深度递归限制为 1000(默认情况下)。 如果你将它设置得足够高,大约 30000,它会像 C 一样因分段错误而崩溃。尽管如此,C 似乎要高得多。

(Python 人员很快指出,你总是可以将递归函数转换为迭代函数,而且它们总是更快。这是 100% 正确的。但这并不是我的问题所在。)

我在 Perl 中尝试了相同的实验大约 1000 万次递归它消耗了我所有的 4 GB 内存,我使用 ^C 来停止尝试。 显然 Perl 不使用 C 堆栈,但它在递归时确实使用了大量的内存——考虑到调用函数需要做多少工作,这并不令人震惊。

我在 Pike 中进行了尝试,结果在大约 2 秒内获得了 100,000,000 次递归,这让我非常惊讶。 我不知道它是如何做到的,但我怀疑它将递归扁平化为迭代过程——它在执行此操作时似乎没有消耗任何额外的内存。 [注:Pike 确实会扁平化琐碎的情况,但会在更复杂的情况下出现段错误,或者我是这么被告知的。]

我使用了这些原本无用的函数:

int f(int i, int l) { if(i<l) return f(i+1,l); return i; }

sub f { return f($_[0]+1, $_[1]) if $_[0]<$_[1]; return $_[0] };

def f(i,l):
   if i<l:
     return f(i+1,l)
   return i

我很好奇其他语言(例如 PHP、Ruby、Java、Lua、 Ocaml、Haskell)处理递归以及为什么他们这样处理。 此外,请注意如果函数是“尾递归”,这是否会产生影响(请参阅评论)。

I recently started learning Python and I was rather surprised to find a 1000 deep recursion limit (by default). If you set it high enough, about 30000, it crashes with a segmentation fault just like C. Although, C seems to go quite a lot higher.

(The Python folks are quick to point out that you can always convert recursive functions to iterative ones and that they're always faster. That's 100% true. It's not really what my question is about though.)

I tried the same experiment in Perl and somewhere around 10 million recursions it consumed all of my 4 gigs of ram and I used ^C to stop trying. Clearly Perl doesn't use the C stack, but it does use a ridiculous amount of memory when it recurses -- not terribly shocking considering how much work it has to do to call functions.

I tried in Pike and was completely surprised to get 100,000,000 recursions in about 2 seconds. I have no idea how it did that, but I suspect it flattened the recursion to an iterative process -- it doesn't seem to consume any extra memory while it does it. [Note: Pike does flatten trivial cases, but segfaults on more complicated ones, or so I'm told.]

I used these otherwise useless functions:

int f(int i, int l) { if(i<l) return f(i+1,l); return i; }

sub f { return f($_[0]+1, $_[1]) if $_[0]<$_[1]; return $_[0] };

def f(i,l):
   if i<l:
     return f(i+1,l)
   return i

I'm very curious how other languages (e.g., PHP, Ruby, Java, Lua, Ocaml, Haskell) handle recursion and why they handle it that way. Additionally, please note whether it makes a difference if the function is "tail-recursive" (see comment).

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(11

滥情哥ㄟ 2024-07-15 12:37:32

“Python 的人很快就指出,你总是可以将递归函数转换为迭代函数,而且它们总是更快”

这是事实,但如果它真的那么简单,为什么 Python 不为我做呢,以便我的代码看起来尽可能简单? (我这么说并不是为了抨击 Python 实现者,而是因为答案解释了问题)。

递归优化从 14 世纪起就出现在函数式语言中了。 Haskell、CAML、Lisp 实现通常都至少将尾递归函数转换为迭代:基本上,您可以通过发现这是可能的来做到这一点,即可以重新排列函数,以便在递归调用后不使用除返回值之外的局部变量。 如果在返回之前对递归返回值进行了一些工作,则可以采用一个技巧,即引入一个附加的“累加器”参数。 简单来说,这意味着工作可以有效地以“向下”的方式完成,而不是“向上”的方式:搜索“如何使函数尾递归”以获取详细信息。

将尾递归函数转换为循环的实际细节基本上是调整您的调用约定,这样您只需设置参数并跳回函数的开头即可“执行调用”,而无需费心保存所有内容您知道自己永远不会使用的范围内的东西。 用汇编语言来说,如果数据流分析告诉您它们在调用之外未使用,则不必保留调用者保存寄存器,并且堆栈上的任何内容也是如此:您不必移动堆栈指针如果您不介意下一次递归/迭代会乱写“您的”堆栈位,则可以在调用时进行。

与您对 Python 人员的解释相反,将一般递归函数转换为迭代并不简单:例如,如果它是多重递归的,那么在简单的方法中您仍然需要一个堆栈。

不过,对于任意递归函数来说,记忆化是一种有用的技术,如果您对可能的方法感兴趣,您可能会想查找一下。 这意味着每次计算函数时,都会将结果保存在缓存中。 要使用它来优化递归,基本上,如果您的递归函数“向下”计数,并且您记住了它,那么您可以通过添加一个“向上”计数的循环来迭代地评估它,依次计算函数的每个值,直到达到目标。 如果备忘录缓存足够大以容纳您需要的所有值,则这使用很少的堆栈空间:例如,如果 f(n) 取决于 f(n-1)、f(n-2) 和 f(n -3)你只需要缓存中3个值的空间:当你上升时,你可以把梯子踢开。 如果 f(n) 取决于 f(n-1) 和 f(n/2),则缓存中需要大量空间,但仍然小于未优化递归中用于堆栈的空间。

"The Python folks are quick to point out that you can always convert recursive functions to iterative ones and that they're always faster"

This is true, but if it's really as easy as all that, why doesn't Python do it for me, so that my code can look as simple as possible? (I say this not to slam Python implementers, but because the answer explains the problem).

Recursion optimisations have been present in functional languages since, like, the 14th century or something. Haskell, CAML, Lisp implementations all typically convert at least tail recursive functions to iterations: you basically do this by spotting that it's possible, i.e. that the function can be rearranged such that no local variables other than the return value are used after the recursive call. One trick to make it possible if there's some work done to the recursed return value before return, is to introduce an additional "accumulator" parameter. In simple terms this means the work can effectively be done on the way "down" instead of on the way "up": search around for 'how to make a function tail-recursive' for details.

The actual details of turning a tail recursive function into a loop is basically to jigger with your call convention such you can "perform the call" simply by setting up the parameters and jumping back to the start of the function, without bothering to save all that stuff in scope that you know you won't ever use. In assembler terms, you don't have to preserve caller-saves registers if data-flow analysis tells you they're unused beyond the call, and the same goes for anything on the stack: you don't have to move the stack pointer on a call if you don't mind "your" bit of stack being scribbled over by the next recursion/iteration.

Contrary to how you paraphrased the Python folks, converting a general recursive function to an iteration is not trivial: for example if it's multiply-recursive then in a simple approach you'd still need a stack.

Memoization is a useful technique, though, for arbitrarily recursive functions, that you might like to look up if you're interested in the possible approaches. What it means is that each time a function is evaluated, you stick the result in a cache. To use this to optimize recursion, basically, if your recursive function counts "down", and you memoize it, then you can evaluate it iteratively by adding a loop that counts "up" calculating each value of the function in turn until you reach the target. This uses very little stack space provided that the memo cache is big enough to hold all the values you'll need: for instance if f(n) depends on f(n-1), f(n-2) and f(n-3) you only need space for 3 values in the cache: as you go up you can kick the ladder away. If f(n) depends on f(n-1) and f(n/2), you need lots of space in the cache, but still less than you'd use for stack in an unoptimised recursion.

绮烟 2024-07-15 12:37:32

这更多的是一个实现问题,而不是一个语言问题。 没有什么可以阻止某些(愚钝的)C 编译器实现者将其调用堆栈限制为 1000。有很多小型处理器甚至没有足够的堆栈空间。

(Python 人员很快指出,您始终可以将递归函数转换为迭代函数,而且它们总是更快。这是 100% 正确的。但这并不是我的问题所在。)

也许他们确实这么说,但是这不太正确。 递归总是可以转换为迭代,但有时也需要手动使用堆栈。 在这种情况下,我可以看到递归版本更快(假设您足够聪明,可以进行简单的优化,例如将不需要的声明拉到递归例程之外)。 毕竟,堆栈推送周围的过程调用是一个有界限的问题,您的编译器应该知道如何很好地优化。 另一方面,手动堆栈操作不会在编译器中包含专门的优化代码,并且可能会进行各种用户界面健全性检查,从而占用额外的周期。

情况可能是,Python 中的迭代/堆栈解决方案总是更快。 如果是这样,那就是 Python 的失败,而不是递归的失败。

This is more of an implementation question than a language question. There's nothing stopping some (stoopid) C compiler implementor from also limiting their call stack to 1000. There are a lot of small processors out there that wouldn't have stack space for even that many.

(The Python folks are quick to point out that you can always convert recursive functions to iterative ones and that they're always faster. That's 100% true. It's not really what my question is about though.)

Perhaps they do say that, but this isn't quite correct. Recursion can always be converted to iteration, but sometimes it also requires manual use of a stack too. In those circumstances, I could see the recursive version being faster (assuming you are smart enough to make simple optimizations, like pulling unneeded declarations outside of the recursive routine). After all, the stack pushes surrounding procedure calls are a well bounded problem that your compiler should know how to optimize very well. Manual stack operations, on the other hand, are not going to have specialized optimization code in your compiler, and are liable to have all sorts of user-interface sanity checks that will take up extra cycles.

It may be the case that the iterative/stack solution is always faster in Python. If so, that's a failing of Python, not of recursion.

孤云独去闲 2024-07-15 12:37:32

PHP 在死亡之前的默认限制为 100:

致命错误:已达到最大函数嵌套级别“100”,正在中止!

编辑:您可以使用 ini_set('xdebug.max_nesting_level) 更改限制', 100000);,但是如果超过大约 1150 次迭代,PHP 就会崩溃:

[Fri Oct 24 11:39:41 2008] [notice] Parent: child process exited with status 3221225477 -- Restarting.

PHP has a default limit of 100 before it dies:

Fatal error: Maximum function nesting level of '100' reached, aborting!

Edit: You can change the limit with ini_set('xdebug.max_nesting_level', 100000);, but if you go above about 1150 iterations PHP crashes:

[Fri Oct 24 11:39:41 2008] [notice] Parent: child process exited with status 3221225477 -- Restarting.

旧故 2024-07-15 12:37:32

C#/.NET 将在特定情况下使用尾递归。 (C# 编译器不会发出 tailcall 操作码,但 递归

JIT 在某些情况下会实现尾 >还有一篇关于此主题的文章 当然,CLR 正在不断变化,并且在 .NET 3.5 和 3.5SP1 中,它可能在尾部调用方面再次发生变化。

C#/.NET will use tail recursion in a particular set of circumstances. (The C# compiler doesn't emit a tailcall opcode, but the JIT will implement tail recursion in some cases.

Shri Borde also has a post on this topic. Of course, the CLR is continually changing, and with .NET 3.5 and 3.5SP1 it may have changed again with respect to tail calls.

西瑶 2024-07-15 12:37:32

在 F# 交互式控制台中使用以下命令,它在不到一秒的时间内运行:

let rec f i l = 
  match i with 
  | i when i < l -> f (i+1) l
  | _ -> l

f 0 100000000;;

然后我尝试了直接翻译,即

let rec g i l = if i < l then g (i+1) l else l

g 0 100000000;;

相同的结果但不同的编译。

这就是 f 翻译成 C# 时的样子:

int f(int i, int l)
{
  while(true)
  {
    int num = i;
    if(num >= l)
      return l;
    int i = num;
    l = l;
    i = i + 1;
  }
}

g,但是翻译成这样:

int g(int i, int l)
{
  while(i < l)
  {
    l = l;
    i++;
  }
  return l;
}

有趣的是,两个基本相同的函数被不同的渲染F# 编译器。 它还表明 F# 编译器具有尾递归优化。 因此,这应该循环直到 i 达到 32 位整数的限制。

Using the following in the F# interactive console, it ran in less than a second:

let rec f i l = 
  match i with 
  | i when i < l -> f (i+1) l
  | _ -> l

f 0 100000000;;

I then tried a straight translation i.e.

let rec g i l = if i < l then g (i+1) l else l

g 0 100000000;;

Same outcome but different compilation.

This is what f looks like in when translated to C#:

int f(int i, int l)
{
  while(true)
  {
    int num = i;
    if(num >= l)
      return l;
    int i = num;
    l = l;
    i = i + 1;
  }
}

g, however is translated to this:

int g(int i, int l)
{
  while(i < l)
  {
    l = l;
    i++;
  }
  return l;
}

It is interesting that two functions that are fundamentally the same are rendered differently by the F# compiler. It also shows that the F# compiler has tail-recursive optimization. Thus this should loop until i reaches the limit for 32-bit integers.

没有你我更好 2024-07-15 12:37:32

根据这个帖子,大约 5,000,000 使用 java, 1Gb 内存。 (并且,使用热点的“客户端”版本)

那是使用 堆栈 (-Xss)< /a> 300Mo。

使用 -server 选项,可以增加。

还可以尝试优化编译器(例如使用 JET)以减少堆栈开销每一层。

According to this thread, around 5,000,000 with java, 1Gb RAM. (and that, with the 'client' version of the hotspot)

That was with a stack (-Xss) of 300Mo.

With a -server option, that could be increased.

Also one can try to optimize the compiler (with JET for instance) to reduce the stack overhead at each layer.

茶底世界 2024-07-15 12:37:32

在一些非病态的情况下(比如你的),(最新的)Lua将使用 尾调用递归,即。 它只会跳转而不将数据压入堆栈。 所以递归循环的数量几乎可以是无限的。

测试:

function f(i, l)
    if i < l then
        return f(i+1, l)
    end
    return i
end

local val1  = arg[1] or 1
local val2  = arg[2] or 100000000
print(f(val1 + 0, val2 + 0))

还使用:

function g(i, l)
    if i >= l then
        return i
    end
    return g(i+1, l)
end

甚至尝试了交叉递归(f 调用 g 和 g 调用 f...)。

在 Windows 上,Lua 5.1 使用大约 1.1MB(恒定)来运行它,在几秒钟内完成。

In some non pathological cases (like your), (latest) Lua will use tail call recursion, ie. it will just jump without pushing data in stack. So the number of recursion loops can be almost unlimited.

Tested with:

function f(i, l)
    if i < l then
        return f(i+1, l)
    end
    return i
end

local val1  = arg[1] or 1
local val2  = arg[2] or 100000000
print(f(val1 + 0, val2 + 0))

Also with:

function g(i, l)
    if i >= l then
        return i
    end
    return g(i+1, l)
end

and even tried cross-recursion (f calling g and g calling f...).

On Windows, Lua 5.1 uses around 1.1MB (constant) to run this, finishes in a few seconds.

冷默言语 2024-07-15 12:37:32

Visual Dataflex 将发生堆栈溢出。

Visual Dataflex will stack overflow.

演多会厌 2024-07-15 12:37:32

我非常喜欢函数式编程,并且由于大多数语言都实现了尾部调用优化,因此您可以随心所欲地递归 :-P

然而,实际上,我必须使用大量 Java,也大量使用 Python 。 不知道 Java 有什么限制,但对于 Python,我实际上计划(但尚未完成)实现一个装饰器,该装饰器将尾部调用优化装饰函数。
我计划这样做并不是为了优化递归,而是主要作为动态修补 Python 字节码和了解有关 Python 内部结构的更多信息的练习。
这里有一些值得关注的链接: http://lambda-the-ultimate.org/node/1331 和 http://www.rowehl.com/blog/?p=626

I'm quite a fan of functional programming, and since most of those langauges implement tail call optimization, you can recurse as much as you like :-P

However, practically, I have to use a lot of Java and use Python a lot too. No idea what limit Java has, but for Python I had actually planned (but have not yet done it) to implement a decorator which would tail call optimize the decorated function.
I was planning this not to optimize recursion, but mainly as an exercise in dynamically patching Python bytecode and learning more about Pythons internals.
Heres some itneresting links: http://lambda-the-ultimate.org/node/1331 and http://www.rowehl.com/blog/?p=626

久随 2024-07-15 12:37:32

有一种方法可以改进 Perl 代码,使其使用恒定大小的堆栈。 您可以通过使用特殊形式的goto 来完成此操作。

sub f{
  if( $_[0] < $_[1] ){

    # return f( $_[0]+1, $_[1] );

    @_ = ( $_[0]+1, $_[1] );
    goto &f;

  } else {
    return $_[0]
  }
}

第一次调用时,它将在堆栈上分配空间。 然后它将更改其参数,并重新启动子例程,而不向堆栈添加更多内容。 因此,它会假装它从未调用过它的自身,将其变成一个迭代过程。


您还可以使用 Sub::Call::Recur 模块。 这使得代码更容易理解并且更短。

use Sub::Call::Recur;
sub f{
  recur( $_[0]+1, $_[1] ) if $_[0] < $_[1];
  return $_[0];
}

There is a way to improve upon the Perl code, to make it use a constant size stack. You do this by using a special form of goto.

sub f{
  if( $_[0] < $_[1] ){

    # return f( $_[0]+1, $_[1] );

    @_ = ( $_[0]+1, $_[1] );
    goto &f;

  } else {
    return $_[0]
  }
}

When first called it will allocate space on the stack. Then it will change its arguments, and restart the subroutine, without adding anything more to the stack. It will therefore pretend that it never called its self, changing it into an iterative process.


You could also use the Sub::Call::Recur module. Which makes the code easier to understand, and shorter.

use Sub::Call::Recur;
sub f{
  recur( $_[0]+1, $_[1] ) if $_[0] < $_[1];
  return $_[0];
}
初心未许 2024-07-15 12:37:32

clojure 为尾递归“recur”提供了一种特殊形式,它只能用在 ast 的尾部。 否则它的行为就像 java 一样,并且可能会抛出 StackverflowException。

clojure provides a special form for tail recursion "recur" this can only be used in tail places of the ast. Otherwise it behaves like java and will likely throw a StackverflowException.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文