Mathematica 中的尾部调用优化？

发布于 2024-10-08 11:08:30 字数 1231 浏览 7 评论 0原文

在制定另一个SO问题的答案时，我遇到了一些关于尾递归的奇怪行为在数学中。

Mathematica 文档暗示尾部调用优化可能会被执行。但我自己的实验给出了相互矛盾的结果。例如，对比以下两个表达式。第一个使 7.0.1 内核崩溃，大概是由于堆栈耗尽：

(* warning: crashes the kernel! *)
Module[{f, n = 0},
  f[x_] := (n += 1; f[x + 1]);
  TimeConstrained[Block[{$RecursionLimit = Infinity}, f[0]], 300, n]
]

第二个运行完成，似乎利用尾调用优化返回有意义的结果：

Module[{f, n = 0},
  f[x_] := Null /; (n += 1; False);
  f[x_] := f[x + 1];
  TimeConstrained[Block[{$IterationLimit = Infinity}, f[0]], 300, n]
]

两个表达式都定义了尾递归函数 f。对于第一个函数，Mathematica 显然认为复合语句的存在足以阻止任何尾部调用优化的机会。另请注意，第一个表达式由 $RecursionLimit 控制，第二个表达式由 $IterationLimit 控制——这表明 Mathematica 对这两个表达式的处理方式不同。（注意：上面引用的 SO 答案有一个较少设计的函数，可以成功利用尾部调用优化）。

那么，问题是：有谁知道 Mathematica 在什么情况下执行递归函数的尾部调用优化？最好参考 Mathematica 文档或其他 WRI 材料中的明确声明。也欢迎大家猜测。

原文

While formulating an answer to another SO question, I came across some strange behaviour regarding tail recursion in Mathematica.

The Mathematica documentation hints that tail call optimization might be performed. But my own experiments give conflicting results. Contrast, for example, the following two expressions. The first crashes the 7.0.1 kernel, presumably due to stack exhaustion:

(* warning: crashes the kernel! *)
Module[{f, n = 0},
  f[x_] := (n += 1; f[x + 1]);
  TimeConstrained[Block[{$RecursionLimit = Infinity}, f[0]], 300, n]
]

The second runs to completion, appearing to exploit tail call optimization to return a meaningful result:

Module[{f, n = 0},
  f[x_] := Null /; (n += 1; False);
  f[x_] := f[x + 1];
  TimeConstrained[Block[{$IterationLimit = Infinity}, f[0]], 300, n]
]

Both expressions define a tail recursive function f. In the case of the first function, Mathematica apparently regards the presence of a compound statement enough to defeat any chance of tail call optimization. Also note that the first expression is governed by $RecursionLimit and the second by $IterationLimit -- a sign that Mathematica is treating the two expressions differently. (Note: the SO answer referenced above has a less contrived function that successfully exploits tail call optimization).

So, the question is: does anyone know the circumstances under which Mathematica performs tail-call optimization of recursive functions? A reference to a definitive statement in the Mathematica documentation or other WRI material would be ideal. Speculation is also welcome.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

差↓一点笑了 2024-10-15 11:08:30

我可以总结我个人经历得出的结论，但免责声明以下内容可能不是完全正确的解释。答案似乎在于 Mathematica 调用堆栈和传统调用堆栈之间的差异，这源于 Mathematica 模式定义的函数是真正的规则。因此，没有真正的函数调用。 Mathematica 出于不同的原因需要堆栈：由于正常求值发生在表达式树的底部，因此它必须保留中间表达式，以防（子）表达式的越来越深的部分由于规则应用而被替换（子）表达式的某些部分被替换。表达式从底部生长）。对于定义我们在其他语言中称为非尾递归函数的规则来说尤其如此。因此，Mathematica 中的堆栈是中间表达式的堆栈，而不是函数调用的堆栈。

这意味着，如果作为规则应用的结果，（子）表达式可以被整体重写，则表达式分支不需要保留在表达式堆栈上。这可能就是 Mathematica 中所谓的尾部调用优化 - 这就是为什么在这种情况下我们使用迭代而不是递归（这是规则应用程序和函数调用之间差异的一个很好的例子）。像 f[x_]:=f[x+1] 这样的规则就属于这种类型。然而，如果某些子表达式被重写，产生更多的表达式结构，则表达式必须存储在堆栈上。规则f[x_ /; x < 5] := (n += 1; f[x + 1]) 就是这种类型，它有点隐藏，直到我们回想起 () 代表 CompoundExpression []。示意性地，这里发生的是 f[1] ->复合表达式[n+=1, f[2]] -> CompoundExpression[n+=1,CompoundExpression[n+=1,f[3]]]->etc。尽管对 f 的调用每次都是最后一次，但它发生在完整的 CompoundExpression[] 执行之前，因此仍然必须将其保留在表达式堆栈中。也许有人会说这是一个可以进行优化的地方，为CompoundExpression 破例，但这可能不容易实现。

现在，为了说明我上面示意性描述的堆栈累积过程，让我们限制递归调用的数量：

Clear[n, f, ff, fff];
n = 0;
f[x_ /; x < 5] := (n += 1; f[x + 1]);

ff[x_] := Null /; (n += 1; False);
ff[x_ /; x < 5] := ff[x + 1];

fff[x_ /; x < 5] := ce[n += 1, fff[x + 1]];

跟踪计算：

In[57]:= Trace[f[1],f]
Out[57]= {f[1],n+=1;f[1+1],{f[2],n+=1;f[2+1],{f[3],n+=1;f[3+1],{f[4],n+=1;f[4+1]}}}}

In[58]:= Trace[ff[1],ff]
Out[58]= {ff[1],ff[1+1],ff[2],ff[2+1],ff[3],ff[3+1],ff[4],ff[4+1],ff[5]}

In[59]:= Trace[fff[1],fff]
Out[59]= {fff[1],ce[n+=1,fff[1+1]],{fff[2],ce[n+=1,fff[2+1]],{fff[3],ce[n+=1,fff[3+1]],   
{fff[4],ce[n+=1,fff[4+1]]}}}}

从中您可以看到，表达式堆栈针对 f 和 < 进行累积。 code>fff （后者只是用来表明这是一个通用机制，ce[]只是一些任意的头），但不适用于ff，因为，出于模式匹配的目的，ff 的第一个定义是已尝试但未匹配的规则，第二个定义完全重写了 ff[arg_]，并且不会生成需要进一步重写的更深层次的子部分。因此，最重要的是，您应该分析您的函数，看看它的递归调用是否会从底部增加计算的表达式。如果是，则就 Mathematica 而言，它不是尾递归。

如果不展示如何手动进行尾部调用优化，我的答案就不完整。作为一个例子，让我们考虑 Select 的递归实现。我们将使用 Mathematica 链表来使其相当高效，而不是一个玩具。下面是非尾递归实现的代码：

Clear[toLinkedList, test, selrecBad, sel, selrec, selTR]
toLinkedList[x_List] := Fold[{#2, #1} &, {}, Reverse[x]];
selrecBad[fst_?test, rest_List] := {fst,If[rest === {}, {}, selrecBad @@ rest]};
selrecBad[fst_, rest_List] := If[rest === {}, {}, selrecBad @@ rest];
sel[x_List, testF_] := Block[{test = testF}, Flatten[selrecBad @@ toLinkedList[x]]]

我使用Block和selrecBad的原因是为了更容易使用Trace。现在，这会破坏我机器上的堆栈：

Block[{$RecursionLimit = Infinity}, sel[Range[300000], EvenQ]] // Short // Timing

您可以在小列表上跟踪以了解原因：

In[7]:= Trace[sel[Range[5],OddQ],selrecBad]

Out[7]= {{{selrecBad[1,{2,{3,{4,{5,{}}}}}],{1,If[{2,{3,{4,{5,{}}}}}==={},{},selrecBad@@{2,{3,{4, 
{5,{}}}}}]},{selrecBad[2,{3,{4,{5,{}}}}],If[{3,{4,{5,{}}}}==={},{},selrecBad@@{3,{4,{5, 
{}}}}],selrecBad[3,{4,{5,{}}}],{3,If[{4,{5,{}}}==={},{},selrecBad@@{4,{5,{}}}]},{selrecBad[4,
{5,{}}],If[{5,{}}==={},{},selrecBad@@{5,{}}],selrecBad[5,{}],{5,If[{}==={},{},selrecBad@@{}]}}}}}}

所发生的情况是结果在列表中积累得越来越深。解决方案是不增加结果表达式的深度，实现这一目标的一种方法是使 selrecBad 接受一个额外参数，即累积结果的（链接）列表：

selrec[{fst_?test, rest_List}, accum_List] := 
    If[rest === {}, {accum, fst}, selrec[rest, {accum, fst}]];
selrec[{fst_, rest_List}, accum_List] := 
    If[rest === {}, accum, selrec[rest, accum]]

并相应地修改 main 函数：

selTR[x_List, testF_] := Block[{test = testF}, Flatten[selrec[toLinkedList[x], {}]]]

这将传递我们的功率测试就好了：（

In[14]:= Block[{$IterationLimit= Infinity},selTR[Range[300000],EvenQ]]//Short//Timing

Out[14]= {0.813,{2,4,6,8,10,12,14,16,18,20,
<<149981>>,299984,299986,299988,299990,299992,299994,299996,299998,300000}}

请注意，这里我们必须修改 $IterationLimit，这是一个好兆头）。使用 Trace 揭示了原因：

In[15]:= Trace[selTR[Range[5],OddQ],selrec]

Out[15]= {{{selrec[{1,{2,{3,{4,{5,{}}}}}},{}],If[{2,{3,{4,{5,{}}}}}==={},{{},1},selrec[{2,{3,{4, 
{5,{}}}}},{{},1}]],selrec[{2,{3,{4,{5,{}}}}},{{},1}],If[{3,{4,{5,{}}}}==={},{{},1},selrec[{3, 
{4,{5,{}}}},{{},1}]],selrec[{3,{4,{5,{}}}},{{},1}],If[{4,{5,{}}}==={},{{{},1},3},selrec[{4, 
{5,{}}},{{{},1},3}]],selrec[{4,{5,{}}},{{{},1},3}],If[{5,{}}==={},{{{},1},3},selrec[{5, 
{}},{{{},1},3}]],selrec[{5,{}},{{{},1},3}],If[{}==={},{{{{},1},3},5},selrec[{},{{{{},1},3},5}]]}}}

即该版本不会累积中间表达式的深度，因为结果保存在单独的列表中。

I can summarize the conclusions I was led to by my personal experience, with a disclaimer that what follows may not be the entirely right explanation. The anwer seems to lie in the differences between Mathematica call stack and traditional call stacks, which originates from Mathematica pattern-defined functions being really rules. So, there are no real function calls. Mathematica needs a stack for a different reason: since normal evaluation happens from the bottom of an expression tree, it must keep intermediate expressions in case when deeper and deeper parts of (sub)expressions get replaced as a result of rule application (some parts of an expression grow from the bottom). This is the case, in particular, for rules defining what we'd call non tail-recursive functions in other languages. So, once again, the stack in Mathematica is a stack of intermediate expressions, not function calls.

This means that if, as a result of rule application, an (sub)expression can be rewritten in its entirety, the expression branch need not be kept on the expression stack. This is probably what is referred as tail call optimization in Mathematica - and this is why in such cases we have iteration rather than recursion (this is one very good example of the differences between rule applications and function calls). Rules like f[x_]:=f[x+1] are of this type. If, however, some sub-expression get rewritten, producing more expression structure, then expression must be stored on the stack. The rule f[x_ /; x < 5] := (n += 1; f[x + 1]) is of this type, which is a bit hidden until we recall that ()stand for CompoundExpression[]. Schematically what happens here is f[1] -> CompoundExpression[n+=1, f[2]] -> CompoundExpression[n+=1,CompoundExpression[n+=1,f[3]]]->etc. Even though the call to f is the last every time, it happens before the full CompoundExpression[] executes, so this still must be kept on the expression stack. One could perhaps argue that this is a place where optimization could be made, to make an exception for CompoundExpression, but this is probably not easy to implement.

Now, to illustrate the stack accumulation process which I schematically described above, let us limit the number of recursive calls:

Clear[n, f, ff, fff];
n = 0;
f[x_ /; x < 5] := (n += 1; f[x + 1]);

ff[x_] := Null /; (n += 1; False);
ff[x_ /; x < 5] := ff[x + 1];

fff[x_ /; x < 5] := ce[n += 1, fff[x + 1]];

Tracing the evaluation:

In[57]:= Trace[f[1],f]
Out[57]= {f[1],n+=1;f[1+1],{f[2],n+=1;f[2+1],{f[3],n+=1;f[3+1],{f[4],n+=1;f[4+1]}}}}

In[58]:= Trace[ff[1],ff]
Out[58]= {ff[1],ff[1+1],ff[2],ff[2+1],ff[3],ff[3+1],ff[4],ff[4+1],ff[5]}

In[59]:= Trace[fff[1],fff]
Out[59]= {fff[1],ce[n+=1,fff[1+1]],{fff[2],ce[n+=1,fff[2+1]],{fff[3],ce[n+=1,fff[3+1]],   
{fff[4],ce[n+=1,fff[4+1]]}}}}

What you can see from this is that the expression stack accumulates for f and fff (the latter used just to show that this is a general mechanism, with ce[] just some arbitrary head), but not for ff, because, for the purposes of pattern matching, the first definition for ff is a rule tried but not matched, and the second definition rewrites ff[arg_] in its entirety, and does not generate deeper sub-parts that need further rewriting. So, the bottom line seems that you should analyze your function and see if its recursive calls will grow the evaluated expression from the bottom or not. If yes, it is not tail-recursive as far as Mathematica is concerned.

My answer would not be complete without showing how to do the tail call optimization manually. As an example, let us consider recursive implementation of Select. We will work with Mathematica linked lists to make it reasonably efficient rather than a toy. Below is the code for the non tail-recursive implementation:

Clear[toLinkedList, test, selrecBad, sel, selrec, selTR]
toLinkedList[x_List] := Fold[{#2, #1} &, {}, Reverse[x]];
selrecBad[fst_?test, rest_List] := {fst,If[rest === {}, {}, selrecBad @@ rest]};
selrecBad[fst_, rest_List] := If[rest === {}, {}, selrecBad @@ rest];
sel[x_List, testF_] := Block[{test = testF}, Flatten[selrecBad @@ toLinkedList[x]]]

The reason I use Block and selrecBad is to make it easier to use Trace. Now, this blows the stack on my machine:

Block[{$RecursionLimit = Infinity}, sel[Range[300000], EvenQ]] // Short // Timing

You can trace on small lists to see why:

In[7]:= Trace[sel[Range[5],OddQ],selrecBad]

Out[7]= {{{selrecBad[1,{2,{3,{4,{5,{}}}}}],{1,If[{2,{3,{4,{5,{}}}}}==={},{},selrecBad@@{2,{3,{4, 
{5,{}}}}}]},{selrecBad[2,{3,{4,{5,{}}}}],If[{3,{4,{5,{}}}}==={},{},selrecBad@@{3,{4,{5, 
{}}}}],selrecBad[3,{4,{5,{}}}],{3,If[{4,{5,{}}}==={},{},selrecBad@@{4,{5,{}}}]},{selrecBad[4,
{5,{}}],If[{5,{}}==={},{},selrecBad@@{5,{}}],selrecBad[5,{}],{5,If[{}==={},{},selrecBad@@{}]}}}}}}

What happens is that the result gets accumulated deeper and deeper in the list. The solution is to not grow the depth of the resulting expression, and one way to achieve that is to make selrecBad accept one extra parameter, which is the (linked) list of accumulated results:

selrec[{fst_?test, rest_List}, accum_List] := 
    If[rest === {}, {accum, fst}, selrec[rest, {accum, fst}]];
selrec[{fst_, rest_List}, accum_List] := 
    If[rest === {}, accum, selrec[rest, accum]]

And modify the main function accordingly:

selTR[x_List, testF_] := Block[{test = testF}, Flatten[selrec[toLinkedList[x], {}]]]

This will pass our power test just fine:

In[14]:= Block[{$IterationLimit= Infinity},selTR[Range[300000],EvenQ]]//Short//Timing

Out[14]= {0.813,{2,4,6,8,10,12,14,16,18,20,
<<149981>>,299984,299986,299988,299990,299992,299994,299996,299998,300000}}

(note that here we had to modify $IterationLimit, which is a good sign). And using Trace reveals the reason:

In[15]:= Trace[selTR[Range[5],OddQ],selrec]

Out[15]= {{{selrec[{1,{2,{3,{4,{5,{}}}}}},{}],If[{2,{3,{4,{5,{}}}}}==={},{{},1},selrec[{2,{3,{4, 
{5,{}}}}},{{},1}]],selrec[{2,{3,{4,{5,{}}}}},{{},1}],If[{3,{4,{5,{}}}}==={},{{},1},selrec[{3, 
{4,{5,{}}}},{{},1}]],selrec[{3,{4,{5,{}}}},{{},1}],If[{4,{5,{}}}==={},{{{},1},3},selrec[{4, 
{5,{}}},{{{},1},3}]],selrec[{4,{5,{}}},{{{},1},3}],If[{5,{}}==={},{{{},1},3},selrec[{5, 
{}},{{{},1},3}]],selrec[{5,{}},{{{},1},3}],If[{}==={},{{{{},1},3},5},selrec[{},{{{{},1},3},5}]]}}}

which is, this version does not accumulate the depth of the intermediate expression, since the results are kept in a separate list.

回复收藏 0 原文

吹泡泡o 2024-10-15 11:08:30

这个答案的想法是用一个不会使我们的表达式增长的包装器替换括号 ()。请注意，我们正在寻找替代函数的实际上是CompoundExpression，因为OP正确地指出该函数正在破坏尾递归（另请参阅Leonid的答案）。提供了两种解决方案。这定义了第一个包装器，

SetAttributes[wrapper, HoldRest];
wrapper[first_, fin_] := fin
wrapper[first_, rest__] := wrapper[rest]

然后我们可以

Clear[f]
k = 0;
mmm = 1000;
f[n_ /; n < mmm] := wrapper[k += n, f[n + 1]];
f[mmm] := k + mmm
Block[{$IterationLimit = Infinity}, f[0]]

正确计算 Total[Range[1000]]。

------注意-----

设置 As 会产生误导

wrapper[fin_] := fin;

请注意，在不发生尾递归的情况下

f[x_]:= wrapper[f[x+1]]

（因为具有 HoldRest 的包装器将在应用关联规则之前评估单数参数与包装器[fin_]）。

话又说回来，上面对 f 的定义没有用，因为我们可以简单地写出

f[x_]:= f[x+1]

And 来获得所需的尾递归。

------另一个注意事项-----

如果我们为包装器提供大量参数，它可能会比必要的慢。用户可以选择编写

f[x_]:=wrapper[g1;g2;g3;g4;g5;g6;g7  , f[x+1]]

第二个包装器

第二个包装器将其参数提供给CompoundExpression，因此如果提供了许多参数，则第二个包装器将比第一个包装器更快。这定义了第二个包装器。

SetAttributes[holdLastWrapper, HoldAll]
holdLastWrapper[fin_] := fin
holdLastWrapper[other_, fin_] := 
 Function[Null, #2, HoldRest][other, fin]
holdLastWrapper[others__, fin_] := 
 holdLastWrapper[
  Evaluate[CompoundExpression[others, Unevaluated[Sequence[]]]], fin]

注意：返回（空）序列通常在递归中可能非常有用。另请参阅我的答案

https://mathematica.stackexchange.com/questions/18949 /how-can-i-return-a-sequence

请注意，如果只提供一个参数，此函数仍然有效，因为它具有属性 HoldAll 而不是 HoldRest，因此设置

f[x]:= holdLastWrapper[f[x+1]]

将产生尾递归（包装器不会没有这种行为）。

速度比较

让我们创建一个很好的长指令列表（实际上是一个具有 Head Hold 的表达式）

nnnn = 1000;
incrHeld = 
  Prepend[DeleteCases[Hold @@ ConstantArray[Hold[c++], nnnn], 
    Hold, {2, Infinity}, Heads -> True], Unevaluated[c = 0]];

对于这些指令，我们可以将包装器与 CompoundExpression 的性能（和结果）进行比较

holdLastWrapper @@ incrHeld // Timing
CompoundExpression @@ incrHeld // Timing
wrapper @@ incrHeld // Timing

--> {{0.000856, 999}, {0.000783, 999}, {0.023752, 999}}

结论

如果您不确定尾递归何时发生或您将提供多少个参数，则第二个包装器更好到包装纸。如果您打算向包装器提供 2 个参数，例如，在您意识到第二个包装器所做的所有工作都是提供给CompoundExpression 并且您决定自己执行此操作的情况下，第一个包装器会更好。

-----最后注意----

在CompoundExpression[args, Unevaluated[expr]] 中，expr 仍然在CompoundExpression 被剥离之前被求值，所以这种类型的解决方案没有用处。

The idea of this answer is to replace the brackets () by a wrapper that does not make our expressions grow. Note that the function we are finding an alternative for is really CompoundExpression, as the OP was correct in remarking this function was ruining the tail recursion (see also the answer by Leonid). Two solutions are provided. This defines the first wrapper

SetAttributes[wrapper, HoldRest];
wrapper[first_, fin_] := fin
wrapper[first_, rest__] := wrapper[rest]

We then have that

Clear[f]
k = 0;
mmm = 1000;
f[n_ /; n < mmm] := wrapper[k += n, f[n + 1]];
f[mmm] := k + mmm
Block[{$IterationLimit = Infinity}, f[0]]

Correctly calculates Total[Range[1000]].

------Note-----

Note that it would be misleading to set

wrapper[fin_] := fin;

As in the case

f[x_]:= wrapper[f[x+1]]

Tail recursion does not occur (because of the fact that wrapper, having HoldRest, will evaluate the singular argument before applying the rule associated with wrapper[fin_]).

Then again, the definition above for f is not useful, as one could simply write

f[x_]:= f[x+1]

And have the desired tail recursion.

------Another note-----

In case we supply the wrapper with a lot arguments, it may be slower than necessary. The user may opt to write

f[x_]:=wrapper[g1;g2;g3;g4;g5;g6;g7  , f[x+1]]

Second wrapper

The second wrapper feeds its arguments to CompoundExpression and will therefore be faster than the first wrapper if many arguments are provided. This defines the second wrapper.

SetAttributes[holdLastWrapper, HoldAll]
holdLastWrapper[fin_] := fin
holdLastWrapper[other_, fin_] := 
 Function[Null, #2, HoldRest][other, fin]
holdLastWrapper[others__, fin_] := 
 holdLastWrapper[
  Evaluate[CompoundExpression[others, Unevaluated[Sequence[]]]], fin]

Note: Returning (empty) Sequences might be very useful in recursion in general. See also my answer here

https://mathematica.stackexchange.com/questions/18949/how-can-i-return-a-sequence

Note that this function will still work if only one argument is provided, as it has attribute HoldAll rather than HoldRest, so that setting

f[x]:= holdLastWrapper[f[x+1]]

Will yield a tail recursion (wrapper does not have this behavior).

Speed comparison

Let's create a nice long list (actually an expression with Head Hold) of instructions

nnnn = 1000;
incrHeld = 
  Prepend[DeleteCases[Hold @@ ConstantArray[Hold[c++], nnnn], 
    Hold, {2, Infinity}, Heads -> True], Unevaluated[c = 0]];

For these instructions, we can compare the performance (and outcome) of our wrappers with CompoundExpression

holdLastWrapper @@ incrHeld // Timing
CompoundExpression @@ incrHeld // Timing
wrapper @@ incrHeld // Timing

--> {{0.000856, 999}, {0.000783, 999}, {0.023752, 999}}

Conclusion

The second wrapper is better if you are not exactly sure when tail recursion will happen or how many arguments you will feed to the wrapper. If you are intent on feeding the wrapper 2 arguments, for example in the case where you realize all the second wrapper does is feed to CompoundExpression and you decide to do this yourself, the first wrapper is better.

-----final note----

In CompoundExpression[args, Unevaluated[expr]], expr still gets evaluated before CompoundExpression is stripped, so solutions of this type are no use.

回复收藏 0 原文

~没有更多了~