为什么 OCaml/F# 中的函数默认不是递归的?
为什么 F# 和 OCaml(可能还有其他语言)中的函数默认情况下不是递归的?
换句话说,为什么语言设计者认为在如下声明中显式地输入 rec
是个好主意:
let rec foo ... = ...
并且默认情况下不赋予函数递归功能? 为什么需要显式的 rec
构造?
Why is it that functions in F# and OCaml (and possibly other languages) are not by default recursive?
In other words, why did the language designers decide it was a good idea to explicitly make you type rec
in a declaration like:
let rec foo ... = ...
and not give the function recursive capability by default? Why the need for an explicit rec
construct?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
显式使用
rec
的一个关键原因是与 Hindley-Milner 类型推断有关,它是所有静态类型函数编程语言的基础(尽管以各种方式进行了更改和扩展)。如果您有一个定义
let fx = x
,您会期望它的类型为'a ->; 'a
并适用于不同点的不同'a
类型。 但同样,如果您编写let gx = (x + 1) + ...
,您会期望x
被视为int
> 在g
主体的其余部分中。Hindley-Milner 推理处理这种区别的方式是通过明确的泛化步骤。 在处理程序时的某些时刻,类型系统会停止并说“好吧,这些定义的类型此时将被泛化,这样当有人使用它们时,其类型中的任何自由类型变量都将是新鲜的 /em> 已实例化,因此不会干扰此定义的任何其他用途。”
事实证明,进行这种概括的合理位置是在检查一组相互递归的函数之后。 如果太早,您就会概括得太多,从而导致类型实际上可能发生冲突的情况。 稍后,您将概括得太少,从而做出无法与多个类型实例化一起使用的定义。
那么,考虑到类型检查器需要知道哪些定义集是相互递归的,它能做什么呢? 一种可能性是简单地对范围内的所有定义进行依赖性分析,并将它们重新排序为尽可能小的组。 Haskell 实际上是这样做的,但在像 F#(以及 OCaml 和 SML)这样具有不受限制的副作用的语言中,这是一个坏主意,因为它也可能对副作用进行重新排序。 因此,它要求用户明确标记哪些定义是相互递归的,从而扩展应该在哪里进行泛化。
One crucial reason for the explicit use of
rec
is to do with Hindley-Milner type inference, which underlies all staticly typed functional programming languages (albeit changed and extended in various ways).If you have a definition
let f x = x
, you'd expect it to have type'a -> 'a
and to be applicable on different'a
types at different points. But equally, if you writelet g x = (x + 1) + ...
, you'd expectx
to be treated as anint
in the rest of the body ofg
.The way that Hindley-Milner inference deals with this distinction is through an explicit generalisation step. At certain points when processing your program, the type system stops and says "ok, the types of these definitions will be generalised at this point, so that when someone uses them, any free type variables in their type will be freshly instantiated, and thus won't interfere with any other uses of this definition."
It turns out that the sensible place to do this generalisation is after checking a mutually recursive set of functions. Any earlier, and you'll generalise too much, leading to situations where types could actually collide. Any later, and you'll generalise too little, making definitions that can't be used with multiple type instantiations.
So, given that the type checker needs to know about which sets of definitions are mutually recursive, what can it do? One possibility is to simply do a dependency analysis on all the definitions in a scope, and reorder them into the smallest possible groups. Haskell actually does this, but in languages like F# (and OCaml and SML) which have unrestricted side-effects, this is a bad idea because it might reorder the side-effects too. So instead it asks the user to explicitly mark which definitions are mutually recursive, and thus by extension where generalisation should occur.
这是一个好主意有两个关键原因:
首先,如果启用递归定义,则无法引用同名值的先前绑定。 当您执行扩展现有模块之类的操作时,这通常是一个有用的习惯用法。
其次,递归值,尤其是相互递归值的集合,比按顺序进行的定义更难推理,每个新定义都建立在已定义的基础上。 阅读此类代码时最好保证,除了明确标记为递归的定义之外,新定义只能引用以前的定义。
There are two key reasons this is a good idea:
First, if you enable recursive definitions then you can't refer to a previous binding of a value of the same name. This is often a useful idiom when you are doing something like extending an existing module.
Second, recursive values, and especially sets of mutually recursive values, are much harder to reason about then are definitions that proceed in order, each new definition building on top of what has been already defined. It is nice when reading such code to have the guarantee that, except for definitions explicitly marked as recursive, new definitions can only refer to previous definitions.
原始 ML 的法国和英国后裔做出了不同的选择,他们的选择几十年来一直传承到现代变体。 所以这只是遗产,但它确实影响了这些语言中的习语。
在法语 CAML 语言系列(包括 OCaml)中,函数默认不是递归的。 这种选择使得在这些语言中使用
let
可以轻松地取代函数(和变量)定义,因为您可以在新定义的主体中引用先前的定义。 F# 从 OCaml 继承了此语法。例如,在 OCaml 中计算序列的香农熵时取代函数
p
:注意参数
p
如何转换为高阶shannon
函数被主体第一行中的另一个p
取代,然后主体第二行中的另一个p
取代。相反,ML 语言家族的英国 SML 分支采取了另一种选择,并且 SML 的
fun
绑定函数默认是递归的。 当大多数函数定义不需要访问其函数名称的先前绑定时,这会导致代码更简单。 但是,被取代的函数使用不同的名称(f1
、f2
等),这会污染作用域,并可能意外调用函数的错误“版本” 。 现在,隐式递归fun
绑定函数和非递归val
绑定函数之间存在差异。Haskell 可以通过将定义限制为纯定义来推断定义之间的依赖关系。 这使得玩具样品看起来更简单,但在其他方面却付出了巨大的代价。
请注意,Ganesh 和 Eddie 给出的答案是转移注意力的。 他们解释了为什么不能将函数组放置在巨大的
let rec ... and ...
中,因为它会影响类型变量的泛化时间。 这与 SML 中默认的rec
无关,但与 OCaml 中的默认值无关。The French and British descendants of the original ML made different choices and their choices have been inherited through the decades to the modern variants. So this is just legacy but it does affect idioms in these languages.
Functions are not recursive by default in the French CAML family of languages (including OCaml). This choice makes it easy to supercede function (and variable) definitions using
let
in those languages because you can refer to the previous definition inside the body of a new definition. F# inherited this syntax from OCaml.For example, superceding the function
p
when computing the Shannon entropy of a sequence in OCaml:Note how the argument
p
to the higher-ordershannon
function is superceded by anotherp
in the first line of the body and then anotherp
in the second line of the body.Conversely, the British SML branch of the ML family of languages took the other choice and SML's
fun
-bound functions are recursive by default. When most function definitions do not need access to previous bindings of their function name, this results in simpler code. However, superceded functions are made to use different names (f1
,f2
etc.) which pollutes the scope and makes it possible to accidentally invoke the wrong "version" of a function. And there is now a discrepancy between implicitly-recursivefun
-bound functions and non-recursiveval
-bound functions.Haskell makes it possible to infer the dependencies between definitions by restricting them to be pure. This makes toy samples look simpler but comes at a grave cost elsewhere.
Note that the answers given by Ganesh and Eddie are red herrings. They explained why groups of functions cannot be placed inside a giant
let rec ... and ...
because it affects when type variables get generalized. This has nothing to do withrec
being default in SML but not OCaml.一些猜测:
let
不仅用于绑定函数,还用于绑定其他常规值。 大多数形式的值不允许递归。 允许某些形式的递归值(例如函数、惰性表达式等),因此需要显式语法来指示这一点。let
构造; 它们是非递归的。 在Scheme中有一个单独的letrec
构造用于递归letSome guesses:
let
is not only used to bind functions, but also other regular values. Most forms of values are not allowed to be recursive. Certain forms of recursive values are allowed (e.g. functions, lazy expressions, etc.), so it needs an explicit syntax to indicate this.let
construct is similar to thelet
construct in Lisp and Scheme; which are non-recursive. There is a separateletrec
construct in Scheme for recursive let's其中很大一部分是它使程序员能够更好地控制其本地范围的复杂性。
let
、let*
和let rec
的频谱提供了不断增加的功耗和成本水平。let*
和let rec
本质上是简单let
的嵌套版本,因此使用其中任何一个都更昂贵。 此分级允许您对程序的优化进行微观管理,因为您可以选择手头任务所需的级别。 如果您不需要递归或不需要引用先前绑定的能力,那么您可以依靠简单的 let 来节省一些性能。它类似于Scheme 中的分级相等谓词。 (即
eq?
、eqv?
和等于?
)A big part of it is that it gives the programmer more control over the complexity of their local scopes. The spectrum of
let
,let*
andlet rec
offer an increasing level of both power and cost.let*
andlet rec
are in essence nested versions of the simplelet
, so using either one is more expensive. This grading allows you to micromanage the optimization of your program as you can choose which level of let you need for the task at hand. If you don't need recursion or the ability to refer to previous bindings, then you can fall back on a simple let to save a bit of performance.It's similar to the graded equality predicates in Scheme. (i.e.
eq?
,eqv?
andequal?
)给定:
前者
比较:
重新定义
f
,将之前定义的f
应用到将g
应用到a 的结果上
。 后者将f
重新定义为永远循环,将g
应用于a
,这在 ML 变体中通常不是您想要的。也就是说,这是语言设计师风格的事情。 放手去做。
Given this:
Compare:
With this:
The former redefines
f
to apply the previously definedf
to the result of applyingg
toa
. The latter redefinesf
to loop forever applyingg
toa
, which is usually not what you want in ML variants.That said, it's a language designer style thing. Just go with it.