为什么 OCaml/F# 中的函数默认不是递归的?

发布于 2024-07-21 19:46:57 字数 210 浏览 4 评论 0原文

为什么 F# 和 OCaml(可能还有其他语言)中的函数默认情况下不是递归的?

换句话说,为什么语言设计者认为在如下声明中显式地输入 rec 是个好主意:

let rec foo ... = ...

并且默认情况下不赋予函数递归功能? 为什么需要显式的 rec 构造?

Why is it that functions in F# and OCaml (and possibly other languages) are not by default recursive?

In other words, why did the language designers decide it was a good idea to explicitly make you type rec in a declaration like:

let rec foo ... = ...

and not give the function recursive capability by default? Why the need for an explicit rec construct?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

显式使用 rec 的一个关键原因是与 Hindley-Milner 类型推断有关,它是所有静态类型函数编程语言的基础(尽管以各种方式进行了更改和扩展)。

如果您有一个定义 let fx = x,您会期望它的类型为 'a ->; 'a 并适用于不同点的不同 'a 类型。 但同样,如果您编写 let gx = (x + 1) + ...,您会期望 x 被视为 int > 在g 主体的其余部分中。

Hindley-Milner 推理处理这种区别的方式是通过明确的泛化步骤。 在处理程序时的某些时刻,类型系统会停止并说“好吧,这些定义的类型此时将被泛化,这样当有人使用它们时,其类型中的任何自由类型变量都将是新鲜的 /em> 已实例化,因此不会干扰此定义的任何其他用途。”

事实证明,进行这种概括的合理位置是在检查一组相互递归的函数之后。 如果太早,您就会概括得太多,从而导致类型实际上可能发生冲突的情况。 稍后,您将概括得太少,从而做出无法与多个类型实例化一起使用的定义。

那么,考虑到类型检查器需要知道哪些定义集是相互递归的,它能做什么呢? 一种可能性是简单地对范围内的所有定义进行依赖性分析,并将它们重新排序为尽可能小的组。 Haskell 实际上是这样做的,但在像 F#(以及 OCaml 和 SML)这样具有不受限制的副作用的语言中,这是一个坏主意,因为它也可能对副作用进行重新排序。 因此,它要求用户明确标记哪些定义是相互递归的,从而扩展应该在哪里进行泛化。

One crucial reason for the explicit use of rec is to do with Hindley-Milner type inference, which underlies all staticly typed functional programming languages (albeit changed and extended in various ways).

If you have a definition let f x = x, you'd expect it to have type 'a -> 'a and to be applicable on different 'a types at different points. But equally, if you write let g x = (x + 1) + ..., you'd expect x to be treated as an int in the rest of the body of g.

The way that Hindley-Milner inference deals with this distinction is through an explicit generalisation step. At certain points when processing your program, the type system stops and says "ok, the types of these definitions will be generalised at this point, so that when someone uses them, any free type variables in their type will be freshly instantiated, and thus won't interfere with any other uses of this definition."

It turns out that the sensible place to do this generalisation is after checking a mutually recursive set of functions. Any earlier, and you'll generalise too much, leading to situations where types could actually collide. Any later, and you'll generalise too little, making definitions that can't be used with multiple type instantiations.

So, given that the type checker needs to know about which sets of definitions are mutually recursive, what can it do? One possibility is to simply do a dependency analysis on all the definitions in a scope, and reorder them into the smallest possible groups. Haskell actually does this, but in languages like F# (and OCaml and SML) which have unrestricted side-effects, this is a bad idea because it might reorder the side-effects too. So instead it asks the user to explicitly mark which definitions are mutually recursive, and thus by extension where generalisation should occur.

山有枢 2024-07-28 19:46:57

这是一个好主意有两个关键原因:

首先,如果启用递归定义,则无法引用同名值的先前绑定。 当您执行扩展现有模块之类的操作时,这通常是一个有用的习惯用法。

其次,递归值,尤其是相互递归值的集合,比按顺序进行的定义更难推理,每个新定义都建立在已定义的基础上。 阅读此类代码时最好保证,除了明确标记为递归的定义之外,新定义只能引用以前的定义。

There are two key reasons this is a good idea:

First, if you enable recursive definitions then you can't refer to a previous binding of a value of the same name. This is often a useful idiom when you are doing something like extending an existing module.

Second, recursive values, and especially sets of mutually recursive values, are much harder to reason about then are definitions that proceed in order, each new definition building on top of what has been already defined. It is nice when reading such code to have the guarantee that, except for definitions explicitly marked as recursive, new definitions can only refer to previous definitions.

隱形的亼 2024-07-28 19:46:57

原始 ML 的法国和英国后裔做出了不同的选择,他们的选择几十年来一直传承到现代变体。 所以这只是遗产,但它确实影响了这些语言中的习语。

在法语 CAML 语言系列(包括 OCaml)中,函数默认不是递归的。 这种选择使得在这些语言中使用 let 可以轻松地取代函数(和变量)定义,因为您可以在新定义的主体中引用先前的定义。 F# 从 OCaml 继承了此语法。

例如,在 OCaml 中计算序列的香农熵时取代函数 p

let shannon fold p =
  let p x = p x *. log(p x) /. log 2.0 in
  let p t x = t +. p x in
  -. fold p 0.0

注意参数 p 如何转换为高阶 shannon函数被主体第一行中的另一个 p 取代,然后主体第二行中的另一个 p 取代。

相反,ML 语言家族的英国 SML 分支采取了另一种选择,并且 SML 的 fun 绑定函数默认是递归的。 当大多数函数定义不需要访问其函数名称的先前绑定时,这会导致代码更简单。 但是,被取代的函数使用不同的名称(f1f2 等),这会污染作用域,并可能意外调用函数的错误“版本” 。 现在,隐式递归 fun 绑定函数和非递归 val 绑定函数之间存在差异。

Haskell 可以通过将定义限制为纯定义来推断定义之间的依赖关系。 这使得玩具样品看起来更简单,但在其他方面却付出了巨大的代价。

请注意,Ganesh 和 Eddie 给出的答案是转移注意力的。 他们解释了为什么不能将函数组放置在巨大的 let rec ... and ... 中,因为它会影响类型变量的泛化时间。 这与 SML 中默认的 rec 无关,但与 OCaml 中的默认值无关。

The French and British descendants of the original ML made different choices and their choices have been inherited through the decades to the modern variants. So this is just legacy but it does affect idioms in these languages.

Functions are not recursive by default in the French CAML family of languages (including OCaml). This choice makes it easy to supercede function (and variable) definitions using let in those languages because you can refer to the previous definition inside the body of a new definition. F# inherited this syntax from OCaml.

For example, superceding the function p when computing the Shannon entropy of a sequence in OCaml:

let shannon fold p =
  let p x = p x *. log(p x) /. log 2.0 in
  let p t x = t +. p x in
  -. fold p 0.0

Note how the argument p to the higher-order shannon function is superceded by another p in the first line of the body and then another p in the second line of the body.

Conversely, the British SML branch of the ML family of languages took the other choice and SML's fun-bound functions are recursive by default. When most function definitions do not need access to previous bindings of their function name, this results in simpler code. However, superceded functions are made to use different names (f1, f2 etc.) which pollutes the scope and makes it possible to accidentally invoke the wrong "version" of a function. And there is now a discrepancy between implicitly-recursive fun-bound functions and non-recursive val-bound functions.

Haskell makes it possible to infer the dependencies between definitions by restricting them to be pure. This makes toy samples look simpler but comes at a grave cost elsewhere.

Note that the answers given by Ganesh and Eddie are red herrings. They explained why groups of functions cannot be placed inside a giant let rec ... and ... because it affects when type variables get generalized. This has nothing to do with rec being default in SML but not OCaml.

九厘米的零° 2024-07-28 19:46:57

一些猜测:

  • let 不仅用于绑定函数,还用于绑定其他常规值。 大多数形式的值不允许递归。 允许某些形式的递归值(例如函数、惰性表达式等),因此需要显式语法来指示这一点。
  • 优化非递归函数可能会更容易创建
  • 递归函数时创建的闭包需要包含一个指向函数本身的条目(以便函数可以递归地调用自身),这使得递归闭包比非递归闭包更复杂关闭。 因此,当您不需要递归时,能够创建更简单的非递归闭包可能会很好。
  • 它允许您根据先前定义的函数或同名值来定义函数; 虽然我认为这是不好的做法
  • 额外的安全? 确保你正在做你想做的事情。 例如,如果您不希望它是递归的,但您不小心在函数内部使用了与函数本身同名的名称,它很可能会抱怨(除非该名称之前已经定义过)
  • 。 code> 构造类似于 Lisp 和 Scheme 中的 let 构造; 它们是非递归的。 在Scheme中有一个单独的letrec构造用于递归let

Some guesses:

  • let is not only used to bind functions, but also other regular values. Most forms of values are not allowed to be recursive. Certain forms of recursive values are allowed (e.g. functions, lazy expressions, etc.), so it needs an explicit syntax to indicate this.
  • It might be easier to optimize non-recursive functions
  • The closure created when you create a recursive function needs to include an entry that points to the function itself (so the function can recursively call itself), which makes recursive closures more complicated than non-recursive closures. So it might be nice to be able to create simpler non-recursive closures when you don't need recursion
  • It allows you to define a function in terms of a previously-defined function or value of the same name; although I think this is bad practice
  • Extra safety? Makes sure that you are doing what you intended. e.g. If you don't intend it to be recursive but you accidentally used a name inside the function with the same name as the function itself, it will most likely complain (unless the name has been defined before)
  • The let construct is similar to the let construct in Lisp and Scheme; which are non-recursive. There is a separate letrec construct in Scheme for recursive let's
阪姬 2024-07-28 19:46:57

其中很大一部分是它使程序员能够更好地控制其本地范围的复杂性。 letlet*let rec 的频谱提供了不断增加的功耗和成本水平。 let*let rec 本质上是简单 let 的嵌套版本,因此使用其中任何一个都更昂贵。 此分级允许您对程序的优化进行微观管理,因为您可以选择手头任务所需的级别。 如果您不需要递归或不需要引用先前绑定的能力,那么您可以依靠简单的 let 来节省一些性能。

它类似于Scheme 中的分级相等谓词。 (即 eq?eqv?等于?

A big part of it is that it gives the programmer more control over the complexity of their local scopes. The spectrum of let, let* and let rec offer an increasing level of both power and cost. let* and let rec are in essence nested versions of the simple let, so using either one is more expensive. This grading allows you to micromanage the optimization of your program as you can choose which level of let you need for the task at hand. If you don't need recursion or the ability to refer to previous bindings, then you can fall back on a simple let to save a bit of performance.

It's similar to the graded equality predicates in Scheme. (i.e. eq?, eqv? and equal?)

堇年纸鸢 2024-07-28 19:46:57

给定:

let f x = ... and g y = ...;;

前者

let f a = f (g a)

比较:

let rec f a = f (g a)

重新定义 f,将之前定义的 f 应用到将 g 应用到 a 的结果上。 后者将 f 重新定义为永远循环,将 g 应用于 a,这在 ML 变体中通常不是您想要的。

也就是说,这是语言设计师风格的事情。 放手去做。

Given this:

let f x = ... and g y = ...;;

Compare:

let f a = f (g a)

With this:

let rec f a = f (g a)

The former redefines f to apply the previously defined f to the result of applying g to a. The latter redefines f to loop forever applying g to a, which is usually not what you want in ML variants.

That said, it's a language designer style thing. Just go with it.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文