推断类型似乎检测到无限循环,但到底发生了什么?
在 Andrew Koenig 的关于 ML 类型推断的轶事中,作者使用合并排序的实现作为机器学习的学习练习,并很高兴发现一个“不正确”的结果类型推断。
令我惊讶的是,编译器报告了一种类型
'列表->整数列表
换句话说,此排序函数接受任何类型的列表并返回整数列表。
那是不可能的。输出必须是输入的排列;它怎么可能有不同的类型?读者肯定会发现我的第一冲动很熟悉:我想知道我是否发现了编译器中的错误!
经过更多思考,我意识到函数还有另一种方式可以忽略它的参数:也许它有时根本不返回。事实上,当我尝试时,确实发生了这样的情况:
sort(nil)
确实返回了nil
,但是对任何非空列表进行排序将进入无限递归循环。
当翻译成 Haskell
split [] = ([], [])
split [x] = ([x], [])
split (x:y:xs) = (x:s1, y:s2)
where (s1,s2) = split xs
merge xs [] = xs
merge [] ys = ys
merge xx@(x:xs) yy@(y:ys)
| x < y = x : merge xs yy
| otherwise = y : merge xx ys
mergesort [] = []
mergesort xs = merge (mergesort p) (mergesort q)
where (p,q) = split xs
GHC 时,推断出类似的类型:
*Main> :t mergesort
mergesort :: (Ord a) => [t] -> [a]
How does the Damas–Hindley–Milner 算法 推断出这个类型?
In Andrew Koenig’s An anecdote about ML type inference, the author uses implementation of merge sort as a learning exercise for ML and is pleased to find an “incorrect” type inference.
Much to my surprise, the compiler reported a type of
'a list -> int list
In other words, this sort function accepts a list of any type at all and returns a list of integers.
That is impossible. The output must be a permutation of the input; how can it possibly have a different type? The reader will surely find my first impulse familiar: I wondered if I had uncovered a bug in the compiler!
After thinking about it some more, I realized that there was another way in which the function could ignore its argument: perhaps it sometimes didn't return at all. Indeed, when I tried it, that is exactly what happened:
sort(nil)
did returnnil
, but sorting any non-empty list would go into an infinite recursion loop.
When translated to Haskell
split [] = ([], [])
split [x] = ([x], [])
split (x:y:xs) = (x:s1, y:s2)
where (s1,s2) = split xs
merge xs [] = xs
merge [] ys = ys
merge xx@(x:xs) yy@(y:ys)
| x < y = x : merge xs yy
| otherwise = y : merge xx ys
mergesort [] = []
mergesort xs = merge (mergesort p) (mergesort q)
where (p,q) = split xs
GHC infers a similar type:
*Main> :t mergesort
mergesort :: (Ord a) => [t] -> [a]
How does the Damas–Hindley–Milner algorithm infer this type?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
这确实是一个了不起的例子;本质上,在编译时检测到无限循环!在这个例子中,Hindley-Milner 推论没有什么特别之处;它就像往常一样进行。
请注意,ghc 正确获取
split
和merge
的类型:现在,当谈到
mergesort
时,它通常是一个函数 t< sub>1→t2 对于某些类型 t1 和 t2。然后它看到第一行:并意识到 t1 和 t2 必须是列表类型,比如 t1=[t 3] 和 t2=[t4]。所以归并排序必须是一个函数[t3]→[t4]。下一行
告诉它:
p
和q
也是 [t3] 类型,因为split
是 [a]→([ a],[a])mergesort p
,因此,(回想一下,mergesort 被认为是 [t3]→[t4 类型>]) 的类型为 [t4]。mergesort q
的类型为 [t4]。merge
的类型为(Ord t) =>; [t]-> [t]-> [t]
,并且表达式merge (mergesort p) (mergesort q)
中的输入均为 [t4] 类型,即类型 t< sub>4 必须位于Ord
中。merge (mergesort p) (mergesort q)
的类型与其两个输入相同,即 [t4]。这符合之前已知的mergesort
类型 [t3]→[t4],因此无需进行更多推断Hindley-Milner 算法的“统一”部分已经完成。mergesort
的类型为 [t3]→[t4],其中 t4 在Ord< /代码>。
这就是为什么你会得到:(
上面的逻辑推理描述等同于算法所做的事情,但是算法遵循的具体步骤顺序只是维基百科页面上给出的,例如。)
This is indeed a remarkable example; an infinite loop being detected, essentially, at compile time! There is nothing special about the Hindley–Milner inference in this example; it just proceeds as usual.
Note that ghc gets the types of
split
andmerge
correctly:Now when it comes to
mergesort
, it is, in general, a function t1→t2 for some types t1 and t2. Then it sees the first line:and realizes that t1 and t2 must be list types, say t1=[t3] and t2=[t4]. So mergesort must be a function [t3]→[t4]. The next line
tells it that:
p
andq
are also of type [t3], sincesplit
is [a]→([a],[a])mergesort p
, therefore, (recall that mergesort is believed to be of type [t3]→[t4]) is of type [t4].mergesort q
is of type [t4] for exactly the same reason.merge
has type(Ord t) => [t] -> [t] -> [t]
, and the inputs in the expressionmerge (mergesort p) (mergesort q)
are both of type [t4], the type t4 must be inOrd
.merge (mergesort p) (mergesort q)
is the same as both its inputs, namely [t4]. This fits with the previously known type [t3]→[t4] formergesort
, so there are no more inferences to be done and the "unification" part of the Hindley–Milner algorithm is complete.mergesort
is of type [t3]→[t4] with t4 inOrd
.That's why you get:
(The description above in terms of logical inference is equivalent to what the algorithm does, but the specific sequence of steps the algorithm follows is simply that given on the Wikipedia page, for example.)
可以推断出该类型,因为它看到您将
mergesort
的结果传递给merge
,后者又将列表的头部与<
进行比较>,它是 Ord 类型类的一部分。因此类型推断可以推断它必须返回 Ord 实例的列表。当然,由于它实际上无限递归,因此我们无法推断出有关它实际未返回的类型的任何其他信息。That type can be inferred because it sees that you pass the result of
mergesort
tomerge
, which in turn compares the heads of the lists with<
, which is part of the Ord typeclass. So the type inference can reason that it must return a list of an instance of Ord. Of course, since it actually recurses infinitely, we can't infer anything else about the type it doesn't actually return.