Haskell:“groupBy”的令人惊讶的行为
我试图找出库函数 groupBy (来自 Data.List)的行为,它旨在通过作为第一个参数传入的“相等测试”函数对列表的元素进行分组。类型签名表明相等测试只需要具有类型
(a -> a -> Bool)
但是,当我在 GHCi 6.6 中使用 (<) 作为“相等测试”时,结果不是我所期望的:
ghci> groupBy (<) [1, 2, 3, 2, 4, 1, 5, 9]
[[1,2,3,2,4],[1,5,9]]
相反,我期望运行严格递增的数字,就像这样:
[[1,2,3],[2,4],[1,5,9]]
我错过了什么?
I'm trying to figure out the behavior of the library function groupBy (from Data.List), which purports to group elements of a list by an "equality test" function passed in as the first argument. The type signature suggests that the equality test just needs to have type
(a -> a -> Bool)
However, when I use (<) as the "equality test" in GHCi 6.6, the results are not what I expect:
ghci> groupBy (<) [1, 2, 3, 2, 4, 1, 5, 9]
[[1,2,3,2,4],[1,5,9]]
Instead I'd expect runs of strictly increasing numbers, like this:
[[1,2,3],[2,4],[1,5,9]]
What am I missing?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
查看 groupBy:
现在比较这两个输出:
简而言之,发生的情况是这样的:
groupBy
假设给定函数(第一个参数)测试相等性,因此假设比较函数是自反、传递和对称(参见等价关系)。这里的问题是小于关系不是自反的,也不是对称的。编辑:以下实现仅假设传递性:
Have a look at the ghc implementation of groupBy:
Now compare these two outputs:
In short, what happens is this:
groupBy
assumes that the given function (the first argument) tests for equality, and thus assumes that the comparison function is reflexive, transitive and symmetric (see equivalence relation). The problem here is that the less-than relation is not reflexive, nor symmetric.Edit: The following implementation only assumes transitivity:
事实是“<”不是平等测试。
您可能会期望某些行为,因为您会以不同的方式实现它,但这并不是它所承诺的。
为什么它输出的是合理答案的一个例子是,如果它扫描它,那么
Now 有 3 组相等的元素。因此,它检查它们中的任何一个实际上是否相同:
因为它知道每个组中的所有元素都相等,所以它可以只查看每个组中的第一个元素,1、2 和 1。
1 > > 2?是的!所以它合并了前两组。
1> 1?不!所以剩下最后一组。
现在它比较所有元素是否相等。
...只是,您没有传递给它它所期望的那种功能。
简而言之,当它需要相等性测试时,给它一个相等性测试。
The fact that "<" isn't an equality test.
You might expect some behavior because you'd implement it differently, but that isn't what it promises.
An example of why what it outputs is a reasonable answer is if it sweeps through it, doing
Now has 3 groups of equal elements. So it checks if any of them are in fact the same:
Since it knows all elements in each group is equal, it can just look at the first element in each, 1, 2 and 1.
1 > 2? Yes! So it merges the first two groups.
1 > 1? No! So it leaves the last group be.
And now it's compared all elements for equality.
...only, you didn't pass it the kind of function it expected.
In short, when it wants an equality test, give it an equality test.
问题在于 Haskell 报告中
groupBy
的参考实现将元素与第一个元素进行比较,因此组并不是严格递增的(它们只需全部大于第一个元素)。相反,您想要的是在相邻元素上进行测试的groupBy
版本,例如实现此处。The problem is that the reference implementation of
groupBy
in the Haskell Report compares elements against the first element, so the groups are not strictly increasing (they just have to be all bigger than the first element). What you want instead is a version ofgroupBy
that tests on adjacent elements, like the implementation here.我想指出的是,groupBy 函数还要求在应用之前对列表进行排序。
例如:
出现此行为是因为 groupBy 在其定义中使用了 span。为了获得不依赖于我们以任何特定顺序拥有底层列表的合理行为,我们可以定义一个函数:
I'd just like to point out that the groupBy function also requires your list to be sorted before being applied.
For example:
This behaviour comes about because groupBy is using span in its definition. To get reasonable behaviour which doesn't rely on us having the underlying list in any particular order we can define a function: