Haskell 中 .NET 的 GroupBy 函数
.NET框架中的LINQ库确实有一个非常有用的函数,称为GroupBy,我一直在使用它。 它在 Haskell 中的类型如下所示
Ord b => (a-> b) -> [a] -> [(b, [a])]
它的目的是根据给定的分类函数 f
将项目分类到桶中,每个桶包含相似的项目,即 (b, l)
这样对于 l
中的任何项 x
,fx == b
。
它在 .NET 中的性能是 O(N),因为它使用哈希表,但在 Haskell 中我可以接受 O(N*log(N))。
我在标准 Haskell 库中找不到类似的东西。另外,我在标准函数方面的实现有点庞大:
myGroupBy :: Ord k => (a -> k) -> [a] -> [(k, [a])]
myGroupBy f = map toFst
. groupBy ((==) `on` fst)
. sortBy (comparing fst)
. map (\a -> (f a, a))
where
toFst l@((k,_):_) = (k, map snd l)
这绝对不是我想在特定于问题的代码中看到的东西。
我的问题是:如何最大限度地利用标准库来实现这个功能?
此外,似乎缺乏这样的标准函数暗示经验丰富的 Haskellers 可能很少需要它,因为他们可能知道一些更好的方法。这是真的吗?可以用什么更好的方式实现类似的功能?
另外,考虑到 groupBy
已经被采用,它的好名字是什么? :)
LINQ library in .NET framework does have a very useful function called GroupBy, which I have been using all the time.
Its type in Haskell would look like
Ord b => (a-> b) -> [a] -> [(b, [a])]
Its purpose is to classify items based on the given classification function f
into buckets, with each bucket containing similar items, that is (b, l)
such that for any item x
in l
, f x == b
.
Its performance in .NET is O(N) because it uses hash-tables, but in Haskell I am OK with O(N*log(N)).
I can't find anything similar in standard Haskell libraries. Also, my implementation in terms of standard functions is somewhat bulky:
myGroupBy :: Ord k => (a -> k) -> [a] -> [(k, [a])]
myGroupBy f = map toFst
. groupBy ((==) `on` fst)
. sortBy (comparing fst)
. map (\a -> (f a, a))
where
toFst l@((k,_):_) = (k, map snd l)
This is definitely not something I want to see amongst my problem-specific code.
My question is: how can I implement this function nicely exploiting standard libraries to their maximum?
Also, the seeming absence of such a standard function hints that it may rarely be needed by experienced Haskellers because they may know some better way. Is that true? What can be used to implement similar functionality in a better way?
Also, what would be the good name for it, considering groupBy
is already taken? :)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
GHC.Exts.groupWith
作为广义列表理解的一部分引入:http://www.haskell.org/ghc/docs/7.0.2/html/users_guide/syntax-extns.html#generalized-list-compressives
GHC.Exts.groupWith
Introduced as part of generalised list comprehensions: http://www.haskell.org/ghc/docs/7.0.2/html/users_guide/syntax-extns.html#generalised-list-comprehensions
使用 Data.Map 作为中间结构:
map 操作将输入列表转换为与包含元素的单例列表配对的键列表。
M.fromListWith (++)
将其转换为Data.Map
,当两个项目具有相同的键时连接,并且M.toList
得到两人再次退出。请注意,这会颠倒列表,因此如有必要,请进行调整。例如,如果您只需要每个组中元素的总和,也可以轻松地将
return
和(++)
替换为其他类似幺半群的操作。Using
Data.Map
as the intermediate structure:The
map
operation turns the input list into a list of keys paired with singleton lists containing the elements.M.fromListWith (++)
turns this into aData.Map
, concatenating when two items have the same key, andM.toList
gets the pairs back out again.Note that this reverses the lists, so adjust for that if necessary. It is also easy to replace
return
and(++)
with other monoid-like operations if you for example only wanted the sum of the elements in each group.