Data.Foldable for unordered containers
I'm working on a Haskell-meets-SQL language for database manipulations, and on a common type class library to go with it, cribbing from Hackage wherever it makes sense.
Because a significant objective of a database query optimizer is to eliminate unnecessary sorting, it's important to preserve a static representation of where sorting is in fact necessary. Which brings us to defining a typeclass for folds.
Haskell's Data.Foldable has (eliding default definitions which aren't relevant to the point I'm making):
class Foldable t where
    -- | Combine the elements of a structure using a monoid.
    fold :: Monoid m => t m -> m

    -- | Map each element of the structure to a monoid,
    -- and combine the results.
    foldMap :: Monoid m => (a -> m) -> t a -> m

    -- | Right-associative fold of a structure.
    foldr :: (a -> b -> b) -> b -> t a -> b

    -- | Left-associative fold of a structure.
    foldl :: (a -> b -> a) -> a -> t b -> a

    -- | A variant of 'foldr' that has no base case,
    -- and thus may only be applied to non-empty structures.
    foldr1 :: (a -> a -> a) -> t a -> a

    -- | A variant of 'foldl' that has no base case,
    -- and thus may only be applied to non-empty structures.
    foldl1 :: (a -> a -> a) -> t a -> a
It seems to me that this class ignores a distinction which is, for practical purposes, not so important to most Haskell applications but of much more interest in a database setting. To wit: all Data.Foldable instances come with an ordering.
What is the name for the generalization of this concept that applies at container types which don't impose an ordering on their elements?
For Haskell's Data.Set it works out fine, because there is an Ord context required by the implementation. The ordering requirement is an implementation artifact though, and for many useful types the ordering being used may not have any domain-level meaning.
For sets more generally, the fold :: Monoid m => t m -> m definition on its own is mostly right (so is foldMap). I say mostly because its type includes the associativity law (through the definition of Monoid) but not the required commutativity law. The other variants don't even exist for an unordered container.
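To make the missing law concrete, here is a minimal sketch of what such a class could look like. The names CommutativeMonoid, UnorderedFoldable, ufoldMap, ufold and total are placeholders invented for illustration, not an existing library API, and the commutativity law is only a convention that the compiler cannot check. It assumes the containers package for Data.Set.

```haskell
import qualified Data.Set as Set
import Data.Monoid (Sum (..))

-- Hypothetical marker class: instances promise that (<>) is commutative.
-- Like the Monoid laws, this is a convention the compiler cannot verify.
class Monoid m => CommutativeMonoid m

-- Addition is commutative, so Sum qualifies.
instance Num a => CommutativeMonoid (Sum a)

-- Hypothetical class for containers with no observable element order.
-- Only commutative monoids are accepted, so the result cannot depend
-- on the order in which elements happen to be visited.
class UnorderedFoldable f where
  ufoldMap :: CommutativeMonoid m => (a -> m) -> f a -> m

-- Set is traversed in its internal order, but a lawful commutative
-- monoid cannot reveal that order in the result.
instance UnorderedFoldable Set.Set where
  ufoldMap = foldMap

-- Counterpart of 'fold' for containers that already hold monoidal values.
ufold :: (UnorderedFoldable f, CommutativeMonoid m) => f m -> m
ufold = ufoldMap id

-- Example: a sum never depends on traversal order.
total :: Set.Set Int -> Int
total = getSum . ufoldMap Sum
```

Nothing resembling foldr or foldl appears in the class, which is the point: those would expose a traversal order.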
I don't want to introduce sorts where they aren't needed. I also don't want to introduce non-determinism where it can't be tracked. I'm interested in building a language and library that doesn't have a toList :: Set a -> [a] function lying around anywhere, because it introduces a dichotomy between:
- Allowing people to observe implementation details about how a set/relation is physically stored
- Losing track of non-determinism as an effect
Obviously both sortBy :: (a -> a -> Ordering) -> Set a -> [a] and shuffle :: Set a -> Data.Random.RVar [a] are useful, unobjectionable, and will be included. In fact, sortBy has an even more general type as sortBy :: (TheUnorderedFoldableClassIAmTryingToName f) => (a -> a -> Ordering) -> f a -> [a].
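One way the more general sortBy might be built on top of such a class is sketched below. Everything except Data.List.sortBy and Data.Set is hypothetical: CommutativeMonoid, UnorderedFoldable, ufoldMap and Bag are illustrative names. The sketch assumes Bag's constructor would be kept private in a real library, since list append is commutative only as long as nobody inspects the order. Note also that Data.List.sortBy is stable, so the result is fully determined only when the comparator never returns EQ for distinguishable elements.

```haskell
import qualified Data.List as List
import qualified Data.Set as Set

-- Hypothetical marker class: (<>) is promised to be commutative.
class Monoid m => CommutativeMonoid m

-- Hypothetical class for containers with no observable element order.
class UnorderedFoldable f where
  ufoldMap :: CommutativeMonoid m => (a -> m) -> f a -> m

instance UnorderedFoldable Set.Set where
  ufoldMap = foldMap

-- An opaque bag of elements. Assumption: the constructor is not exported,
-- so the internal list order can only be consumed by order-insensitive
-- functions such as the sortBy below.
newtype Bag a = Bag [a]

instance Semigroup (Bag a) where
  Bag xs <> Bag ys = Bag (xs ++ ys)

instance Monoid (Bag a) where
  mempty = Bag []

instance CommutativeMonoid (Bag a)

-- Collect the elements into a bag, then impose the requested order.
-- The explicit sort is the only place an ordering enters.
sortBy :: UnorderedFoldable f => (a -> a -> Ordering) -> f a -> [a]
sortBy cmp = List.sortBy cmp . unBag . ufoldMap (\x -> Bag [x])
  where
    unBag (Bag xs) = xs
```

With this shape, a query planner can see statically that a sort happens here and nowhere else.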
What is this idea called? If I'm way off base, where did I leave the base path?
Comments (1)
The operation performed by your fold-like operator would not operate over a monoid, but rather, a commutative semigroup. That gives you
op :: (CSemi a) => f a -> a -> a
In the literature I've seen, the typical name for your operator/typeclass would just be CFold, short for commutative fold. (YAHT also uses cfold as the name for a CPS-style fold, but I don't think that's in common usage.)
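As a rough illustration of that signature, here is a small sketch. CSemi and CFold follow the names used above, but the class definitions themselves are assumptions made for this example rather than an existing library; Max comes from Data.Semigroup in base, and Data.Set from the containers package. The extra argument acts as a seed: a semigroup has no identity element, so an empty container needs a starting value.

```haskell
import Data.Semigroup (Max (..))
import qualified Data.Set as Set

-- Hypothetical class: a semigroup whose (<>) is also commutative.
-- The law is a promise made by each instance, not checked by the compiler.
class Semigroup a => CSemi a

-- Taking a maximum is associative and commutative.
instance Ord a => CSemi (Max a)

-- Hypothetical commutative fold: combine every element with a seed value.
class CFold f where
  cfold :: CSemi a => f a -> a -> a

-- The traversal order used here is irrelevant because (<>) commutes.
instance CFold Set.Set where
  cfold s seed = foldr (<>) seed s

-- Example: the largest element of a set, or the seed if the set is empty.
largest :: Set.Set Int -> Int -> Int
largest s seed = getMax (cfold (Set.map Max s) (Max seed))
```

For instance, largest (Set.fromList [3, 9, 4]) 0 combines 3, 9, 4 and the seed 0 under max.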