折叠优化
我只是好奇是否有任何(仅限一阶多态)折叠优化。
对于地图来说,有森林砍伐:map g (map f ls) =>; map (g . f) ls
和 rev (map f ls) => rev_map fl ls(在 Ocaml 中更快)。
但折叠是如此强大,似乎无法进行任何优化。
I am just curious if there are any (first order polymorphic only) optimisations with folds.
For maps, there's deforestation: map g (map f ls) => map (g . f) ls
, and rev (map f ls) => rev_map f ls
(faster in Ocaml).
But fold is so powerful, it seems to defy any optimisation.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
显而易见的:
您可能对主题为“使用香蕉、镜头、信封和铁丝网进行函数式编程”的经典论文感兴趣。但请注意,它是技术性的并且具有难以理解的符号。
http://citeseerx.ist.psu.edu/viewdoc/summary ?doi=10.1.1.41.125
编辑:我的第一条规则的第一个版本是错误的,感谢 vincent-hugot 的编辑。
The obvious ones:
You may be interested in the classical paper on the topic, "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire". Beware however that it is technical and has impenetrable notation.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.125
Edit: my first version of the first rule was wrong, edited thanks to vincent-hugot.
您可以在褶皱上使用砍伐森林。事实上,
map/map
融合是其中的一个特例。诀窍是用特殊的
build
函数替换列表构造:现在,使用
foldr
的标准定义,我们有以下等价:(
实际上这仅在某些情况下成立,但常见情况,请参阅“快捷融合的正确性”)。
如果您使用
build
编写列表生成函数(包括map
),并使用foldr
编写消费者,则上述等式可以删除大多数中间列表。 Haskell 的列表推导式被翻译为build
和foldr
的组合。这种方法的缺点是它无法处理左侧折叠。不过,Stream Fusion 可以很好地处理这个问题。它将列表生成器和转换器表示为流(共归纳数据类型,有点像迭代器)。上面的论文可读性很强,建议看一下。
Gasche 提到的“香蕉”论文更详细地介绍了折叠的种类及其等效项。
最后,还有 Bird 和 Moor 的“编程代数” ,其中提到了诸如将两个折叠合并为一个之类的转换。
You can use deforestation on folds. In fact,
map/map
fusion is a special case of that.The trick is to replace list construction by a special
build
function:Now, using the standard definition of
foldr
We have the following equivalence:
(Actually this is only true under certain, but common, circumstances. For details see "Correctness of short-cut fusion").
If you write your list producing functions (including
map
) usingbuild
and your consumers usingfoldr
, then the above equality can remove most intermediate lists. Haskell's list comprehensions are translated into combinations ofbuild
andfoldr
.The downside of this approach is that it cannot handle left folds. Stream Fusion handles this just fine, though. It expresses list producers and transformers as streams (coinductive datatypes, kind of like iterators). The above paper is very readable, so I recommend taking a look.
The "bananas" paper mentioned by gasche goes into more details about kinds of folds and their equivalences.
Finally, there is Bird and Moor's "Algebra of Programming", which mentions transformations such as combining two folds into one.
如果您有兴趣更深入地了解理论,我建议您阅读有关 catamorphisms 的内容, 变形 和 结构性。虽然围绕它的范畴论似乎有点可怕,但这个概念并不难。
变形是使用递归数据结构并产生某种值的函数。变形是给定一些值(一种种子)产生递归数据结构的函数。特别是,其他答案中提到的
foldr
和build
是在列表上构建变形和变形的函数。但这个概念基本上可以应用于任何递归数据结构,例如不同种类的树等。现在,如果您构建具有变形的递归数据结构,然后使用变形来使用它,您将得到所谓的hylomorphism。在这种情况下,您实际上不需要中间结构。您可以跳过创建它和销毁它。这通常称为毁林。
关于
map
:这个函数很有趣,它既是变形又是变形:map
消耗一个列表并产生一些东西;而且map
也会生成一个列表,消耗一些东西。因此您可以查看两个映射
map f 的组成。 map g
作为变形 (map g
) 与变形 (map f
) 的组合,形成水质。所以您知道可以通过不创建中间列表来优化(砍伐森林)。具体来说:您可以通过两种方式编写
map
,一种使用foldr
,另一种使用build
:以及组合
map f (map g zs)
为mapCata f (mapAna g zs)
,经过一些简化并应用foldr cn (build g) == gc n
结果地图(f.g)
。If you're interested going a bit deeper into theory, I suggest you to read something about catamorphisms, anamorphisms and hylomorphisms. While the category theory surrounding it may seem to be a bit scary, the concept isn't that difficult.
Catamorphisms are functions that consume recursive data structures and produce some kind of a value. Anamorphisms are functions that given some value (a kind of a seed) produce recursive data structures. In particular,
foldr
andbuild
mentioned in the other anwers are functions to build catamorphisms and anamorphisms on lists. But this concept can be applied to basically any recursive data structure, such as different kinds of trees etc.Now if you build a recursive data structure with an anamorphism and then consume it with a catamorphism, you get what is called a hylomorphism. In such a case, you actually don't need the intermediate structure. You can skip creating it and destroying it. This is often called deforestation.
Concerning
map
: This function is interesting that it's both a catamorphism and an anamorphism:map
consumes a list and produces something; but alsomap
produces a list, consuming something.So you can view the composition of two maps
map f . map g
as a composition of an anamorphism (map g
) with a catamorphism (map f
), forming a hylomorphism. So you know can optimize (deforest) by not creating the intermediate list.To be specific: You could write
map
in two ways, one usingfoldr
and the other usingbuild
:and the composition
map f (map g zs)
asmapCata f (mapAna g zs)
, which after some simplifications and applyingfoldr c n (build g) == g c n
results inmap (f . g)
.