Haskell 中的变形和树遍历

发布于 2024-10-07 18:38:13 字数 1480 浏览 0 评论 0原文

我很不耐烦,期待了解变形论 与这个SO问题相关:)

我只练习了Real World Haskell教程的开始部分。所以,也许我现在要求太多了,如果是这样的话,请告诉我我应该学习的概念。

下面,我引用了维基百科的catamorphism代码示例

我想知道您对下面的foldTree的看法,这是一种遍历树的方式,与其他SO问答相比,也处理遍历树n 叉树遍历. (无论是否是二进制,我认为可以编写下面的变形来管理n元树)

我将我理解的内容发表评论,如果你能纠正我并澄清一些事情,我会很高兴。

{-this is a binary tree definition-}
data Tree a = Leaf a
            | Branch (Tree a) (Tree a)

{-I dont understand the structure between{} 
however it defines two morphisms, leaf and branch 
leaf take an a and returns an r, branch takes two r and returns an r-} 
data TreeAlgebra a r = TreeAlgebra { leaf   :: a      -> r
                                   , branch :: r -> r -> r }

{- foldTree is a morphism that takes: a TreeAlgebra for Tree a with result r, a Tree a
and returns an r -} 
foldTree :: TreeAlgebra a r -> Tree a -> r
foldTree a@(TreeAlgebra {leaf   = f}) (Leaf   x  ) = f x
foldTree a@(TreeAlgebra {branch = g}) (Branch l r) = g (foldTree a l) (foldTree a r)

在这一点上我遇到了很多困难,我似乎猜测态射叶 将应用于任何叶子 但为了真正使用此代码,需要向 FoldTree 提供定义的 TreeAlgebra, 一个具有定义的态射叶子以便做某事的树代数?
但在这种情况下,在 FoldTree 代码中,我期望 {f = leaf} 而不是相反,

非常欢迎您的任何澄清。

I am impatient, looking forward to understanding catamorphism related to this SO question :)

I have only practiced the beginning of Real World Haskell tutorial. So, Maybe I'm gonna ask for way too much right now, if it was the case, just tell me the concepts I should learn.

Below, I quote the wikipedia code sample for catamorphism.

I would like to know your opinion about foldTree below, a way of traversing a Tree, compared to this other SO question and answer, also dealing with traversing a Tree n-ary tree traversal. (independantly from being binary or not, I think the catamorphism below can be written so as to manage n-ary tree)

I put in comment what I understand, and be glad if you could correct me, and clarify some things.

{-this is a binary tree definition-}
data Tree a = Leaf a
            | Branch (Tree a) (Tree a)

{-I dont understand the structure between{} 
however it defines two morphisms, leaf and branch 
leaf take an a and returns an r, branch takes two r and returns an r-} 
data TreeAlgebra a r = TreeAlgebra { leaf   :: a      -> r
                                   , branch :: r -> r -> r }

{- foldTree is a morphism that takes: a TreeAlgebra for Tree a with result r, a Tree a
and returns an r -} 
foldTree :: TreeAlgebra a r -> Tree a -> r
foldTree a@(TreeAlgebra {leaf   = f}) (Leaf   x  ) = f x
foldTree a@(TreeAlgebra {branch = g}) (Branch l r) = g (foldTree a l) (foldTree a r)

at this point I am having many difficulties, I seem to guess that the morphism leaf
will be applied to any Leaf
But so as to use this code for real, foldTree needs to be fed a defined TreeAlgebra,
a TreeAlgebra that has a defined morphism leaf so as to do something ?
but in this case in the foldTree code I would expect {f = leaf} and not the contrary

Any clarification from you would be really welcome.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

转身泪倾城 2024-10-14 18:38:13

不太确定你在问什么。但是,是的,您将一个 TreeAlgebra 提供给 foldTree ,对应于您想要在树上执行的计算。例如,要对 Int 树中的所有元素求和,您可以使用以下代数:

sumAlgebra :: TreeAlgebra Int Int
sumAlgebra = TreeAlgebra { leaf = id
                         , branch = (+) }

这意味着,要获得叶子的总和,请应用 id (不执行任何操作) ) 到叶子中的值。要获得分支的总和,请将每个子分支的总和相加。

事实上,我们可以用 (+) 来表示分支,而不是 \xy -> sumTree x + sumTree y 是变形的本质属性。它表示要在某些递归数据结构上计算某个函数 f ,只需为其直接子级提供 f 的值即可。

Haskell 是一种非常独特的语言,因为我们可以抽象地形式化变形论的概念。让我们为树中的单个节点创建一个数据类型,并对其子节点进行参数化:

data TreeNode a child
    = Leaf a
    | Branch child child

看看我们在那里做了什么?我们只是用我们选择的类型替换了递归子项。这样我们就可以在折叠时将子树的和放在那里。

现在来说说真正神奇的事情。我将用pseudohaskell 编写它——用真正的Haskell 编写它是可能的,但是我们必须添加一些注释来帮助类型检查器,这可能会有点令人困惑。我们采用参数化数据类型的“不动点”——即构造一个数据类型T,使得T = TreeNode a T。他们称这个运算符为Mu

type Mu f = f (Mu f)

仔细看这里。 Mu 的参数不是类型,例如 IntFoo ->酒吧。它是一个类型构造函数,例如MaybeTreeNode Int - Mu 的参数本身带有一个参数。 (对类型构造函数进行抽象的可能性是 Haskell 类型系统在表达能力方面真正脱颖而出的原因之一)。

因此,类型 Mu f 被定义为采用 f 并用 Mu f 本身填充其类型参数。我将定义一个同义词来减少一些噪音:

type IntNode = TreeNode Int

扩展 Mu IntNode,我们得到:

Mu IntNode = IntNode (Mu IntNode)
           = Leaf Int | Branch (Mu IntNode) (Mu IntNode)

你知道 Mu IntNode 与你的 Tree 有何等价吗整数?我们刚刚将递归结构拆散,然后使用 Mu 将其重新组合在一起。这给我们带来了一个优势,我们可以同时讨论所有 Mu 类型。这为我们提供了定义变形现象所需的信息。

让我们定义一下:

type IntTree = Mu IntNode

我说过变形的基本属性是为了计算某个函数 f,只需为其直接子函数提供 f 的值即可。让我们将我们尝试计算的事物的类型称为r,以及数据结构nodeIntNode可能是它的实例化) 。因此,要计算特定节点上的 r,我们需要将该节点及其子节点替换为其 r。该计算的类型为node r -> r。因此,变形说,如果我们有这些计算之一,那么我们可以计算整个递归结构的 r (记住递归在这里用 Mu< /code>):

cata :: (node r -> r) -> Mu node -> r

对于我们的示例来说,这看起来像:

cata :: (IntNode r -> r) -> IntTree -> r

重申一下,如果我们可以采用一个带有 r 的节点作为其子节点并计算 r,那么我们可以计算整个树的r

为了实际计算这个,我们需要 node 成为一个 Functor ——也就是说,我们需要能够将任意函数映射到节点的子节点上。

fmap :: (a -> b) -> node a -> node b

对于IntNode 可以直接完成此操作。

fmap f (Leaf x) = Leaf x                  -- has no children, so stays the same
fmap f (Branch l r) = Branch (f l) (f r)  -- apply function to each child

现在,终于,我们可以给出 cata 的定义(Functor node 约束只是说 node 有一个合适的fmap):

cata :: (Functor node) => (node r -> r) -> Mu node -> r
cata f t = f (fmap (cata f) t)

我使用参数名称t作为“tree”的助记符值。这是一个抽象、密集的定义,但实际上非常简单。它表示:对 t 的每个子节点(它们本身就是 Mu 节点< /code>s) 获取节点 r,然后将该结果传递给 f 计算 t 本身的结果。

回到开头,您定义的代数本质上是定义节点 r -> 的一种方式。 r 函数。事实上,给定一个TreeAlgebra,我们可以很容易地得到折叠函数:

foldFunction :: TreeAlgebra a r -> (TreeNode a r -> r)
foldFunction alg (Leaf a) = leaf alg a
foldFunction alg (Branch l r) = branch alg l r

因此,树的变形可以用我们的通用函数来定义,如下:

type Tree a = Mu (TreeNode a)

treeCata :: TreeAlgebra a r -> (Tree a -> r)
treeCata alg = cata (foldFunction alg)

我没时间了。我知道这很快就变得非常抽象,但我希望它至少给你一个新的观点来帮助你的学习。祝你好运!

Not exactly sure what you're asking. But yeah, you feed a TreeAlgebra to foldTree corresponding to the computation you want to perform on the tree. For example, to sum all the elements in a tree of Ints you would use this algebra:

sumAlgebra :: TreeAlgebra Int Int
sumAlgebra = TreeAlgebra { leaf = id
                         , branch = (+) }

Which means, to get the sum of a leaf, apply id (do nothing) to the value in the leaf. To get the sum of a branch, add together the sums of each of the children.

The fact that we can say (+) for branch instead of, say, \x y -> sumTree x + sumTree y is the essential property of the catamorphism. It says that to compute some function f on some recursive data structure it suffices to have the values of f for its immediate children.

Haskell is a pretty unique language in that we can formalize the idea of catamorphism abstractly. Let's make a data type for a single node in your tree, parameterized over its children:

data TreeNode a child
    = Leaf a
    | Branch child child

See what we did there? We just replaced the recursive children with a type of our choosing. This is so that we can put the subtrees' sums there when we are folding.

Now for the really magical thing. I'm going to write this in pseudohaskell -- writing it in real Haskell is possible, but we have to add some annotations to help the typechecker which can be kind of confusing. We take the "fixed point" of a parameterized data type -- that is, constructing a data type T such that T = TreeNode a T. They call this operator Mu.

type Mu f = f (Mu f)

Look carefully here. The argument to Mu isn't a type, like Int or Foo -> Bar. It's a type constructor like Maybe or TreeNode Int -- the argument to Mu itself takes an argument. (The possibility of abstracting over type constructors is one of the things that makes Haskell's type system really stand out in its expressive power).

So the type Mu f is defined as taking f and filling in its type parameter with Mu f itself. I'm going to define a synonym to reduce some of the noise:

type IntNode = TreeNode Int

Expanding Mu IntNode, we get:

Mu IntNode = IntNode (Mu IntNode)
           = Leaf Int | Branch (Mu IntNode) (Mu IntNode)

Do you see how Mu IntNode is equivalent to your Tree Int? We have just torn the recursive structure apart and then used Mu to put it back together again. This gives us the advantage that we can talk about all Mu types at once. This gives us what we need to define a catamorphism.

Let's define:

type IntTree = Mu IntNode

I said the essential property of the catamorphism is that to compute some function f, it suffices to have the values of f for its immediate children. Let's call the type of the thing we are trying to compute r, and the data structure node (IntNode would be a possible instantiation of this). So to compute r on a particular node, we need the node with its children replaced with their rs. This computation has type node r -> r. So a catamorphism says that if we have one of these computations, then we can compute r for the entire recursive structure (remember recursion is denoted explicitly here with Mu):

cata :: (node r -> r) -> Mu node -> r

Making this concrete for our example, this looks like:

cata :: (IntNode r -> r) -> IntTree -> r

Restating, if we can take a node with rs for its children and compute an r, then we can compute an r for an entire tree.

In order to actually compute this, we need node to be a Functor -- that is we need to be able to map an arbitrary function over the children of a node.

fmap :: (a -> b) -> node a -> node b

This can be done straightforwardly for IntNode.

fmap f (Leaf x) = Leaf x                  -- has no children, so stays the same
fmap f (Branch l r) = Branch (f l) (f r)  -- apply function to each child

Now, finally, we can give a definition for cata (the Functor node constraint just says that node has a suitable fmap):

cata :: (Functor node) => (node r -> r) -> Mu node -> r
cata f t = f (fmap (cata f) t)

I used the parameter name t for the mnemonic value of "tree". This is an abstract, dense definition, but it is really very simple. It says: recursively perform cata f -- the computation we are doing over the tree -- on each of t's children (which are themselves Mu nodes) to get a node r, and then pass that result to f compute the result for t itself.

Tying this back to the beginning, the algebra you are defining is essentially a way of defining that node r -> r function. Indeed, given a TreeAlgebra, we can easily get the fold function:

foldFunction :: TreeAlgebra a r -> (TreeNode a r -> r)
foldFunction alg (Leaf a) = leaf alg a
foldFunction alg (Branch l r) = branch alg l r

Thus the tree catamorphism can be defined in terms of our generic one as follows:

type Tree a = Mu (TreeNode a)

treeCata :: TreeAlgebra a r -> (Tree a -> r)
treeCata alg = cata (foldFunction alg)

I'm out of time. I know that got really abstract really fast, but I hope it at least gave you a new viewpoint to help your learning. Good luck!

燃情 2024-10-14 18:38:13

我认为您是在问有关 {} 的问题。有一个较早的问题对 {} 进行了很好的讨论。这些称为 Haskell 的记录语法。另一个问题是为什么要构建代数。这是一个典型的函数范例,您可以将数据概括为函数。

最著名的例子是教会对自然的构建,其中f = + 1 和 z = 0,
0 = z
1 = fz,
2 = f (fz),
3 = f(f(fz)),
等等...

您所看到的本质上与应用于树的想法相同。以教堂为例,树就会点击。

I think you were were asking a question about the {}'s. There is an earlier question with a good discussion of {}'s. Those are called Haskell's record syntax. The other question is why construct the algebra. This is a typical function paradigm where you generalize data as functions.

The most famous example is Church's construction of the Naturals, where f = + 1 and z = 0,
0 = z,
1 = f z,
2 = f (f z),
3 = f (f (f z)),
etc...

What you are seeing is essentially the same idea being applied to a tree. Work the church example and the tree will click.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文