HXT: Left-factoring nondeterministic arrows?

Posted 2024-10-03 09:53:27

I'm trying to come to terms with Haskell's XML Toolbox (HXT) and I'm hitting a wall somewhere, because I don't seem to fully grasp arrows as a computational tool.

Here's my problem, which I hoped to illustrate a little better using a GHCi session:

> let parse p = runLA (xread >>> p) "<root><a>foo</a><b>bar</b><c>baz</c></root>"
> :t parse
parse :: LA XmlTree b -> [b]

So parse is a small helper function that applies whatever arrow I give it to the trivial XML document

<root>
  <a>foo</a>
  <b>bar</b>
  <c>baz</c>
</root>

I define another helper function, this time to extract the text below a node with a given name:

> let extract s = getChildren >>> isElem >>> hasName s >>> getChildren >>> getText 
> :t extract
extract :: (ArrowXml cat) =>
   String -> cat (Data.Tree.NTree.TypeDefs.NTree XNode) String
> parse (extract "a" &&& extract "b") -- extract two nodes' content.
[("foo","bar")]

With the help of this function, it's easy to use the &&& combinator to pair up the text of two different nodes, and then, say, pass it to a constructor, like this:

> parse (extract "a" &&& extract "b" >>^ arr (\(a,b) -> (b,a))) 
[("bar","foo")]

Now comes the part I don't understand: I want to left-factor! extract calls getChildren on the root node twice. Instead, I'd like it to call it only once. So I first get the children of the root node:

> let extract' s = hasName s >>> getChildren >>> getText
> :t extract'
extract' :: (ArrowXml cat) => String -> cat XmlTree String
> parse (getChildren >>> isElem >>> (extract' "a" &&& extract' "b"))
[]

Note that I've tried re-ordering the calls to isElem and so on to find out whether that's the issue. But as it stands, I just don't have any idea why this isn't working. There is an arrow 'tutorial' on the Haskell wiki, and the way I understood it, it should be possible to do what I want that way: namely, use &&& to pair up the results of two computations.

It does work, too, but only at the start of the arrow chain, not midway through, when I already have some results that I want to keep 'shared.' I have the feeling that I just can't wrap my head around the difference in ideas between normal function composition and arrow notation. I'd be very appreciative of any pointers! (Even if it is just to some generic arrow tutorial that goes a little more in-depth than the one on the Haskell wiki.)

Thank you!

负佳期 2024-10-10 09:53:27

If you convert the arrow to (and then from) a deterministic version this works as expected:

> let extract' s = unlistA >>> hasName s >>> getChildren >>> getText
> parse (listA (getChildren >>> isElem) >>> (extract' "a" &&& extract' "b"))
[("foo","bar")]

This isn't really satisfactory, though, and I can't remember off the top of my head why (&&&) behaves this way with a nondeterministic arrow (I'd personally use the proc/do notation for anything much more complicated than this).
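
One possible way to see why (&&&) behaves like this is to model list arrows with `Kleisli []` from base instead of HXT itself. The sketch below is only an illustration under that assumption: `Node`, `children`, `named`, `text`, `collect` and `spread` are hypothetical stand-ins for `XmlTree`, `getChildren`, `hasName`, `getText`, `listA` and `unlistA`, not HXT's own definitions. In this model, `f &&& g` feeds the same input to both arrows and pairs every result of `f` with every result of `g`. Once `children` has already fanned out, each single child is offered to both `extract' "a"` and `extract' "b"`, and no child is named both "a" and "b", so every pairing is empty.

```haskell
{-# LANGUAGE Arrows #-}
import Control.Arrow (Kleisli (..), returnA, (&&&), (<<<), (>>>))

-- Toy stand-in for an XML tree (hypothetical, not HXT's XmlTree).
data Node = Elem String [Node] | Text String

-- Kleisli [] a b wraps a function a -> [b], much like a list arrow.
type ListArrow = Kleisli []

children :: ListArrow Node Node
children = Kleisli $ \n -> case n of
  Elem _ cs -> cs
  Text _    -> []

named :: String -> ListArrow Node Node
named s = Kleisli $ \n -> case n of
  Elem t _ | t == s -> [n]
  _                 -> []

text :: ListArrow Node String
text = Kleisli $ \n -> case n of
  Text t -> [t]
  _      -> []

-- Rough analogues of listA and unlistA.
collect :: ListArrow a b -> ListArrow a [b]
collect (Kleisli f) = Kleisli (\x -> [f x])

spread :: ListArrow [a] a
spread = Kleisli id

doc :: Node
doc = Elem "root"
  [Elem "a" [Text "foo"], Elem "b" [Text "bar"], Elem "c" [Text "baz"]]

extract, extract' :: String -> ListArrow Node String
extract  s = children >>> named s >>> children >>> text
extract' s = named s >>> children >>> text

-- Both branches start from the root, so each finds its own element.
pairTwice :: ListArrow Node (String, String)
pairTwice = extract "a" &&& extract "b"

-- Both branches see one child at a time; no child matches both names.
pairNaive :: ListArrow Node (String, String)
pairNaive = children >>> (extract' "a" &&& extract' "b")

-- Share the children as a single list value, then fan out per branch.
pairShared :: ListArrow Node (String, String)
pairShared =
  collect children >>> ((spread >>> extract' "a") &&& (spread >>> extract' "b"))

-- The same sharing written in proc/do notation.
pairProc :: ListArrow Node (String, String)
pairProc = proc root -> do
  cs <- collect children -< root
  a  <- extract' "a" <<< spread -< cs
  b  <- extract' "b" <<< spread -< cs
  returnA -< (a, b)

main :: IO ()
main = do
  print (runKleisli pairTwice doc)   -- [("foo","bar")]
  print (runKleisli pairNaive doc)   -- []
  print (runKleisli pairShared doc)  -- [("foo","bar")]
  print (runKleisli pairProc doc)    -- [("foo","bar")]
```

If this model matches HXT's LA closely enough, the empty result in the question is the pairing rule at work rather than a parsing problem, and the listA/unlistA trick works because the whole child list travels through (&&&) as one value.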

UPDATE: There seems to be something weird going on here with runLA and xread. If you use runX and readString everything works as expected:

> let xml = "<root><a>foo</a><b>bar</b><c>baz</c></root>"
> let parse p = runX (readString [] xml >>> p)
> let extract' s = getChildren >>> hasName s >>> getChildren >>> getText
> parse (getChildren >>> isElem >>> (extract' "a" &&& extract' "b"))
[("foo","bar")]

This means you have to run the parser in the IO monad, but there are advantages to using runX anyway (better error messages, etc.).
