Haskell - Parsing <p> elements with Parsec

Posted on 2024-08-30 20:30:39

I'm using Text.ParserCombinators.Parsec and Text.XHtml to parse an input like this:

This is the first paragraph example\n
with two lines\n
\n
And this is the second paragraph\n

And my output should be:

<p>This is the first paragraph example\n
with two lines\n</p>
<p>And this is the second paragraph\n</p>

I defined:


line = do
    t <- manyTill anyChar newline
    return t

paragraph = do
    t <- many1 line
    return (p << t)

But it returns:

<p>This is the first paragraph example\n
with two lines\n\n And this is the second paragraph\n</p>

What is wrong? Any ideas?

Thanks!

Comments (2)

只是一片海 2024-09-06 20:30:39

According to the documentation for manyTill, it applies its first argument zero or more times, so two newlines in a row are still valid input and your line parser will not fail on the blank line.

You're probably looking for something like many1Till (by analogy with many1 versus many), but it doesn't seem to exist in the Parsec library, so you may need to roll your own. (Warning: I don't have ghc on this machine, so this is completely untested.)

many1Till p end = do
    first <- p
    rest  <- p `manyTill` end
    return (first : rest)

or a terser way (this needs liftM2 from Control.Monad):

many1Till p end = liftM2 (:) p (p `manyTill` end)
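
One caveat with the helper above: if p is anyChar, the first p will itself accept the newline of a blank line, so the blank line would still not end the paragraph. A minimal sketch of one way around that, assuming it is acceptable to replace anyChar with noneOf "\n"; the names nonEmptyLine, paragraphLines and paragraphs are made up for illustration and the Text.XHtml wrapping is left out:

```haskell
import Text.ParserCombinators.Parsec

-- A non-empty line: at least one non-newline character, then the newline.
-- noneOf "\n" replaces anyChar (an assumption of this sketch), because
-- anyChar would also accept the newline of a blank line.
nonEmptyLine :: Parser String
nonEmptyLine = do
    t <- many1 (noneOf "\n")
    _ <- newline
    return t

-- A paragraph: one or more non-empty lines. many1 stops at a blank line
-- because nonEmptyLine fails there without consuming any input.
paragraphLines :: Parser [String]
paragraphLines = many1 nonEmptyLine

-- A document: paragraphs separated by one or more blank lines.
paragraphs :: Parser [[String]]
paragraphs = do
    ps <- paragraphLines `sepEndBy` many1 newline
    eof
    return ps
```

Running parse paragraphs "" on the input from the question should give one list of lines per paragraph, and each inner list could then be wrapped with p << as in the original paragraph parser. Note that this sketch does not accept blank lines at the very start of the input.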

吾性傲以野 2024-09-06 20:30:39

The manyTill combinator matches zero or more occurrences of its first argument, according to the documentation, so line will happily accept a blank line, which means that many1 line will consume everything up to the final newline in the file, rather than stopping at a double newline as it seems you intended.
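
The behavior described here can be checked with a small experiment. This is only a sketch: line is the parser from the question, while blankResult and allLines are names invented for the demonstration.

```haskell
import Text.ParserCombinators.Parsec

-- The line parser from the question, with the same behavior.
line :: Parser String
line = manyTill anyChar newline

-- On a blank line, manyTill applies anyChar zero times and succeeds with "".
blankResult :: Either ParseError String
blankResult = parse line "" "\nrest\n"

-- As a result, many1 line also swallows the blank separator line.
allLines :: Either ParseError [String]
allLines = parse (many1 line) "" "one\ntwo\n\nthree\n"
```

Here blankResult is Right "" and allLines is Right ["one","two","","three"]: the empty string in the middle is the blank line, which was parsed as an ordinary line instead of ending the paragraph.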
