Parsec parser works fine, but can it be done better?

Posted on 2024-12-11 09:08:15

I try to do this:

Parse a Text in the form:

Some Text #{0,0,0} some Text #{0,0,0}#{0,0,0} more Text #{0,0,0}

into a list of some data structure:

[Inside "Some Text ",Outside (0,0,0),Inside " some Text ",Outside (0,0,0),Outside (0,0,0),Inside " more Text ",Outside (0,0,0)]

So these #{a,b,c} bits should be turned into something different from the rest of the text.

I have this code:

module ParsecTest where

import Text.ParserCombinators.Parsec
import Control.Monad  -- hierarchical name for the Haskell 98 module Monad

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
              deriving (Show)

text :: Parser Transc
text =  do
         x <- manyTill anyChar ((lookAhead reference) <|> (eof >> return (Inside "")));
         return (Inside x)

transc = reference <|> text

alot :: Parser [Transc]
alot = do
        manyTill transc eof

reference :: Parser Transc
reference = try (do{ char '#';
                     char '{';
                     a <- number;
                     char ',';
                     b <- number;
                     char ',';
                     c <- number;
                     char '}';
                     return (Outside (a,b,c)) })

number :: Parser Int
number = do{ x <- many1 digit;
             return (read x) }

This works as expected. You can test this in ghci by typing

parseTest alot "Some Text #{0,0,0} some Text #{0,0,0}#{0,0,0} more Text #{0,0,0}"

But I think it's not nice.

1) Is the use of lookAhead really necessary for my problem?

2) Is the return (Inside "") an ugly hack?

3) Is there generally a more concise/smarter way to achieve the same?

Comments (1)

坚持沉默 2024-12-18 09:08:15

1) I think you do need lookAhead as you need the result of that parse. It would be nice to avoid running that parser twice by having a Parser (Transc,Maybe Transc) to indicate an Inside with an optional following Outside. If performance is an issue, then this is worth doing.

2) Yes.

3) Yes, using the Applicative combinators:

number2 :: Parser Int
number2 = read <$> many1 digit

text2 :: Parser Transc
text2 = (Inside .) . (:)
     <$> anyChar
     <*> manyTill anyChar (try (lookAhead reference2) *> pure () <|> eof)

reference2 :: Parser Transc
reference2 = ((Outside .) .) . (,,)
          <$> (string "#{" *> number2 <* char ',')
          <*> number2
          <*> (char ',' *> number2 <* char '}')

transc2 = reference2 <|> text2

alot2 = many transc2

You may want to rewrite the beginning of reference2 using a helper like aux x y z = Outside (x,y,z).
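
A minimal sketch of that helper-based rewrite, assuming a GHC recent enough that the Prelude exports <$>, <*>, <* and *> (older setups would need Control.Applicative). The name aux comes from the suggestion above; the type definitions are repeated from the question only so that the snippet loads on its own.

```haskell
import Text.ParserCombinators.Parsec

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
              deriving (Show)

number2 :: Parser Int
number2 = read <$> many1 digit

-- Same shape as reference2 in the answer, with the point-free constructor
-- ((Outside .) .) . (,,) replaced by a named helper.
reference2 :: Parser Transc
reference2 = aux
          <$> (string "#{" *> number2 <* char ',')
          <*> number2
          <*> (char ',' *> number2 <* char '}')
  where
    aux x y z = Outside (x, y, z)
```

In ghci, parseTest reference2 "#{1,2,3}" should print Outside (1,2,3). One thing to watch: unlike the question's reference, this version is not wrapped in try. Since string "#{" consumes the '#' before failing on a mismatch, a stray '#' at the start of a text chunk would probably make transc2 fail rather than fall through to text2; writing transc2 = try reference2 <|> text2 is one way to keep the question's behavior.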

EDIT: Changed text to deal with inputs that don't end with an Outside.
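
Point 1 above mentions a Parser (Transc, Maybe Transc) that returns a text chunk together with the reference that ended it, so the reference is parsed only once. A rough sketch of that idea follows. The names chunk, flatten and alot3 are hypothetical (they do not appear in the answer), and the code assumes the question's types and a Prelude that exports <$> and <*.

```haskell
import Data.Maybe (maybeToList)
import Text.ParserCombinators.Parsec

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
              deriving (Show)

number :: Parser Int
number = read <$> many1 digit

-- try: a failed attempt (e.g. a lone '#') gives back the input it consumed.
reference :: Parser Transc
reference = try $ do
  _ <- string "#{"
  a <- number <* char ','
  b <- number <* char ','
  c <- number <* char '}'
  return (Outside (a, b, c))

-- Hypothetical: collect characters until a reference or end of input,
-- returning the text and the terminating reference, if there was one.
chunk :: Parser (Transc, Maybe Transc)
chunk = go []
  where
    go acc =  (do r <- reference
                  return (Inside (reverse acc), Just r))
          <|> (do eof
                  return (Inside (reverse acc), Nothing))
          <|> (do c <- anyChar
                  go (c : acc))

-- Hypothetical: turn one chunk into list items, dropping empty text.
flatten :: (Transc, Maybe Transc) -> [Transc]
flatten (Inside "", r) = maybeToList r
flatten (t, r)         = t : maybeToList r

alot3 :: Parser [Transc]
alot3 = concatMap flatten <$> manyTill chunk eof
```

On the question's sample input, parseTest alot3 should give the same list as alot. The difference is that each reference is parsed once, instead of once under lookAhead and again by transc, and neither lookAhead nor return (Inside "") is needed.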
