Parsec parser works fine, but could it be done better?
I am trying to do this:
Parse a text of the form:
    Some Text #{0,0,0} some Text #{0,0,0}#{0,0,0} more Text #{0,0,0}
into a list of some data structure:
    [Inside "Some Text ",Outside (0,0,0),Inside " some Text ",Outside (0,0,0),Outside (0,0,0),Inside " more Text ",Outside (0,0,0)]
So the #{a,b,c} bits should become something different from the rest of the text.
I have this code:
    module ParsecTest where

    import Text.ParserCombinators.Parsec
    import Monad

    type Reference = (Int, Int, Int)

    data Transc = Inside String | Outside Reference
                  deriving (Show)

    text :: Parser Transc
    text = do
             x <- manyTill anyChar ((lookAhead reference) <|> (eof >> return (Inside "")));
             return (Inside x)

    transc = reference <|> text

    alot :: Parser [Transc]
    alot = do
            manyTill transc eof

    reference :: Parser Transc
    reference = try (do{ char '#';
                         char '{';
                         a <- number;
                         char ',';
                         b <- number;
                         char ',';
                         c <- number;
                         char '}';
                         return (Outside (a,b,c)) })

    number :: Parser Int
    number = do{ x <- many1 digit;
                 return (read x) }
This works as expected. You can test it in ghci by typing
    parseTest alot "Some Text #{0,0,0} some Text #{0,0,0}#{0,0,0} more Text #{0,0,0}"
But I think it's not nice.
1) Is the use of lookAhead really necessary for my problem?
2) Is the return (Inside "") an ugly hack?
3) Is there generally a more concise/smarter way to achieve the same?
Comments (1)
1) I think you do need lookAhead, as you need the result of that parse. It would be nice to avoid running that parser twice by having a Parser (Transc,Maybe Transc) to indicate an Inside with an optional following Outside. If performance is an issue, then this is worth doing.
2) Yes.
3) Applicatives
You may want to rewrite the beginning of reference2 using a helper like aux x y z = Outside (x,y,z).
EDIT: Changed text to deal with inputs that don't end with an Outside.
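As one possible reading of point 3, here is a minimal sketch in applicative style. It is not the answerer's code. The names reference, text, alot, number and the Transc type come from the question, and aux follows the helper suggested above. The use of notFollowedBy in place of lookAhead plus return (Inside ""), the added Eq instance, and the Text.Parsec imports (parsec 3 with a recent GHC) are assumptions of this sketch.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show, Eq)  -- Eq is an addition of this sketch, handy for testing

-- One or more digits, read as an Int.
number :: Parser Int
number = read <$> many1 digit

-- Helper in the spirit of the answer's aux.
aux :: Int -> Int -> Int -> Transc
aux x y z = Outside (x, y, z)

-- Parses #{a,b,c}; try ensures a failed attempt consumes no input.
reference :: Parser Transc
reference = try $
  aux <$> (string "#{" *> number)
      <*> (char ',' *> number)
      <*> (char ',' *> number <* char '}')

-- A non-empty run of characters, stopping where a reference begins.
-- notFollowedBy is a negative lookahead, so the reference parser is still
-- attempted at each position; only the empty-Inside workaround goes away.
text :: Parser Transc
text = Inside <$> many1 (notFollowedBy reference *> anyChar)

-- The whole input: any mix of references and text, then end of input.
alot :: Parser [Transc]
alot = many (reference <|> text) <* eof
```

Loaded in ghci, parseTest alot applied to the sample string from the question prints the list shown there. A lone # that does not start a well-formed reference is kept as ordinary text, and input that ends in plain text is accepted because text no longer needs a reference or eof to terminate it.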