Parsing a specific string in Haskell
I'm using the parsec Haskell library.
I want to parse strings of the following kind:
[[v1]][[v2]]
xyz[[v1]][[v2]]
[[v1]]xyz[[v2]]
etc.
I'm interested in collecting only the values v1 and v2 and storing them in a data structure.
I tried with the following code:
import Text.ParserCombinators.Parsec
quantifiedVars = sepEndBy var (string "]]")
var = between (string "[[") (string "") (many (noneOf "]]"))
parseSL :: String -> Either ParseError [String]
parseSL input = parse quantifiedVars "(unknown)" input
main = do {
  c <- getContents;
  case parse quantifiedVars "(stdin)" c of {
    Left e -> do { putStrLn "Error parsing input:"; print e; };
    Right r -> do { putStrLn "ok"; mapM_ print r; };
  }
}
With this code, if the input is "[[v1]][[v2]]", the program works fine and returns the following output:
"v1"
"v2"
If the input is "xyz[[v1]][[v2]]", the program doesn't work. In particular, I want only what is contained in [[...]], ignoring "xyz".
I also want to store the content of each [[...]] in a data structure.
How can I solve this problem?
Comments (1)
You need to restructure your parser. You are using combinators in very strange places, and they mess things up.
A var is a varName between "[[" and "]]", so write exactly that.
A varName should have some kind of format (I don't think you want to accept "%A¤%&", do you?), so you should write a parser for it; but if it really can be anything, just accept every character up to the closing brackets.
Then a text containing vars is a sequence of vars separated by non-vars, where someText is anything except a '['.
Things get more complicated if you want input containing single brackets to be parseable; then you need a better parser for varName and someText.
PS. (<*>), (*>) and (<$>) are from Control.Applicative.
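The restructuring described in this answer can be sketched roughly as follows. This is a minimal sketch, not the answerer's original code, which did not survive on this page. The names var, varName and someText come from the answer; varText and the trailing eof are assumptions added for illustration, as is the choice to treat a name as any non-empty run of characters other than ']'.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- A variable is a name wrapped in double square brackets.
var :: Parser String
var = between (string "[[") (string "]]") varName

-- Assumption: a name is any non-empty run of characters other than ']'.
varName :: Parser String
varName = many1 (noneOf "]")

-- Filler text between variables: any (possibly empty) run of
-- characters other than '['.
someText :: Parser String
someText = many (noneOf "[")

-- Hypothetical top-level parser: skip leading filler, then collect
-- variables separated (and optionally followed) by filler, and
-- require that the whole input was consumed.
varText :: Parser [String]
varText = someText *> (var `sepEndBy` someText) <* eof

-- Same signature as in the question; the result list is the
-- "data structure" holding the collected names.
parseSL :: String -> Either ParseError [String]
parseSL = parse varText "(unknown)"
```

With these definitions, parseSL "xyz[[v1]][[v2]]" and parseSL "[[v1]]xyz[[v2]]" both yield Right ["v1","v2"]. Input with a lone '[' in the filler text, such as "a[b", is rejected, which is the case where a more careful someText and varName would be needed.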