Parsing a specific string in Haskell

Posted on 2025-01-05 01:09:27 · 962 words · 0 views · 0 comments

I'm using the parsec Haskell library.

I want to parse strings of the following kind:

[[v1]][[v2]]

xyz[[v1]][[v2]]

[[v1]]xyz[[v2]]

etc.

I'm interested in collecting only the values v1 and v2, and storing them in a data structure.

I tried with the following code:

import Text.ParserCombinators.Parsec

quantifiedVars = sepEndBy var (string "]]")
var = between (string "[[") (string "") (many (noneOf "]]"))

parseSL :: String -> Either ParseError [String]
parseSL input = parse quantifiedVars "(unknown)" input

main = do {
   c <- getContents;
   case parse quantifiedVars "(stdin)" c of {
      Left e -> do { putStrLn "Error parsing input:"; print e; };
      Right r -> do{ putStrLn "ok"; mapM_ print r; };
   }
}

In this way, if the input is "[[v1]][[v2]]" the program works fine, returning the following output:

"v1"

"v2"

If the input is "xyz[[v1]][[v2]]" the program doesn't work. In particular, I want only what is contained in [[...]], ignoring "xyz".

Also, I want to store the content of [[...]] in a data structure.

How do you solve this problem?

Comments (1)

半步萧音过轻尘 2025-01-12 01:09:27

You need to restructure your parser. You are using combinators in very strange locations, and they mess things up.

A var is a varName between "[[" and "]]". So, write that:

var = between (string "[[") (string "]]") varName

A varName should have some kind of format (I don't think that you want to accept "%A¤%&", do you?), so you should make a parser for that; but in case it really can be anything, just do this:

varName = many $ noneOf "]"
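If the names do follow a fixed format, a stricter varName could look like the sketch below. The identifier format (a letter or underscore followed by letters, digits, or underscores) is only an assumption for illustration, as is the parseVar helper; neither comes from the question.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumed format (hypothetical): a letter or underscore,
-- followed by any number of letters, digits, or underscores.
varName :: Parser String
varName = (:) <$> (letter <|> char '_') <*> many (alphaNum <|> char '_')

-- A var is a varName between "[[" and "]]".
var :: Parser String
var = between (string "[[") (string "]]") varName

-- Hypothetical helper: parse exactly one variable and nothing else.
parseVar :: String -> Either ParseError String
parseVar = parse (var <* eof) "(input)"
```

With this definition, parseVar "[[v1]]" should give Right "v1", while something like "[[%A¤%&]]" is rejected with a parse error.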

Then, a text containing vars is a sequence of vars separated by non-var text.

varText = someText *> var `sepEndBy` someText

... where someText is anything except a '[':

someText = many $ noneOf "["
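Put together, the pieces above might be assembled as in the following sketch. The type signatures and the extractVars helper are additions for illustration (the name is hypothetical), and the trailing eof is an assumption that the whole input should be consumed.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Anything up to the first ']'.
varName :: Parser String
varName = many (noneOf "]")

-- A var is a varName between "[[" and "]]".
var :: Parser String
var = between (string "[[") (string "]]") varName

-- Text outside variables: anything except '['.
someText :: Parser String
someText = many (noneOf "[")

-- Skip leading text, then collect vars separated (and optionally
-- followed) by more non-var text.
varText :: Parser [String]
varText = someText *> (var `sepEndBy` someText)

-- Hypothetical helper: run the parser over a whole input string.
extractVars :: String -> Either ParseError [String]
extractVars = parse (varText <* eof) "(input)"
```

For example, extractVars "xyz[[v1]][[v2]]" and extractVars "[[v1]]xyz[[v2]]" should both give Right ["v1","v2"]. The result is an ordinary list of strings, which can then be stored in whatever data structure is needed. Note that this simple version reports a parse error on input containing a lone '[' outside a variable.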

Things get more complicated if you want this to be parseable:

bla bla [ bla bla [[somevar]blabla]]

Then you need a better parser for varName and someText:

varName = concat <$> many (try incompleteTerminator <|> many1 (noneOf "]"))

-- Parses e.g. "]a"
incompleteTerminator = (\ a b -> [a, b]) <$> char ']' <*> noneOf "]"

someText = concat <$> many (try incompleteInitiator <|> many1 (noneOf "["))

-- Parses e.g. "[b"
incompleteInitiator = (\ a b -> [a, b]) <$> char '[' <*> noneOf "["

PS. (<*>), (*>) and (<$>) are from Control.Applicative.
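For reference, here is a self-contained sketch of the more tolerant variant, which accepts single brackets both outside and inside a variable. The type signatures and the extractVars helper are additions for illustration (the name is hypothetical); on recent GHC versions the Applicative operators come from the Prelude, so no extra import should be needed.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Parses e.g. "]a": a ']' that is not followed by another ']'.
incompleteTerminator :: Parser String
incompleteTerminator = (\a b -> [a, b]) <$> char ']' <*> noneOf "]"

-- A variable name may contain single ']' characters, but not "]]".
varName :: Parser String
varName = concat <$> many (try incompleteTerminator <|> many1 (noneOf "]"))

-- Parses e.g. "[b": a '[' that is not followed by another '['.
incompleteInitiator :: Parser String
incompleteInitiator = (\a b -> [a, b]) <$> char '[' <*> noneOf "["

-- Text outside variables may contain single '[' characters, but not "[[".
someText :: Parser String
someText = concat <$> many (try incompleteInitiator <|> many1 (noneOf "["))

var :: Parser String
var = between (string "[[") (string "]]") varName

varText :: Parser [String]
varText = someText *> (var `sepEndBy` someText)

-- Hypothetical helper: run the parser over a whole input string.
extractVars :: String -> Either ParseError [String]
extractVars = parse (varText <* eof) "(input)"
```

With these definitions, extractVars "bla bla [ bla bla [[somevar]blabla]]" should give Right ["somevar]blabla"]. The try is what lets the parser back off when a ']' or '[' turns out to be the first half of a real delimiter.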
