Parsing text with optional data at the end

Posted on 2024-10-17 12:16:23

Please note: after posting this question I managed to work out a solution myself. See the end of this question for my final answer.


I'm working on a little parser at the moment for org-mode documents. In these documents a heading has a title and may optionally include a list of tags at the end of the heading:

* Heading          :foo:bar:baz:

I'm having difficulty writing a parser for this, however. The following is what I'm working with for now:

import Control.Applicative
import Text.ParserCombinators.Parsec

data Node = Node String [String]
            deriving (Show)

myTest = parse node "" "Some text here :tags:here:"

node = Node <$> (many1 anyChar) <*> tags

tags = (char ':') >> (sepEndBy1 (many1 alphaNum) (char ':'))
   <?> "Tag list"

While my simple tags parser works on its own, it doesn't work in the context of node, because many1 anyChar uses up all of the characters while parsing the title of the heading. Furthermore, I can't change this parser to use noneOf ":" because : is valid in the title. In fact, it's only special when it's in a tag list at the very end of the line.
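To make the failure concrete, here is a minimal sketch of the problem described above. The names tagList, greedyNode, and fails are made up for this illustration, and it assumes the Parsec 3 module layout (Text.Parsec); it is not the code from the question, only the same shape.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Tag list such as ":foo:bar:" (same shape as the question's `tags`).
tagList :: Parser [String]
tagList = char ':' *> (many1 alphaNum `sepEndBy1` char ':') <?> "Tag list"

-- Greedy title: `many1 anyChar` runs to the end of the input and never
-- gives characters back, so `tagList` always starts at end of input.
greedyNode :: Parser (String, [String])
greedyNode = (,) <$> many1 anyChar <*> tagList

-- True when the parser rejects the input (hypothetical helper).
fails :: Parser a -> String -> Bool
fails p = either (const True) (const False) . parse p ""

tagsAlone :: Either ParseError [String]
tagsAlone = parse tagList "" ":tags:here:"

greedyFails :: Bool
greedyFails = fails greedyNode "Some text here :tags:here:"
```

In GHCi, tagsAlone evaluates to Right ["tags","here"], while greedyFails is True: Parsec does not backtrack into many1 anyChar to hand characters back to the tag parser.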

Any ideas how I can parse this optional data?

As an aside, this is my first real Haskell project, so if Parsec is not even the right tool for the job - feel free to point that out and suggest other options!


Ok, I got a complete solution now, but it needs refactoring. The following works:

import Control.Applicative hiding (many, optional, (<|>))
import Control.Monad
import Data.Char (isSpace)
import Text.ParserCombinators.Parsec

data Node = Node { level :: Int, keyword :: Maybe String, heading :: String, tags :: Maybe [String] }
   deriving (Show)

parseNode = Node <$> level <*> (optionMaybe keyword) <*> name <*> (optionMaybe tags)
    where level = length <$> many1 (char '*') <* space
          keyword = (try (many1 upper <* space))
          -- the title runs to end of input, or up to a tag list that itself ends the input
          name = noneOf "\n" `manyTill` (eof <|> (lookAhead (try (tags *> eof))))
          tags = char ':' *> many1 alphaNum `sepEndBy1` char ':'

myTest = parse parseNode "org-mode" "** Some : text here :tags: JUST KIDDING     :tags:here:"
myTest2 = parse parseNode "org-mode" "* TODO Just a node"
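Since the solution is said to need refactoring, here is one possible tidy-up, offered as a sketch rather than a definitive version. It keeps the same lookahead idea but gives the pieces top-level names and type signatures. The names tagList and titleEnd are invented here, the Parsec 3 module layout is assumed, and trimming trailing whitespace from the heading with dropWhileEnd is an extra assumption that the original code does not make.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Text.Parsec
import Text.Parsec.String (Parser)

data Node = Node
  { level   :: Int
  , keyword :: Maybe String
  , heading :: String
  , tags    :: Maybe [String]
  } deriving (Show, Eq)

-- ":foo:bar:" style list; each tag is one or more alphanumerics.
tagList :: Parser [String]
tagList = char ':' *> (many1 alphaNum `sepEndBy1` char ':')

-- Succeeds, consuming nothing, where the title stops: at end of input
-- or just before a tag list that runs to end of input.
titleEnd :: Parser ()
titleEnd = eof <|> lookAhead (try (tagList *> eof))

parseNode :: Parser Node
parseNode = Node
  <$> (length <$> many1 (char '*') <* space)
  <*> optionMaybe (try (many1 upper <* space))
  -- assumption: trailing whitespace is not part of the heading
  <*> (dropWhileEnd isSpace <$> manyTill (noneOf "\n") titleEnd)
  <*> optionMaybe tagList

myTest, myTest2 :: Either ParseError Node
myTest  = parse parseNode "org-mode" "** Some : text here :tags: JUST KIDDING     :tags:here:"
myTest2 = parse parseNode "org-mode" "* TODO Just a node"
```

With this sketch, myTest yields level 2, no keyword, the heading "Some : text here :tags: JUST KIDDING", and the tags ["tags","here"]; myTest2 yields level 1, keyword "TODO", the heading "Just a node", and no tags.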


Comments (1)

遮了一弯 2024-10-24 12:16:23
import Control.Applicative hiding (many, optional, (<|>))
import Control.Monad
import Text.ParserCombinators.Parsec

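-- This orphan instance is for Parsec 2. Parsec 3 ships its own
-- Applicative instance, so drop these three lines there.
instance Applicative (GenParser s a) where
  pure = return
  (<*>) = ap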

data Node = Node { name :: String, tags :: Maybe [String] }
  deriving (Show)

parseNode = Node <$> name <*> tags
  where tags = optionMaybe $ optional (string " :") *> many (noneOf ":\n") `sepEndBy` (char ':')
        name = noneOf "\n" `manyTill` try (string " :" <|> string "\n")

myTest = parse parseNode "" "Some:text here :tags:here:"
myTest2 = parse parseNode "" "Sometext here :tags:here:"

Results:

*Main> myTest
Right (Node {name = "Some:text here", tags = Just ["tags","here",""]})
*Main> myTest2
Right (Node {name = "Sometext here", tags = Just ["tags","here",""]})
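The trailing "" in both results comes from the tag item many (noneOf ":\n"), which also succeeds on the empty remainder after the final colon. As a small sketch (the names looseTags and strictTags are invented for this illustration, and the Parsec 3 module layout is assumed), requiring at least one character per tag with many1 avoids the empty entry:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Permissive item, as in the answer: a tag may be the empty string.
looseTags :: Parser [String]
looseTags = many (noneOf ":\n") `sepEndBy` char ':'

-- Stricter variant (an assumption, not the answer's code): every tag
-- needs at least one character.
strictTags :: Parser [String]
strictTags = many1 (noneOf ":\n") `sepEndBy` char ':'

loose, strict :: Either ParseError [String]
loose  = parse looseTags  "" "tags:here:"
strict = parse strictTags "" "tags:here:"
```

Here loose evaluates to Right ["tags","here",""] and strict to Right ["tags","here"].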