Parsing text with optional data at the end
Please note: after posting this question I managed to derive a solution myself. See the end of this question for my final answer.
I'm working on a little parser at the moment for org-mode documents. In these documents, headings have a title and may optionally end with a list of tags:
* Heading :foo:bar:baz:
I'm having difficulty writing a parser for this, however. The following is what I'm working with for now:
import Control.Applicative
import Text.ParserCombinators.Parsec
data Node = Node String [String]
  deriving (Show)
myTest = parse node "" "Some text here :tags:here:"
node = Node <$> (many1 anyChar) <*> tags
tags = (char ':') >> (sepEndBy1 (many1 alphaNum) (char ':'))
  <?> "Tag list"
While my simple tags parser works on its own, it doesn't work in the context of node, because all of the characters are used up parsing the title of the heading (many1 anyChar). Furthermore, I can't change this parser to use noneOf ":" because : is valid in the title. In fact, it's only special if it's in a tag list, at the very end of the line.
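To make the failure concrete, here is a minimal sketch, assuming Parsec 3 (the Text.Parsec module) and a GHC recent enough that the Prelude exports the Applicative operators. The names tagList and greedyNode are made up for illustration. The tag list parser succeeds by itself, while the greedy title parser leaves nothing for it to match:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Same tag list parser as in the question, renamed for this sketch.
tagList :: Parser [String]
tagList = char ':' *> many1 alphaNum `sepEndBy1` char ':'

-- The title is read with many1 anyChar, which never stops before
-- the end of the input, so tagList always starts at end of input.
greedyNode :: Parser (String, [String])
greedyNode = (,) <$> many1 anyChar <*> tagList

-- Succeeds: the input is nothing but a tag list.
tagsAlone :: Either ParseError [String]
tagsAlone = parse tagList "" ":tags:here:"

-- Fails: the title has already consumed ":tags:here:".
wholeLine :: Either ParseError (String, [String])
wholeLine = parse greedyNode "" "Some text here :tags:here:"
```

Here tagsAlone is a Right holding ["tags","here"], while wholeLine is a Left, since char ':' is attempted at end of input.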
Any ideas how I can parse this optional data?
As an aside, this is my first real Haskell project, so if Parsec is not even the right tool for the job - feel free to point that out and suggest other options!
Ok, I got a complete solution now, but it needs refactoring. The following works:
import Control.Applicative hiding (many, optional, (<|>))
import Control.Monad
import Data.Char (isSpace)
import Text.ParserCombinators.Parsec
data Node = Node { level :: Int, keyword :: Maybe String, heading :: String, tags :: Maybe [String] }
  deriving (Show)
parseNode = Node <$> level <*> (optionMaybe keyword) <*> name <*> (optionMaybe tags)
  where level = length <$> many1 (char '*') <* space
        keyword = (try (many1 upper <* space))
        name = noneOf "\n" `manyTill` (eof <|> (lookAhead (try (tags *> eof))))
        tags = char ':' *> many1 alphaNum `sepEndBy1` char ':'
myTest = parse parseNode "org-mode" "** Some : text here :tags: JUST KIDDING :tags:here:"
myTest2 = parse parseNode "org-mode" "* TODO Just a node"
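As for the refactoring, one possible direction is sketched below, not as a definitive version. It assumes Parsec 3 (Text.Parsec) and a recent GHC. The P-suffixed names and tagsAtEnd are hypothetical; they only avoid shadowing the record fields. Trimming trailing whitespace from the heading with dropWhileEnd is an extra assumption that the solution above does not make.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Text.Parsec
import Text.Parsec.String (Parser)

data Node = Node
  { level   :: Int
  , keyword :: Maybe String
  , heading :: String
  , tags    :: Maybe [String]
  } deriving (Show, Eq)

-- One or more stars followed by a single whitespace character.
levelP :: Parser Int
levelP = length <$> many1 (char '*') <* space

-- An all-uppercase word such as TODO, followed by whitespace.
-- try undoes the match when the word is not entirely uppercase.
keywordP :: Parser String
keywordP = try (many1 upper <* space)

-- A tag list such as :foo:bar:
tagsP :: Parser [String]
tagsP = char ':' *> many1 alphaNum `sepEndBy1` char ':'

-- Succeeds without consuming input when everything that remains
-- is a tag list followed by end of input.
tagsAtEnd :: Parser ()
tagsAtEnd = lookAhead (try (tagsP *> eof))

-- Title text up to end of input or a trailing tag list.
-- Assumption: trailing whitespace is not part of the heading.
headingP :: Parser String
headingP = dropWhileEnd isSpace
       <$> manyTill (noneOf "\n") (eof <|> tagsAtEnd)

nodeP :: Parser Node
nodeP = Node <$> levelP
             <*> optionMaybe keywordP
             <*> headingP
             <*> optionMaybe tagsP
```

With these definitions, parse nodeP "org-mode" "* TODO Just a node" yields level 1, keyword Just "TODO", heading "Just a node" and no tags. As in the solution above, eof means end of input, so this handles one heading per parse rather than a whole multi-line document.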