Problem making a BBCode parser using PEG
I am writing a BBCode parser with PEG (the Citrus implementation for Ruby), and I am stuck on parsing this input: [b]sometext[anothertext[/b]
Here is the grammar:
    grammar BBCodeParser
      rule document
        (open_tag | close_tag | new_line | text)*
      end
      rule open_tag
        ("[" tag_name "="? tag_data? "]")
      end
      rule close_tag
        ("[/" tag_name "]")
      end
      rule text
        [^\n\[\]]+
      end
      rule new_line
        ("\r\n" | "\n")
      end
      rule tag_name
        # [p|br|b|i|u|hr|code|quote|list|url|img|\*|color]
        [a-zA-Z\*]+
      end
      rule tag_data
        ([^\[\]\n])+
      end
    end
The problem is with the text rule.
I don't know how to say that text can contain everything except \r, \n, an open_tag or a close_tag.
With this implementation it fails on the example because [ and ] are excluded entirely (which is wrong).
So the final question is: how do I write a rule that matches anything except \r, \n, or an exact match of open_tag or close_tag?
If you have a solution for another PEG implementation, post it too. I can switch :)
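In PEG notation, "any character that does not start a tag or a newline" is usually written with negative lookahead in front of a single-character match, roughly `(!open_tag !close_tag !new_line .)+`. The sketch below is not Citrus: it mirrors that idea with Ruby's standard StringScanner and regex lookahead so it runs without extra gems. The constants and the `tokenize` method are illustrative names, and the regexes are an approximation of the grammar's rules, not a drop-in replacement.

```ruby
require "strscan"

# Regex approximations of the grammar's rules (illustrative names).
OPEN_TAG  = /\[([a-zA-Z*]+)=?([^\[\]\n]+)?\]/
CLOSE_TAG = /\[\/([a-zA-Z*]+)\]/
NEW_LINE  = /\r\n|\n/
# Text: one or more characters, each checked by a negative lookahead
# so that it does not begin an open tag, a close tag, or a newline.
TEXT = /(?:(?!#{OPEN_TAG}|#{CLOSE_TAG}|#{NEW_LINE}).)+/

# Splits the input into [type, ...] tokens, trying the alternatives in the
# same order as the document rule: open_tag, close_tag, new_line, text.
def tokenize(input)
  scanner = StringScanner.new(input)
  tokens = []
  until scanner.eos?
    if scanner.scan(OPEN_TAG)
      tokens << [:open_tag, scanner[1], scanner[2]]
    elsif scanner.scan(CLOSE_TAG)
      tokens << [:close_tag, scanner[1]]
    elsif scanner.scan(NEW_LINE)
      tokens << [:new_line]
    else
      # Nothing else matched here, so at least one character is plain text.
      tokens << [:text, scanner.scan(TEXT)]
    end
  end
  tokens
end

p tokenize("[b]sometext[anothertext[/b]")
# => [[:open_tag, "b", nil], [:text, "sometext[anothertext"], [:close_tag, "b"]]
```

The stray [ ends up inside the text token because it is only rejected when a complete tag starts at that position, which is the behaviour the question asks for.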
Comments (3)
I ran into a similar problem a while ago. There is a trick for this:
You need to match open_tag, followed by everything that is not a closing tag, and then closing_tag. So this gives the following rule:
This would parse any text and continue recursively when the [ wasn't the beginning of another tag.
This ends up with a parse error.
I tried to continue with this idea, and the result was
    ([^\n] (!open_tag | !close_tag) text*)
But it fails too. It matches
    "sometext[anothertext[/b]"
I found a temporary solution:
    ((!open_tag | !close_tag | !new_line) .)
It matches only one character at a time, but ignores all open and close tags. I can join these characters together later :)
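A note on the predicates used above, offered as a reading of PEG semantics rather than something stated in the thread: a choice of negative lookaheads such as `(!open_tag | !close_tag)` succeeds whenever either one succeeds, and since no position can start both an open tag and a close tag, it effectively always succeeds. Writing the predicates as a sequence, `!open_tag !close_tag`, requires all of them to pass. The sketch below shows the difference with Ruby regex lookahead; `OPEN`, `CLOSE`, `EITHER` and `BOTH` are illustrative names, and the tag patterns are hard-coded to [b] and [/b] for brevity.

```ruby
# Illustrative patterns for a single open tag and a single close tag.
OPEN  = /\[b\]/
CLOSE = /\[\/b\]/

# Choice of predicates: passes if EITHER lookahead passes.
EITHER = /\A(?:(?!#{OPEN})|(?!#{CLOSE}))./
# Sequence of predicates: passes only if BOTH lookaheads pass.
BOTH   = /\A(?!#{OPEN})(?!#{CLOSE})./

p EITHER.match?("[b]text") # => true  (the tag's "[" is accepted as text)
p BOTH.match?("[b]text")   # => false (an open tag starts here)
p BOTH.match?("[/b]")      # => false (a close tag starts here)
p BOTH.match?("[text")     # => true  (a lone "[" is plain text)
```

With the sequence form, wrapping the expression in a repetition, as in `(!open_tag !close_tag !new_line .)+`, should match a whole run of text at once, so joining single characters afterwards would no longer be needed.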