Problem making a BBCode parser with PEG

Posted 2024-12-05 03:35:03

I am writing a BBCode parser with PEG (the Citrus implementation for Ruby) and I am stuck on parsing this input: [b]sometext[anothertext[/b]

Here is the grammar:

grammar BBCodeParser
  rule document
    (open_tag | close_tag | new_line | text)*
  end
  rule open_tag
    ("[" tag_name "="? tag_data? "]")
  end

  rule close_tag
    ("[/" tag_name "]") 
  end

  rule text
    [^\n\[\]]+
  end

  rule new_line
    ("\r\n" | "\n")
  end

  rule tag_name
    # [p|br|b|i|u|hr|code|quote|list|url|img|\*|color]
    [a-zA-Z\*]+
  end

  rule tag_data
    ([^\[\]\n])+
  end
end

The problem is the text rule: I don't know how to express that text can contain anything except \r, \n, an open_tag or a close_tag.
With this implementation the parse fails on the example because text excludes every [ and ], which is wrong.

So my final question is: how do I write a rule that matches anything except \r, \n, or an exact match of open_tag or close_tag?

If you have a solution for another PEG implementation, post it too. I can switch :)

氛圍 2024-12-12 03:35:04

I ran into a similar problem a while ago. There is a trick for this:
match open_tag, followed by everything that is not a closing tag, and then close_tag. That gives the following rule:

rule tag
  open_tag ((!open_tag | !close_tag | !new_line ) .)+ close_tag
end
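
A note on the rule above: in a PEG, a choice of negative lookaheads such as (!open_tag | !close_tag | !new_line) succeeds as soon as any one alternative succeeds, so it probably never blocks anything, because the three rules cannot all match at the same position. A sequence, (!open_tag !close_tag !new_line .)+, is likely what is intended. Below is a minimal sketch of that idea in plain Ruby, using StringScanner from the standard library instead of Citrus. The regexes OPEN_TAG, CLOSE_TAG and NEW_LINE and the tokenize method are illustrative assumptions that mirror the grammar's rules; they are not part of Citrus.

```ruby
require "strscan"

# Hypothetical regex equivalents of the grammar's rules (assumptions, not Citrus API).
OPEN_TAG  = /\[[a-zA-Z*]+=?[^\[\]\n]*\]/  # "[" tag_name "="? tag_data? "]"
CLOSE_TAG = /\[\/[a-zA-Z*]+\]/            # "[/" tag_name "]"
NEW_LINE  = /\r\n|\n/

# Splits input into [type, string] pairs, trying the rules in the same
# order as the document rule: open_tag | close_tag | new_line | text.
def tokenize(input)
  scanner = StringScanner.new(input)
  tokens = []
  until scanner.eos?
    if (s = scanner.scan(OPEN_TAG))
      tokens << [:open_tag, s]
    elsif (s = scanner.scan(CLOSE_TAG))
      tokens << [:close_tag, s]
    elsif (s = scanner.scan(NEW_LINE))
      tokens << [:new_line, s]
    else
      # text: (!open_tag !close_tag !new_line .)+
      # Consume one character at a time while no other rule matches here.
      text = +""
      until scanner.eos? ||
            scanner.match?(OPEN_TAG) ||
            scanner.match?(CLOSE_TAG) ||
            scanner.match?(NEW_LINE)
        text << scanner.getch
      end
      tokens << [:text, text]
    end
  end
  tokens
end

p tokenize("[b]sometext[anothertext[/b]")
# prints [[:open_tag, "[b]"], [:text, "sometext[anothertext"], [:close_tag, "[/b]"]]
```

The stray [ in the middle ends up inside the text token because neither OPEN_TAG nor CLOSE_TAG matches at that position, which is the behaviour the question asks for.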

暮年慕年 2024-12-12 03:35:04

This parses any text and continues recursively when the [ is not the beginning of another tag.

rule text
    [^\n\[\]]+ (!open_tag text)?
end

岁吢 2024-12-12 03:35:04

This rule

rule text
    [^\n\[\]]+ (!open_tag text)?
end

ends up with a parse error.

I tried to continue with this idea and ended up with ([^\n] (!open_tag | !close_tag) text*),
but it fails too: it matches "sometext[anothertext[/b]".

I found a temporary solution:
((!open_tag | !close_tag | !new_line) .)
It matches only one character at a time, but ignores all open and close tags. I can join these characters together later :)
