Parslet: exclusion clauses

Posted on 2024-12-17 04:33:11

I am currently writing a Ruby parser in Ruby, more precisely with Parslet, since I find it far easier to use than Treetop or Citrus. I create my rules from the official specification, but there are some statements I cannot write, because they "exclude" some syntax and I do not know how to express that. Here is an example.

Here is a basic rule:

foo::=
any-character+ BUT NOT (foo* escape_character barbar*)
# Knowing that (foo* escape_character barbar*) is included in any-character

How could I translate that using Parslet? Maybe with absent? / present?

Thank you very much, hope someone has an idea....

Have a nice day!

EDIT:
I tried what you said, so here's my translation into Ruby language using parslet:

  rule(:line_comment){(source_character.repeat >> line_terminator >> source_character.repeat).absent? >> source_character.repeat(1)}

However, it does not seem to work (the sequence in parens). I did some tests, and came to the conclusion that what's written in my parens is wrong.

Here is a much simpler example. Consider these rules:

# Parslet rules
rule(:source_character) {any}
rule(:line_terminator){ str("\n") >> str("\r").maybe }  

rule(:not){source_character.repeat >> line_terminator }
# Which looks like what I try to "detect" up there

I test these rules with this code:

# Code to test : 
code = "test
"

But I get that:

Failed to match sequence (SOURCE_CHARACTER{0, } LINE_TERMINATOR) at line 2 char 1.
- Failed to match sequence (SOURCE_CHARACTER{0, } LINE_TERMINATOR) at line 2 char 1.
- Failed to match sequence ('\n' '\r'?) at line 2 char 1.
`- Premature end of input at line 2 char 1.
nil

If this sequence doesn't work, my 'complete' rule up there won't ever work... If anyone has an idea, it would be great.

Thank you !

深爱成瘾 2024-12-24 04:33:11

You can do something like this:

rule(:word) { match['^")(\\s'].repeat(1) } # normal word
rule(:op) { str('AND') | str('OR') | str('NOT') }
rule(:keyword) { str('all:') | str('any:') }
rule(:searchterm) { keyword.absent? >> op.absent? >>  word }

In this case, absent? does a lookahead to make sure the next token is not a keyword; if it is not, the rule then checks that it is not an operator either; only then does it try to match a valid word.

An equivalent rule would be:

rule(:searchterm) { (keyword | op).absent? >> word }
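
For readers more used to regular expressions, the same idea can be sketched with a negative lookahead in plain Ruby. This is only an analogy, not Parslet code: the constant SEARCHTERM and the helper searchterm are hypothetical names introduced for illustration, and the character class mirrors the word rule above.

```ruby
# Hypothetical regex counterpart of (keyword | op).absent? >> word.
# (?!...) looks ahead without consuming input, much like absent?.
SEARCHTERM = /\A(?!all:|any:|AND|OR|NOT)[^")(\s]+/

# Returns the matched word at the start of the input, or nil.
def searchterm(input)
  input[SEARCHTERM]
end

p searchterm('ruby parser')  # "ruby"
p searchterm('AND ruby')     # nil
p searchterm('ANDROID')      # nil
```

Note the last case: because the lookahead only checks a prefix, a word that merely starts with an operator is rejected too. The Parslet rules above behave the same way, so a real grammar would probably want the operators to be followed by a word boundary.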

无远思近则忧 2024-12-24 04:33:11

Parslet matching is greedy by nature. This means that when you repeat something like

foo.repeat

parslet will match foo until it fails. If foo is

rule(:foo) { any }

you are bound to fail, since any.repeat always matches the entire rest of the document, leaving nothing for whatever comes after it!
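
To make the failure in the question concrete, here is a minimal sketch in plain Ruby using StringScanner from the standard library. It is not Parslet; the two method names are made up for illustration, and the code only imitates what a greedy, non-backtracking repetition followed by a line terminator does.

```ruby
require 'strscan'

# Imitates: any.repeat >> line_terminator
# The repetition is greedy and never gives characters back.
def greedy_then_newline(input)
  s = StringScanner.new(input)
  s.getch until s.eos?          # any.repeat: consumes every remaining character
  !s.scan(/\n\r?/).nil?         # line_terminator: nothing is left to match
end

# Imitates: (line_terminator.absent? >> any).repeat >> line_terminator
def guarded_then_newline(input)
  s = StringScanner.new(input)
  s.getch until s.eos? || s.check(/\n\r?/)  # stop just before the terminator
  !s.scan(/\n\r?/).nil?
end

puts greedy_then_newline("test\n")   # false
puts guarded_then_newline("test\n")  # true
```

The first method fails on "test\n" for the same reason as the rule in the question: the newline has already been swallowed by the repetition, so the terminator is looked for at the end of the input.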

What you're looking for is something like the string matcher in examples/string_parser.rb (parslet source tree):

rule :string do
  str('"') >> 
  (
    (str('\\') >> any) |
    (str('"').absent? >> any)
  ).repeat.as(:string) >> 
  str('"')
end

What this says is: 'match ", then match either a backslash followed by any character at all, or match any other character, as long as it is not the terminating ".'

So .absent? is really a way to exclude things from a match that follows:

str('foo').absent? >> (str('foo') | str('bar'))

will only match 'bar'. If you understand that, I assume you will be able to resolve your difficulties. Although those will not be the last on your way to a Ruby parser...
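
How a lookahead such as absent? can succeed without consuming anything may be easier to see in a hand-rolled stand-in. The sketch below is plain Ruby, not Parslet's implementation; the TinyPeg module, its method names, and LINE_BODY are all hypothetical and exist only to show the mechanics on the asker's "characters up to a line terminator" case.

```ruby
# A parser here is a lambda: (input, pos) -> new position, or nil on failure.
module TinyPeg
  module_function

  # Matches a literal string.
  def str(text)
    ->(input, pos) { input[pos, text.size] == text ? pos + text.size : nil }
  end

  # Matches any single character.
  def any
    ->(input, pos) { pos < input.size ? pos + 1 : nil }
  end

  # Runs first, then second from where first stopped.
  def seq(first, second)
    ->(input, pos) { (mid = first.(input, pos)) && second.(input, mid) }
  end

  # Negative lookahead: succeeds only if parser fails, and consumes nothing.
  def absent(parser)
    ->(input, pos) { parser.(input, pos) ? nil : pos }
  end

  # One or more repetitions, greedy, no backtracking.
  def repeat1(parser)
    lambda do |input, pos|
      start = pos
      while (nxt = parser.(input, pos))
        pos = nxt
      end
      pos > start ? pos : nil
    end
  end
end

# Rough analogue of (str("\n").absent? >> any).repeat(1)
LINE_BODY = TinyPeg.repeat1(
  TinyPeg.seq(TinyPeg.absent(TinyPeg.str("\n")), TinyPeg.any)
)

p LINE_BODY.("test\nmore", 0)  # 4
p LINE_BODY.("\n", 0)          # nil
```

The repetition stops at position 4, right before the newline, because the lookahead is checked before every single character rather than once in front of the whole repetition.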
