Parslet: exclusion clauses

Posted on 2024-12-17 04:33:11

I am currently writing a Ruby parser in Ruby, using Parslet, since I find it far easier to use than Treetop or Citrus. I build my rules from the official specification, but there are some productions I cannot write, because they "exclude" some syntax and I do not know how to express that. Here is an example.

Here is a basic rule:

foo::=
any-character+ BUT NOT (foo* escape_character barbar*)
# Knowing that (foo* escape_character barbar*) is included in any-character

How can I translate that using Parslet? Maybe with the absent?/present? lookaheads?

Thank you very much, hope someone has an idea....

Have a nice day!

EDIT:
I tried what you said, so here's my translation into Ruby language using parslet:

  rule(:line_comment){(source_character.repeat >> line_terminator >> source_character.repeat).absent? >> source_character.repeat(1)}

However, it does not seem to work (the sequence in parens). I ran some tests and concluded that what is written inside my parens is wrong.

Here is a much simpler example. Consider these rules:

# Parslet rules
rule(:source_character) {any}
rule(:line_terminator){ str("\n") >> str("\r").maybe }  

rule(:not){source_character.repeat >> line_terminator }
# Which looks like what I try to "detect" up there

I test these rules with this input:

# Code to test : 
code = "test
"

But I get this:

Failed to match sequence (SOURCE_CHARACTER{0, } LINE_TERMINATOR) at line 2 char 1.
`- Failed to match sequence (SOURCE_CHARACTER{0, } LINE_TERMINATOR) at line 2 char 1.
   `- Failed to match sequence ('\n' '\r'?) at line 2 char 1.
      `- Premature end of input at line 2 char 1.
nil

If this sequence doesn't work, my 'complete' rule above will never work either. If anyone has an idea, that would be great.

Thank you!

Comments (2)

深爱成瘾 2024-12-24 04:33:11

You can do something like this:

rule(:word) { match['^")(\\s'].repeat(1) } # normal word
rule(:op) { str('AND') | str('OR') | str('NOT') }
rule(:keyword) { str('all:') | str('any:') }
rule(:searchterm) { keyword.absent? >> op.absent? >>  word }

Here, absent? is a negative lookahead: it first checks that the input at this position does not start with a keyword, then that it does not start with an operator, and only then tries to match a word. The lookahead itself consumes no input.

An equivalent rule would be:

rule(:searchterm) { (keyword | op).absent? >> word }
无远思近则忧 2024-12-24 04:33:11

Parslet matching is greedy by nature. This means that when you repeat something like

foo.repeat

parslet will match foo until it fails. If foo is

rule(:foo) { any }

you are headed for failure, because any.repeat always consumes the entire rest of the document and leaves nothing for whatever follows it!

What you're looking for is something like the string matcher in examples/string_parser.rb (parslet source tree):

rule :string do
  str('"') >> 
  (
    (str('\\') >> any) |
    (str('"').absent? >> any)
  ).repeat.as(:string) >> 
  str('"')
end

What this says is: 'match ", then match either a backslash followed by any character at all, or match any other character, as long as it is not the terminating ".'

So .absent? is really a way to exclude things from a match that follows:

str('foo').absent? >> (str('foo') | str('bar'))

will only match 'bar'. If you understand that, I assume you will be able to resolve your difficulties. Although those will not be the last on your way to a Ruby parser...
