Parslet: exclusion clause
I am currently writing a Ruby parser in Ruby, more precisely with Parslet, since I find it far easier to use than Treetop or Citrus. I am building my rules from the official specification, but there are some statements I cannot write because they "exclude" some syntax, and I do not know how to express that. An example should make this clearer.
Here is a basic rule:
foo::=
any-character+ BUT NOT (foo* escape_character barbar*)
# Knowing that (foo* escape_character barbar*) is included in any-character
How could I translate that using Parslet? Maybe with the absent?/present? lookaheads?
Thank you very much, hope someone has an idea....
Have a nice day!
EDIT:
I tried what you said, so here's my translation into Ruby language using parslet:
rule(:line_comment) { (source_character.repeat >> line_terminator >> source_character.repeat).absent? >> source_character.repeat(1) }
However, it does not seem to work (the sequence in parens). I did some tests, and came to the conclusion that what's written in my parens is wrong.
Here is a much simpler example. Consider these rules:
# Parslet rules
rule(:source_character) { any }
rule(:line_terminator) { str("\n") >> str("\r").maybe }
rule(:not) { source_character.repeat >> line_terminator }
# Which looks like what I try to "detect" up there
I test these rules with this code:
# Code to test :
code = "test
"
But I get that:
Failed to match sequence (SOURCE_CHARACTER{0, } LINE_TERMINATOR) at line 2 char 1.
`- Failed to match sequence (SOURCE_CHARACTER{0, } LINE_TERMINATOR) at line 2 char 1.
   `- Failed to match sequence ('\n' '\r'?) at line 2 char 1.
      `- Premature end of input at line 2 char 1.
nil
If this sequence doesn't work, my 'complete' rule up there won't ever work... If anyone has an idea, it would be great.
Thank you !
Comments (2)
You can do something like this:
In this case, absent? does a lookahead to make sure the next token is not a keyword; if it is not, it then checks that it is not an operator either; if it is not, it finally checks whether it is a valid word. An equivalent rule would be:
Parslet matching is greedy by nature. This means that when you repeat something like foo.repeat, parslet will match foo until it fails. If foo is any, you will be on the path to failure, since any.repeat always matches the entire rest of the document!
What you're looking for is something like the string matcher in examples/string_parser.rb (parslet source tree):
What this says is: 'match ", then match either a backslash followed by any character at all, or match any other character, as long as it is not the terminating ".'
So .absent? is really a way to exclude things from a match that follows:
will only match 'bar'. If you understand that, I assume you will be able to resolve your difficulties. Although those will not be the last on your way to a Ruby parser...