What is a semantic predicate in ANTLR?
ANTLR 4
For predicates in ANTLR 4, check out these Stack Overflow Q&As:
ANTLR 3
A semantic predicate is a way to enforce extra (semantic) rules upon grammar
actions using plain code.
There are 3 types of semantic predicates: validating, gated and disambiguating.
Example grammar
Let's say you have a block of text consisting of only numbers separated by
commas, ignoring any white space. You would like to parse this input making
sure that the numbers are at most 3 digits "long" (at most 999). The following
grammar (Numbers.g) would do such a thing:
Testing
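To have something to compare the generated parser against, a hand-written check of the same constraint (comma-separated numbers, white space ignored, at most 3 digits each) might look like the sketch below. It is plain Java with no ANTLR classes involved; the class name `NumbersReference` and the method `accepts` are made up for illustration and are not part of the grammar or of ANTLR.

```java
// Hypothetical reference checker, hand-written; NOT generated by ANTLR.
final class NumbersReference {

    // Returns true if the input is one or more comma-separated numbers,
    // each between 1 and maxDigits digits long, with white space ignored.
    static boolean accepts(String input, int maxDigits) {
        // The lexer described in the text skips white space, so drop it first.
        String compact = input.replaceAll("[ \\t\\r\\n]", "");
        if (compact.isEmpty()) {
            return false;
        }
        // Limit -1 keeps trailing empty strings, so "1,2," is rejected.
        for (String number : compact.split(",", -1)) {
            if (number.isEmpty() || number.length() > maxDigits) {
                return false;
            }
            for (char c : number.toCharArray()) {
                if (c < '0' || c > '9') {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(accepts("123, 456, 7   , 89", 3));
        System.out.println(accepts("123, 456, 7777   , 89", 3));
    }
}
```

The two inputs in `main` are illustrative only: the first stays within the 3-digit limit, the second contains a 4-digit number.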
The grammar can be tested with the following class:
Test it by generating the lexer and parser, compiling all .java files and running the Main class:
When doing so, nothing is printed to the console, which indicates that nothing
went wrong. Try changing the input so that one of its numbers is more than 3 digits long and starts with 777,
and do the test again: you will see an error appearing on the console right after the string 777.
Semantic Predicates
This brings us to the semantic predicates. Let's say you want to parse
numbers between 1 and 10 digits long. A rule that spells out a separate alternative for every length from 1 to 10 digits
would become cumbersome. Semantic predicates can help simplify this type of rule.
1. Validating Semantic Predicates
A validating semantic predicate is nothing
more than a block of code followed by a question mark:
{boolean-expression}?
To solve the problem above using a validating
semantic predicate, change the number rule in the grammar into:
The parts { int N = 0; } and { N++; } are plain Java statements, of which
the first is executed when the parser "enters" the number rule. The actual
predicate is { N <= 10 }?, which causes the parser to throw a FailedPredicateException
whenever a number is more than 10 digits long.
Test it by using an ANTLRStringStream whose numbers are all at most 10 digits long, which produces no exception, while an input containing a number of more than 10 digits does throw an exception.
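As a rough plain-Java analogue of what the validating predicate does, the sketch below counts digits while matching them and only checks the count afterwards. It assumes the rule has roughly the shape shown in the comment (reconstructed from the fragments quoted above, so treat it as an assumption); `ValidatingSketch` and `PredicateFailed` are hypothetical names, the latter standing in for ANTLR's FailedPredicateException.

```java
// Hand-written illustration; NOT ANTLR-generated code.
// Assumed shape of the ANTLR 3 rule being mimicked:
//   number @init { int N = 0; } : (Digit { N++; })+ { N <= 10 }? ;
final class ValidatingSketch {

    // Hypothetical stand-in for ANTLR 3's FailedPredicateException.
    static final class PredicateFailed extends RuntimeException {
        PredicateFailed(String message) {
            super(message);
        }
    }

    // Matches a run of digits beginning at 'start' and returns the index
    // just past it, or throws if the predicate does not hold.
    static int number(String input, int start) {
        int n = 0;                        // { int N = 0; } runs on rule entry
        int pos = start;
        while (pos < input.length()
                && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
            n++;                          // { N++; } runs after each digit
        }
        if (n == 0) {
            throw new IllegalArgumentException("expected a digit at index " + start);
        }
        // { N <= 10 }? is checked only after every digit has been consumed.
        if (!(n <= 10)) {
            throw new PredicateFailed("predicate N <= 10 failed, N = " + n);
        }
        return pos;
    }

    public static void main(String[] args) {
        System.out.println(number("1234567890", 0));   // 10 digits: accepted
        try {
            number("12345678901", 0);                  // 11 digits: rejected
        } catch (PredicateFailed e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point to notice is the ordering: all digits are consumed first, and the failure is reported as a predicate failure rather than as an unexpected token.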
2. Gated Semantic Predicates
A gated semantic predicate is similar to a validating semantic predicate,
only the gated version produces a syntax error instead of a FailedPredicateException.
The syntax of a gated semantic predicate is:
{boolean-expression}?=>
To instead solve the above problem using gated predicates to match numbers up to 10 digits long you would write:
Test it again with both an input that stays within the 10-digit limit and one that exceeds it, and you will see the last one will throw an error.
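The difference from the validating version can be sketched in plain Java as follows. Here the condition is tested before each digit, so once the limit is reached the digit alternative is simply switched off and the surplus digit shows up as an unexpected character in the enclosing rule. The surrounding rule `number (',' number)* EOF` is an assumption based on the description of the input, white space handling is left out for brevity, and `GatedSketch` and `SyntaxError` are hypothetical names.

```java
// Hand-written illustration; NOT ANTLR-generated code.
final class GatedSketch {

    // Hypothetical stand-in for an ordinary parser syntax error.
    static final class SyntaxError extends RuntimeException {
        SyntaxError(String message) {
            super(message);
        }
    }

    // Matches at most 10 digits; the gate (n < 10) is consulted BEFORE each
    // digit, so an 11th digit is never consumed by this rule.
    static int number(String input, int start) {
        int n = 0;
        int pos = start;
        while (n < 10 && pos < input.length()
                && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
            n++;
        }
        if (n == 0) {
            throw new SyntaxError("expected a digit at index " + start);
        }
        return pos;
    }

    // Assumed enclosing rule: number (',' number)* EOF
    static void parse(String input) {
        int pos = number(input, 0);
        while (pos < input.length() && input.charAt(pos) == ',') {
            pos = number(input, pos + 1);
        }
        if (pos != input.length()) {
            // A leftover digit lands here: a plain syntax error.
            throw new SyntaxError("unexpected '" + input.charAt(pos) + "' at index " + pos);
        }
    }

    public static void main(String[] args) {
        parse("1234567890,42");           // within the limit: no error
        try {
            parse("12345678901");         // 11th digit is left over
        } catch (SyntaxError e) {
            System.out.println(e.getMessage());
        }
    }
}
```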
3. Disambiguating Semantic Predicates
The final type of predicate is a disambiguating semantic predicate, which looks a bit like a validating predicate ({boolean-expression}?), but acts more like a gated semantic predicate (no exception is thrown when the boolean expression evaluates to false). You can use it at the start of a rule to check some property of a rule and let the parser match said rule or not.
Let's say the example grammar creates Number tokens (a lexer rule instead of a parser rule) that will match numbers in the range of 0..999. Now in the parser, you'd like to make a distinction between low and high numbers (low: 0..500, high: 501..999). This could be done using a disambiguating semantic predicate where you inspect the token next in the stream (input.LT(1)) to check if it's either low or high.
A demo:
If you now parse the string "123, 999, 456, 700, 89, 0", you'd see the following output:
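The printed format depends on the actions in the grammar, but the classification itself follows from the ranges given above (low: 0..500, high: 501..999). A plain-Java sketch of that decision, with an assumed "low = ..." / "high = ..." output format and the made-up names `LowHighSketch`, `isLow` and `classify`, could look like this:

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration; NOT ANTLR-generated code.
final class LowHighSketch {

    // The kind of check a disambiguating predicate would run on the text of
    // the next token (input.LT(1) in the grammar) before choosing a rule.
    static boolean isLow(String tokenText) {
        return Integer.parseInt(tokenText) <= 500;
    }

    // Splits the input on commas, trims white space and labels each number.
    // The "low = " / "high = " format is an assumption for this sketch.
    static List<String> classify(String input) {
        List<String> lines = new ArrayList<>();
        for (String raw : input.split(",")) {
            String text = raw.trim();
            lines.add((isLow(text) ? "low = " : "high = ") + text);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : classify("123, 999, 456, 700, 89, 0")) {
            System.out.println(line);
        }
    }
}
```

For the string above this labels 123, 456, 89 and 0 as low and 999 and 700 as high.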
I've always used the terse reference to ANTLR predicates on wincent.com as my guide.