JavaCC parser option LOOKAHEAD (Java)
I've recently started playing around with grammatical analyzers using JavaCC, and one of the sections of the grammar file is the options block. I have code like the following:
options
{
    LOOKAHEAD = 1;
}
PARSER_BEGIN(Calculator)
public class Calculator
{
    ...
}
PARSER_END(Calculator)
What exactly does the LOOKAHEAD option mean?
Thanks
Comments (3)
JavaCC creates recursive descent parsers. This type of parser decides which rule to choose by looking at the upcoming tokens. By default it looks only at the next token (LOOKAHEAD=1), but you can configure the parser to look at the next N tokens instead. If you set LOOKAHEAD to 2, the generated parser will look at the next two tokens to decide which rule to choose. This lets you define your grammar more naturally, but at the cost of performance: the bigger the lookahead, the more work the parser has to do at each choice point.
If you set the global lookahead to a bigger number, your parser will be slower for all inputs (for nontrivial grammars). If you want to keep the default of LOOKAHEAD=1 and use a bigger lookahead only in specific situations, you can specify the lookahead locally, at the individual choice point that needs it.
http://www.engr.mun.ca/~theo/JavaCC-FAQ/javacc-faq-moz.htm#tth_sEc4.5
For example, a parser with lookahead=1 can't decide which of the rules (1 or 2) to take, but with lookahead=2 it can:
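As a minimal sketch of that situation, assume a hypothetical grammar in which rule 1 is A B and rule 2 is A C (these rules, and the class and method names below, are made up for illustration and are not JavaCC output). The plain Java code lists which rules are still possible after inspecting the first k tokens of the input:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical grammar, used only for illustration:
//   rule 1:  A B
//   rule 2:  A C
// Both rules start with A, so one token of lookahead cannot tell them apart.
class LookaheadDemo {
    private static final List<List<String>> RULES =
            List.of(List.of("A", "B"), List.of("A", "C"));

    // Returns the numbers (1-based) of the rules whose first k tokens
    // agree with the first k tokens of the input.
    static List<Integer> candidates(List<String> input, int k) {
        List<Integer> result = new ArrayList<>();
        for (int r = 0; r < RULES.size(); r++) {
            List<String> rule = RULES.get(r);
            int n = Math.min(k, Math.min(rule.size(), input.size()));
            if (rule.subList(0, n).equals(input.subList(0, n))) {
                result.add(r + 1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = List.of("A", "C");
        System.out.println("k=1: " + candidates(input, 1)); // both rules remain
        System.out.println("k=2: " + candidates(input, 2)); // only rule 2 remains
    }
}
```

With k=1 more than one rule remains, which is the conflict described above. With k=2 exactly one rule remains, so the choice can be made.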
You can change the definition of the grammar to get the same result but use lookahead=1:
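One way to picture such a rewrite, again assuming the hypothetical rules A B and A C, is to factor out the common prefix so the grammar becomes A ( B | C ). The hand-written recursive descent sketch below (not JavaCC-generated code; all names are illustrative) then needs only the single next token at its one choice point:

```java
import java.util.List;

// Left-factored form of the hypothetical grammar:
//   start:  A ( B | C )
// After A is consumed, the next token alone selects the alternative.
class FactoredParser {
    private final List<String> tokens;
    private int pos = 0;

    FactoredParser(List<String> tokens) {
        this.tokens = tokens;
    }

    // The single token of lookahead; "<EOF>" once the input is exhausted.
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    private void consume(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException(
                    "expected " + expected + " but found " + peek());
        }
        pos++;
    }

    // Parses the input and returns a description of the alternative taken.
    String start() {
        consume("A");
        switch (peek()) {
            case "B":
                consume("B");
                return "A B";
            case "C":
                consume("C");
                return "A C";
            default:
                throw new IllegalStateException("expected B or C but found " + peek());
        }
    }
}
```

For example, `new FactoredParser(List.of("A", "C")).start()` takes the second alternative without ever inspecting more than one upcoming token.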
See http://en.wikipedia.org/wiki/Lookahead#Lookahead_in_parsing
Normally, the parser only looks at the next token to determine what production rule to apply. However, in some cases that is not enough to make the choice. For example, given two production rules:
If the next token is of type identifier, then the parser cannot determine whether it should use the foo or the bar production. JavaCC will then report a choice conflict, saying it needs more lookahead. Changing the lookahead to 2 means the parser is allowed to look at the next two tokens, which in this case is sufficient to choose between the productions. As Steve pointed out, this is in the JavaCC docs: https://javacc.org/tutorials/lookahead
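To make the foo/bar case concrete, here is a rough sketch in plain Java. The two productions in the comment are assumptions chosen for illustration (the only detail given above is that both start with an identifier), and the class is hand-written, not what JavaCC generates:

```java
import java.util.List;

// Assumed productions, for illustration only:
//   foo:  IDENTIFIER "(" ")"
//   bar:  IDENTIFIER "=" NUMBER
class FooBarChooser {
    private final List<String> kinds; // kinds of the upcoming tokens, in order

    FooBarChooser(List<String> kinds) {
        this.kinds = kinds;
    }

    // Kind of the n-th upcoming token (1-based), or "EOF" past the end.
    private String peek(int n) {
        return n <= kinds.size() ? kinds.get(n - 1) : "EOF";
    }

    // Picks a production using at most k tokens of lookahead.
    // Returns "foo", "bar", or "conflict" when k tokens are not enough.
    String choose(int k) {
        if (!peek(1).equals("IDENTIFIER")) {
            throw new IllegalStateException("expected IDENTIFIER but found " + peek(1));
        }
        if (k < 2) {
            return "conflict"; // both productions start with IDENTIFIER
        }
        switch (peek(2)) {
            case "(":
                return "foo";
            case "=":
                return "bar";
            default:
                throw new IllegalStateException("unexpected token " + peek(2));
        }
    }
}
```

With the token kinds IDENTIFIER, "=", NUMBER, `choose(1)` reports a conflict while `choose(2)` selects bar, because only the second token distinguishes the two productions.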
The LOOKAHEAD value tells the generated parser how many unprocessed (i.e., future) tokens to use to decide what state to transition to. In a tightly-constrained language, only one lookahead token is necessary. The more ambiguous a language is, the more lookahead tokens are needed to determine which state transition to make.
I think this is covered in the javacc(1) tutorial.