yacc/bison LALR(1) 算法如何处理“空”规则?
在 LALR(1) 解析器中,语法中的规则被转换为解析表,该表有效地表示“如果到目前为止有此输入,并且先行标记为 X,则转移到状态 Y,或按规则 R 减少” 。
我已经成功地用解释语言(ruby)构建了一个 LALR(1) 解析器,没有使用生成器,而是在运行时计算解析表并使用该解析表评估输入。这工作得非常好,并且表格生成非常简单(这让我有些惊讶),支持自引用规则和左/右关联。
然而,我难以理解的一件事是 yacc/bison 如何从概念上处理空规则定义。我的解析器无法处理它们,因为在生成表时,它会递归地查看每个规则中的每个符号,而“空”并不是来自词法分析器的东西,也不会被规则减少。那么,LALR(1) 解析器如何处理空规则呢?他们是否特别对待它,或者它是一个“自然”概念,有效的算法应该只使用它,甚至不需要对这样的概念有特别的认识?
比方说,一个规则可以匹配任意数量的成对括号,中间没有任何内容:
expr: /* empty */
| '(' expr ')'
;
像下面这样的输入将匹配此规则:
((((()))))
这意味着在先行标记中读取“(”并看到“)”时,解析器会选择:
- 移动“)”(不可能)
- 根据其他规则减少输入(不可能)
- 其他东西......
不太适合“shift”或“reduce”的核心算法。解析器实际上需要将任何内容移入堆栈,将“无内容”减少为 expr
,然后移动下一个标记 ')'
,给出 '(' expr ' )'
,当然会简化为 expr
,依此类推。
令我困惑的是“什么也不改变”。解析表是如何传达这样的概念的呢?还请考虑,应该可以调用一些语义操作,在减少空值时将值返回到 $$
,因此从解析表中跳过该值并说 是一个相当简单的视图堆栈上的 >'('
和前瞻中的 ')'
应该简单地转换为移位,不会真正生成序列 '(' expr ')',但只会产生序列
'(' ')'
。
In a LALR(1) parser, the rules in the grammar are converted into a parse table that effectively says "If you have this input so far, and the lookahead token is X, then shift to state Y, or reduce by rule R".
I have successfully constructed a LALR(1) parser in an interpreted language (ruby), not using a generator, but computing a parse table at runtime and evaluating the input using that parse table. This works surprisingly well and the table generation is quite trivial (which surprised me somewhat), supporting self-referential rules and left/right association.
One thing I am having some difficulty to understand, however, is how yacc/bison conceptually processes empty rule definitions. My parser can't handle them, since in generating the table it looks at each symbol in each rule, recursively, and "empty" is just not something that will come from the lexer, nor be reduced by a rule. So how then, do LALR(1) parsers process the empty rule? Do they treat it specially, or is it a "natural" concept that a valid algorithm should just work with, without even needing to have particular awareness of such a concept?
Let's say, a rule that can match any number of paired parentheses with nothing in the middle:
expr: /* empty */
| '(' expr ')'
;
Input like the following would match this rule:
((((()))))
This means that upon reading '(' and seeing ')' in the lookahead token, the parser choices:
- Shift the ')' (not possible)
- Reduce the input according to some other rule (not possible)
- Something else...
don't quite fit into the core algorithm of "shift" or "reduce". The parser effectively needs to shift nothing onto the stack, reduce "nothing" to expr
, then shift the next token ')'
, giving '(' expr ')'
, which of course reduces to expr
, and so on.
It's the "shift nothing" that's confusing me. How does the parse table convey such a concept? Consider also that it should be possible to invoke some semantic action that returns a value to $$
on reducing the empty value, so a rather simplistic view of just skipping that from the parse table and saying that '('
on the stack and ')'
in the lookahead should simply translate to a shift, would not genuinely produce the sequence '(' expr ')'
, but would simply produce the sequence '(' ')'
.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
尽管我想了好几天这个问题,但自从在写问题时和接下来的几分钟里彻底思考这个问题后,我突然发现一些事情是如此明显和简单。
所有规则的归约始终是:从堆栈中弹出 X 个输入,其中 X 是规则中的组件数量,然后将结果移回堆栈并在归约后转到表中给出的任何状态。
在空规则的情况下,你不需要考虑“空”甚至是一个概念。解析表只需要包含一个转换,表示“在堆栈上给出
'('
”,并在前瞻中“任何不是'('
”的内容,减少 '现在,由于空规则的组件大小为零,因此从堆栈中弹出零意味着堆栈不会更改,然后当减少任何内容的结果转移到堆栈上时,您会看到以下内容确实出现在语法中,一切都变得清晰起来。它“正常工作”的原因是因为为了减少空规则,您只需从堆栈中弹出零个项目。
Despite thinking about this for days, since thinking this through when writing the question and in the minutes that followed, something just hit me as so incredibly obvious and simple.
Reduction by all rules is always: pop off X inputs from the stack, where X is the number of components in the rule, then shift the result back onto the stack and goto whatever state is given in the table after that reduction.
In the case of the empty rule, you don't need to consider that "empty" is even a concept. The parse table simply needs to include a transition that says "given
'('
on the stack and 'anything that is not'('
in the lookahead, reduce by the 'empty' rule". Now since the empty rule has a size of zero components, popping zero from the stack means the stack doesn't change, then when the result of reducing nothing is shifted onto the stack, you're looking at something that does indeed appear in the grammar and everything becomes clear.The reason it "just works" is because in order to reduce by the empty rule you only have to pop zero items from the stack.
如果可能的话,另一种观点可能会完善 d11wtq 的出色答案:
在解析器构造过程中,在函数
FOLLOW(X)
和FIRST(X )
。例如,如果您有A -> B x
,并且 B 可以导出 ϵ,那么我们必须将x
包含在由FIRST(A)
计算的集合中。并且也在集合FOLLOW(B)
中。此外,空规则很容易在规范的
LR(1)
项集中表示。一件有用的事情是想象有一个额外的非终结符号
$
代表文件结尾。让我们看一下语法:
对于第一个规范
LR(1)
项集,我们可以采用第一个LR(0)
项集并使用符号“$”添加前瞻:然后我们有一个用于前瞻的
id
:现在让我们看看
FIRST
和FOLLOW
集:这不是“点结尾”项目,所以这里想转移,但前提是集合
FIRST(X)
包含我们的先行符号$
。这是错误的,因此我们不填充表条目。下一步:
这是一个“dot Final”项,因此要减少。为了验证归约的上下文,我们查看
FOLLOW(S)
:我们希望归约的语法符号后面是否可以跟前瞻中的内容?确实,是的。$
始终位于FOLLOW(S)
中,因为根据定义,起始符号后面是输入结束符。所以是的,我们可以减少。由于我们正在减少符号S
,因此减少实际上是一个accept
操作:解析结束。我们用accept
操作填充表条目。类似地,我们可以使用lookahead
id
重复下一个项目集。让我们跳到 S 派生空规则:S
后面可以跟id
吗?几乎没有。所以这种减少是不合适的。我们不填充解析器表条目。所以你可以看到空规则不会造成任何问题。它立即变成点最终
LR(0)
或LR(1)
项(取决于解析器构造方法),并与任何其他点最终项相同地处理关于考虑前瞻和填写表格。这是一个“dot Final”项,因此要减少。为了验证归约的上下文,我们查看
FOLLOW(S)
:我们希望归约的语法符号后面是否可以跟前瞻中的内容?确实,是的。$
始终位于FOLLOW(S)
中,因为根据定义,起始符号后面是输入结束符。所以是的,我们可以减少。由于我们正在减少符号S
,因此减少实际上是一个accept
操作:解析结束。我们用accept
操作填充表条目。类似地,我们可以使用lookahead
id
重复下一个项目集。让我们跳到 S 派生空规则:S
后面可以跟id
吗?几乎没有。所以这种减少是不合适的。我们不填充解析器表条目。所以你可以看到空规则不会造成任何问题。它立即变成点最终
S -> . , 'LR(0)
或LR(1)
项(取决于解析器构造方法),并与任何其他点最终项相同地处理关于考虑前瞻和填写表格。然后我们有一个用于前瞻的
id
:现在让我们看看
FIRST
和FOLLOW
集:这不是“点结尾”项目,所以这里想转移,但前提是集合
FIRST(X)
包含我们的先行符号$
。这是错误的,因此我们不填充表条目。下一步:
这是一个“dot Final”项,因此要减少。为了验证归约的上下文,我们查看
FOLLOW(S)
:我们希望归约的语法符号后面是否可以跟前瞻中的内容?确实,是的。$
始终位于FOLLOW(S)
中,因为根据定义,起始符号后面是输入结束符。所以是的,我们可以减少。由于我们正在减少符号S
,因此减少实际上是一个accept
操作:解析结束。我们用accept
操作填充表条目。类似地,我们可以使用lookahead
id
重复下一个项目集。让我们跳到 S 派生空规则:S
后面可以跟id
吗?几乎没有。所以这种减少是不合适的。我们不填充解析器表条目。所以你可以看到空规则不会造成任何问题。它立即变成点最终
X -> . id , 'LR(0)
或LR(1)
项(取决于解析器构造方法),并与任何其他点最终项相同地处理关于考虑前瞻和填写表格。然后我们有一个用于前瞻的
id
:现在让我们看看
FIRST
和FOLLOW
集:这不是“点结尾”项目,所以这里想转移,但前提是集合
FIRST(X)
包含我们的先行符号$
。这是错误的,因此我们不填充表条目。下一步:
这是一个“dot Final”项,因此要减少。为了验证归约的上下文,我们查看
FOLLOW(S)
:我们希望归约的语法符号后面是否可以跟前瞻中的内容?确实,是的。$
始终位于FOLLOW(S)
中,因为根据定义,起始符号后面是输入结束符。所以是的,我们可以减少。由于我们正在减少符号S
,因此减少实际上是一个accept
操作:解析结束。我们用accept
操作填充表条目。类似地,我们可以使用lookahead
id
重复下一个项目集。让我们跳到 S 派生空规则:S
后面可以跟id
吗?几乎没有。所以这种减少是不合适的。我们不填充解析器表条目。所以你可以看到空规则不会造成任何问题。它立即变成点最终
LR(0)
或LR(1)
项(取决于解析器构造方法),并与任何其他点最终项相同地处理关于考虑前瞻和填写表格。然后我们有一个用于前瞻的
id
:现在让我们看看
FIRST
和FOLLOW
集:这不是“点结尾”项目,所以这里想转移,但前提是集合
FIRST(X)
包含我们的先行符号$
。这是错误的,因此我们不填充表条目。下一步:
这是一个“dot Final”项,因此要减少。为了验证归约的上下文,我们查看
FOLLOW(S)
:我们希望归约的语法符号后面是否可以跟前瞻中的内容?确实,是的。$
始终位于FOLLOW(S)
中,因为根据定义,起始符号后面是输入结束符。所以是的,我们可以减少。由于我们正在减少符号S
,因此减少实际上是一个accept
操作:解析结束。我们用accept
操作填充表条目。类似地,我们可以使用lookahead
id
重复下一个项目集。让我们跳到 S 派生空规则:S
后面可以跟id
吗?几乎没有。所以这种减少是不合适的。我们不填充解析器表条目。所以你可以看到空规则不会造成任何问题。它立即变成点最终
LR(0)
或LR(1)
项(取决于解析器构造方法),并与任何其他点最终项相同地处理关于考虑前瞻和填写表格。这不是“点结尾”项目,所以这里想转移,但前提是集合
FIRST(X)
包含我们的先行符号$
。这是错误的,因此我们不填充表条目。下一步:
这是一个“dot Final”项,因此要减少。为了验证归约的上下文,我们查看
FOLLOW(S)
:我们希望归约的语法符号后面是否可以跟前瞻中的内容?确实,是的。$
始终位于FOLLOW(S)
中,因为根据定义,起始符号后面是输入结束符。所以是的,我们可以减少。由于我们正在减少符号S
,因此减少实际上是一个accept
操作:解析结束。我们用accept
操作填充表条目。类似地,我们可以使用lookahead
id
重复下一个项目集。让我们跳到 S 派生空规则:S
后面可以跟id
吗?几乎没有。所以这种减少是不合适的。我们不填充解析器表条目。所以你可以看到空规则不会造成任何问题。它立即变成点最终
S -> . , 'LR(0)
或LR(1)
项(取决于解析器构造方法),并与任何其他点最终项相同地处理关于考虑前瞻和填写表格。然后我们有一个用于前瞻的
id
:现在让我们看看
FIRST
和FOLLOW
集:这不是“点结尾”项目,所以这里想转移,但前提是集合
FIRST(X)
包含我们的先行符号$
。这是错误的,因此我们不填充表条目。下一步:
这是一个“dot Final”项,因此要减少。为了验证归约的上下文,我们查看
FOLLOW(S)
:我们希望归约的语法符号后面是否可以跟前瞻中的内容?确实,是的。$
始终位于FOLLOW(S)
中,因为根据定义,起始符号后面是输入结束符。所以是的,我们可以减少。由于我们正在减少符号S
,因此减少实际上是一个accept
操作:解析结束。我们用accept
操作填充表条目。类似地,我们可以使用lookahead
id
重复下一个项目集。让我们跳到 S 派生空规则:S
后面可以跟id
吗?几乎没有。所以这种减少是不合适的。我们不填充解析器表条目。所以你可以看到空规则不会造成任何问题。它立即变成点最终
X -> . id , 'LR(0)
或LR(1)
项(取决于解析器构造方法),并与任何其他点最终项相同地处理关于考虑前瞻和填写表格。然后我们有一个用于前瞻的
id
:现在让我们看看
FIRST
和FOLLOW
集:这不是“点结尾”项目,所以这里想转移,但前提是集合
FIRST(X)
包含我们的先行符号$
。这是错误的,因此我们不填充表条目。下一步:
这是一个“dot Final”项,因此要减少。为了验证归约的上下文,我们查看
FOLLOW(S)
:我们希望归约的语法符号后面是否可以跟前瞻中的内容?确实,是的。$
始终位于FOLLOW(S)
中,因为根据定义,起始符号后面是输入结束符。所以是的,我们可以减少。由于我们正在减少符号S
,因此减少实际上是一个accept
操作:解析结束。我们用accept
操作填充表条目。类似地,我们可以使用lookahead
id
重复下一个项目集。让我们跳到 S 派生空规则:S
后面可以跟id
吗?几乎没有。所以这种减少是不合适的。我们不填充解析器表条目。所以你可以看到空规则不会造成任何问题。它立即变成点最终
LR(0)
或LR(1)
项(取决于解析器构造方法),并与任何其他点最终项相同地处理关于考虑前瞻和填写表格。然后我们有一个用于前瞻的
id
:现在让我们看看
FIRST
和FOLLOW
集:这不是“点结尾”项目,所以这里想转移,但前提是集合
FIRST(X)
包含我们的先行符号$
。这是错误的,因此我们不填充表条目。下一步:
这是一个“dot Final”项,因此要减少。为了验证归约的上下文,我们查看
FOLLOW(S)
:我们希望归约的语法符号后面是否可以跟前瞻中的内容?确实,是的。$
始终位于FOLLOW(S)
中,因为根据定义,起始符号后面是输入结束符。所以是的,我们可以减少。由于我们正在减少符号S
,因此减少实际上是一个accept
操作:解析结束。我们用accept
操作填充表条目。类似地,我们可以使用lookahead
id
重复下一个项目集。让我们跳到 S 派生空规则:S
后面可以跟id
吗?几乎没有。所以这种减少是不合适的。我们不填充解析器表条目。所以你可以看到空规则不会造成任何问题。它立即变成点最终
LR(0)
或LR(1)
项(取决于解析器构造方法),并与任何其他点最终项相同地处理关于考虑前瞻和填写表格。Another view to maybe round out d11wtq's great answer, if that is possible:
An nullable rule (one that derives ϵ) is accounted during parser construction under the functions
FOLLOW(X)
andFIRST(X)
. For instance if you haveA -> B x
, and B can derive ϵ, then we have to includex
in the set computed byFIRST(A)
. And also in the setFOLLOW(B)
.Furthermore, empty rules are easily represented in the canonical
LR(1)
item sets.One helpful thing is to imagine that there is an extra nonterminal symbol
$
which represents the end of file.Let's take the grammar:
For the first canonical
LR(1)
item set, we can take the firstLR(0)
item set and add lookahead with the symbol '$':Then we have one for the lookahead being
id
:Now let's look at the
FIRST
andFOLLOW
sets:This is not a "dot final" item, so here want to shift, but only if the set
FIRST(X)
contains our lookahead symbol$
. This is false, so we do not fill the table entry.Next:
This is a "dot final" item, so it wants to reduce. To validate the context for a reduce, we look at
FOLLOW(S)
: can the grammar symbol we wish to reduce be followed by what is in the lookahead? Indeed, yes.$
is always inFOLLOW(S)
because the start symbol is by definition followed by the end of the input. So yes, we can reduce. And since we are reducing the symbolS
, the reduce is actually anaccept
action: the parse ends. We fill the table entry with anaccept
action.Similarly we can repeat for the next item set with lookahead
id
. Let's skip to the S-deriving-empty rule:Can
S
be followed byid
? Hardly. So this reduction is not appropriate. We don't fill the parser table entry.So you can see that an empty rule poses no problem. It immediately turns into a dot final
LR(0)
orLR(1)
item (depending on parser construction method), and is treated the same as any other dot final item with regard to considering the lookahead and filling the tables.This is a "dot final" item, so it wants to reduce. To validate the context for a reduce, we look at
FOLLOW(S)
: can the grammar symbol we wish to reduce be followed by what is in the lookahead? Indeed, yes.$
is always inFOLLOW(S)
because the start symbol is by definition followed by the end of the input. So yes, we can reduce. And since we are reducing the symbolS
, the reduce is actually anaccept
action: the parse ends. We fill the table entry with anaccept
action.Similarly we can repeat for the next item set with lookahead
id
. Let's skip to the S-deriving-empty rule:Can
S
be followed byid
? Hardly. So this reduction is not appropriate. We don't fill the parser table entry.So you can see that an empty rule poses no problem. It immediately turns into a dot final
S -> . , 'LR(0)
orLR(1)
item (depending on parser construction method), and is treated the same as any other dot final item with regard to considering the lookahead and filling the tables.Then we have one for the lookahead being
id
:Now let's look at the
FIRST
andFOLLOW
sets:This is not a "dot final" item, so here want to shift, but only if the set
FIRST(X)
contains our lookahead symbol$
. This is false, so we do not fill the table entry.Next:
This is a "dot final" item, so it wants to reduce. To validate the context for a reduce, we look at
FOLLOW(S)
: can the grammar symbol we wish to reduce be followed by what is in the lookahead? Indeed, yes.$
is always inFOLLOW(S)
because the start symbol is by definition followed by the end of the input. So yes, we can reduce. And since we are reducing the symbolS
, the reduce is actually anaccept
action: the parse ends. We fill the table entry with anaccept
action.Similarly we can repeat for the next item set with lookahead
id
. Let's skip to the S-deriving-empty rule:Can
S
be followed byid
? Hardly. So this reduction is not appropriate. We don't fill the parser table entry.So you can see that an empty rule poses no problem. It immediately turns into a dot final
X -> . id , 'LR(0)
orLR(1)
item (depending on parser construction method), and is treated the same as any other dot final item with regard to considering the lookahead and filling the tables.Then we have one for the lookahead being
id
:Now let's look at the
FIRST
andFOLLOW
sets:This is not a "dot final" item, so here want to shift, but only if the set
FIRST(X)
contains our lookahead symbol$
. This is false, so we do not fill the table entry.Next:
This is a "dot final" item, so it wants to reduce. To validate the context for a reduce, we look at
FOLLOW(S)
: can the grammar symbol we wish to reduce be followed by what is in the lookahead? Indeed, yes.$
is always inFOLLOW(S)
because the start symbol is by definition followed by the end of the input. So yes, we can reduce. And since we are reducing the symbolS
, the reduce is actually anaccept
action: the parse ends. We fill the table entry with anaccept
action.Similarly we can repeat for the next item set with lookahead
id
. Let's skip to the S-deriving-empty rule:Can
S
be followed byid
? Hardly. So this reduction is not appropriate. We don't fill the parser table entry.So you can see that an empty rule poses no problem. It immediately turns into a dot final
LR(0)
orLR(1)
item (depending on parser construction method), and is treated the same as any other dot final item with regard to considering the lookahead and filling the tables.Then we have one for the lookahead being
id
:Now let's look at the
FIRST
andFOLLOW
sets:This is not a "dot final" item, so here want to shift, but only if the set
FIRST(X)
contains our lookahead symbol$
. This is false, so we do not fill the table entry.Next:
This is a "dot final" item, so it wants to reduce. To validate the context for a reduce, we look at
FOLLOW(S)
: can the grammar symbol we wish to reduce be followed by what is in the lookahead? Indeed, yes.$
is always inFOLLOW(S)
because the start symbol is by definition followed by the end of the input. So yes, we can reduce. And since we are reducing the symbolS
, the reduce is actually anaccept
action: the parse ends. We fill the table entry with anaccept
action.Similarly we can repeat for the next item set with lookahead
id
. Let's skip to the S-deriving-empty rule:Can
S
be followed byid
? Hardly. So this reduction is not appropriate. We don't fill the parser table entry.So you can see that an empty rule poses no problem. It immediately turns into a dot final
LR(0)
orLR(1)
item (depending on parser construction method), and is treated the same as any other dot final item with regard to considering the lookahead and filling the tables.This is not a "dot final" item, so here want to shift, but only if the set
FIRST(X)
contains our lookahead symbol$
. This is false, so we do not fill the table entry.Next:
This is a "dot final" item, so it wants to reduce. To validate the context for a reduce, we look at
FOLLOW(S)
: can the grammar symbol we wish to reduce be followed by what is in the lookahead? Indeed, yes.$
is always inFOLLOW(S)
because the start symbol is by definition followed by the end of the input. So yes, we can reduce. And since we are reducing the symbolS
, the reduce is actually anaccept
action: the parse ends. We fill the table entry with anaccept
action.Similarly we can repeat for the next item set with lookahead
id
. Let's skip to the S-deriving-empty rule:Can
S
be followed byid
? Hardly. So this reduction is not appropriate. We don't fill the parser table entry.So you can see that an empty rule poses no problem. It immediately turns into a dot final
S -> . , 'LR(0)
orLR(1)
item (depending on parser construction method), and is treated the same as any other dot final item with regard to considering the lookahead and filling the tables.Then we have one for the lookahead being
id
:Now let's look at the
FIRST
andFOLLOW
sets:This is not a "dot final" item, so here want to shift, but only if the set
FIRST(X)
contains our lookahead symbol$
. This is false, so we do not fill the table entry.Next:
This is a "dot final" item, so it wants to reduce. To validate the context for a reduce, we look at
FOLLOW(S)
: can the grammar symbol we wish to reduce be followed by what is in the lookahead? Indeed, yes.$
is always inFOLLOW(S)
because the start symbol is by definition followed by the end of the input. So yes, we can reduce. And since we are reducing the symbolS
, the reduce is actually anaccept
action: the parse ends. We fill the table entry with anaccept
action.Similarly we can repeat for the next item set with lookahead
id
. Let's skip to the S-deriving-empty rule:Can
S
be followed byid
? Hardly. So this reduction is not appropriate. We don't fill the parser table entry.So you can see that an empty rule poses no problem. It immediately turns into a dot final
X -> . id , 'LR(0)
orLR(1)
item (depending on parser construction method), and is treated the same as any other dot final item with regard to considering the lookahead and filling the tables.Then we have one for the lookahead being
id
:Now let's look at the
FIRST
andFOLLOW
sets:This is not a "dot final" item, so here want to shift, but only if the set
FIRST(X)
contains our lookahead symbol$
. This is false, so we do not fill the table entry.Next:
This is a "dot final" item, so it wants to reduce. To validate the context for a reduce, we look at
FOLLOW(S)
: can the grammar symbol we wish to reduce be followed by what is in the lookahead? Indeed, yes.$
is always inFOLLOW(S)
because the start symbol is by definition followed by the end of the input. So yes, we can reduce. And since we are reducing the symbolS
, the reduce is actually anaccept
action: the parse ends. We fill the table entry with anaccept
action.Similarly we can repeat for the next item set with lookahead
id
. Let's skip to the S-deriving-empty rule:Can
S
be followed byid
? Hardly. So this reduction is not appropriate. We don't fill the parser table entry.So you can see that an empty rule poses no problem. It immediately turns into a dot final
LR(0)
orLR(1)
item (depending on parser construction method), and is treated the same as any other dot final item with regard to considering the lookahead and filling the tables.Then we have one for the lookahead being
id
:Now let's look at the
FIRST
andFOLLOW
sets:This is not a "dot final" item, so here want to shift, but only if the set
FIRST(X)
contains our lookahead symbol$
. This is false, so we do not fill the table entry.Next:
This is a "dot final" item, so it wants to reduce. To validate the context for a reduce, we look at
FOLLOW(S)
: can the grammar symbol we wish to reduce be followed by what is in the lookahead? Indeed, yes.$
is always inFOLLOW(S)
because the start symbol is by definition followed by the end of the input. So yes, we can reduce. And since we are reducing the symbolS
, the reduce is actually anaccept
action: the parse ends. We fill the table entry with anaccept
action.Similarly we can repeat for the next item set with lookahead
id
. Let's skip to the S-deriving-empty rule:Can
S
be followed byid
? Hardly. So this reduction is not appropriate. We don't fill the parser table entry.So you can see that an empty rule poses no problem. It immediately turns into a dot final
LR(0)
orLR(1)
item (depending on parser construction method), and is treated the same as any other dot final item with regard to considering the lookahead and filling the tables.