What are FIRST and FOLLOW sets? What are they used for in parsing?
Are they used for top-down or bottom-up parsers?
Can anyone explain to me the FIRST and FOLLOW sets for the following set of grammar rules:
E := E+T | T
T := T*V | T
V := <id>
They are typically used in LL (top-down) parsers to check whether the parser would ever encounter a situation in which there is more than one way to continue parsing.

If you have the alternative
A | B
and also have FIRST(A) = {"a"} and FIRST(B) = {"b", "a"}, then you have a FIRST/FIRST conflict, because when "a" comes next in the input you don't know whether to expand A or B (assuming a lookahead of 1).

On the other hand, if you have a nonterminal that is nullable, such as
AOpt: ("a")?
then you have to make sure that FOLLOW(AOpt) does not contain "a", because otherwise you wouldn't know whether to expand AOpt, as in:
S: AOpt "a"
Either S or AOpt could consume "a", which gives us a FIRST/FOLLOW conflict.
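As a rough illustration of the two checks just described, here is a minimal Python sketch; it is not from the original answer, and the FIRST/FOLLOW sets are written out by hand for the A | B and AOpt examples above rather than computed from a grammar.

# FIRST/FIRST and FIRST/FOLLOW conflict checks for a lookahead of 1.
# The sets below are hand-written for the examples above (hypothetical).
EPSILON = "epsilon"

FIRST = {
    "A": {"a"},
    "B": {"b", "a"},
    "AOpt": {"a", EPSILON},   # AOpt: ("a")?  is nullable
}
FOLLOW = {
    "AOpt": {"a"},            # from  S: AOpt "a"
}

def first_first_conflict(x, y):
    # Two alternatives conflict if their FIRST sets (minus epsilon) overlap.
    return (FIRST[x] - {EPSILON}) & (FIRST[y] - {EPSILON})

def first_follow_conflict(nullable_nt):
    # A nullable nonterminal conflicts if its FIRST and FOLLOW sets overlap.
    return (FIRST[nullable_nt] - {EPSILON}) & FOLLOW[nullable_nt]

print(first_first_conflict("A", "B"))    # {'a'}: can't decide between A and B
print(first_follow_conflict("AOpt"))     # {'a'}: can't decide whether to expand or skip AOpt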
FIRST sets can also be used during the parsing process for performance reasons. If you have a nullable nonterminal NullableNt, you can expand it to see whether it can consume anything, or it may be faster to check whether FIRST(NullableNt) contains the next token and, if not, simply skip it (backtracking vs. predictive parsing). Another performance improvement is to additionally provide the lexical scanner with the current FIRST set, so the scanner does not try all possible terminals but only those currently allowed by the context. This conflicts with reserved terminals, but those are not always needed.

Bottom-up parsers have different kinds of conflicts, namely Reduce/Reduce and Shift/Reduce. They also use item sets rather than FIRST and FOLLOW to detect conflicts.
Your grammar wouldn't work with LL parsers because it contains left recursion. But the FIRST sets for E, T and V would be {id} (assuming your
T := T*V | T
is meant to be
T := T*V | V
).
Answer:
E->E+T|T
has left recursion, so rewrite it as:
E->TE'
E'->+TE'|epsilon

T->T*V|V
has left recursion, so rewrite it as:
T->VT'
T'->*VT'|epsilon

There is no left recursion in
V->id

Therefore the grammar is:
E->TE'
E'->+TE'|epsilon
T->VT'
T'->*VT'|epsilon
V->id
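The rewriting used above is the standard elimination of immediate left recursion: A -> A alpha | beta becomes A -> beta A', A' -> alpha A' | epsilon. Below is a minimal Python sketch of that step; the encoding and function name are my own, not part of the original answer.

# Eliminate immediate left recursion for a single nonterminal.
# Productions are lists of symbols; "epsilon" marks the empty production.
def eliminate_immediate_left_recursion(nt, productions):
    recursive = [p[1:] for p in productions if p and p[0] == nt]   # A -> A alpha
    others = [p for p in productions if not p or p[0] != nt]       # A -> beta
    if not recursive:
        return {nt: productions}
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in others],                  # A  -> beta A'
        new_nt: [alpha + [new_nt] for alpha in recursive]          # A' -> alpha A'
                + [["epsilon"]],                                   #     | epsilon
    }

print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
print(eliminate_immediate_left_recursion("T", [["T", "*", "V"], ["V"]]))
# {'E': [['T', "E'"]], "E'": [['+', 'T', "E'"], ['epsilon']]}
# {'T': [['V', "T'"]], "T'": [['*', 'V', "T'"], ['epsilon']]}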
FIRST(E)={id}
FIRST(E')={+,epsilon}
FIRST(T)={id}
FIRST(T')={*,epsilon}
FIRST(V)={id}
E is the start symbol, so FOLLOW(E)={$}
E->TE', E'->+TE'|epsilon: FOLLOW(E')=FOLLOW(E)={$}
E->TE', E'->+TE'|epsilon: FOLLOW(T)=(FIRST(E')-{epsilon}) ∪ FOLLOW(E')={+,$}
T->VT', T'->*VT'|epsilon: FOLLOW(T')=FOLLOW(T)={+,$}
T->VT', T'->*VT'|epsilon: FOLLOW(V)=(FIRST(T')-{epsilon}) ∪ FOLLOW(T')={*,+,$}
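These sets can also be computed mechanically. The following is a minimal Python sketch (my own encoding, not part of the original answer) of the usual fixed-point computation of FIRST and FOLLOW; running it on the transformed grammar reproduces the sets derived by hand above.

# Fixed-point computation of FIRST and FOLLOW sets for the transformed grammar.
EPS = "epsilon"
END = "$"

grammar = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["V", "T'"]],
    "T'": [["*", "V", "T'"], [EPS]],
    "V":  [["id"]],
}
nonterminals = set(grammar)

def first_of_seq(seq, first):
    # FIRST of a symbol sequence; contains epsilon only if every symbol is nullable.
    result = set()
    for sym in seq:
        f = first[sym] if sym in nonterminals else {sym}
        result |= f - {EPS}
        if EPS not in f:
            return result
    result.add(EPS)
    return result

# FIRST sets: add symbols until nothing changes any more.
first = {nt: set() for nt in nonterminals}
changed = True
while changed:
    changed = False
    for nt, productions in grammar.items():
        for prod in productions:
            new = first_of_seq(prod, first) - first[nt]
            if new:
                first[nt] |= new
                changed = True

# FOLLOW sets: start with $ in FOLLOW(start symbol), then iterate to a fixed point.
follow = {nt: set() for nt in nonterminals}
follow["E"].add(END)
changed = True
while changed:
    changed = False
    for nt, productions in grammar.items():
        for prod in productions:
            for i, sym in enumerate(prod):
                if sym not in nonterminals:
                    continue
                f_rest = first_of_seq(prod[i + 1:], first)
                new = f_rest - {EPS}
                if EPS in f_rest:          # everything after sym can vanish
                    new |= follow[nt]
                new -= follow[sym]
                if new:
                    follow[sym] |= new
                    changed = True

for nt in sorted(nonterminals):
    print(nt, "FIRST:", first[nt], "FOLLOW:", follow[nt])
# e.g. FIRST(E) == {'id'}, FIRST(E') == {'+', 'epsilon'}, FOLLOW(V) == {'*', '+', '$'}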
Rules for FIRST sets
Rules for FOLLOW sets
Wikipedia is your friend. See discussion of LL parsers and first/follow sets.
Fundamentally they are used as the basis for parser construction, e.g., as part of parser generators. You can also use them to reason about properties of grammars, but most people don't have much of a need to do this.
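As a rough illustration of what parser construction from FIRST/FOLLOW sets looks like, here is a minimal hand-written recursive-descent (LL(1)) parser sketch for the left-recursion-free grammar from the previous answer. It is my own example, not taken from any of the answers or from a particular parser generator; the point is that each nullable nonterminal expands its non-empty production exactly when the lookahead token is in that production's FIRST set, and otherwise takes the epsilon branch.

# Predictive recursive-descent parser for:
#   E  -> T E'
#   E' -> + T E' | epsilon
#   T  -> V T'
#   T' -> * V T' | epsilon
#   V  -> id
def tokenize(text):
    # Very small tokenizer: '+' and '*' stay as-is, everything else becomes 'id'.
    tokens = [part if part in ("+", "*") else "id" for part in text.split()]
    return tokens + ["$"]          # end-of-input marker

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.lookahead() != expected:
            raise SyntaxError(f"expected {expected}, got {self.lookahead()}")
        self.pos += 1

    def parse_e(self):             # E -> T E'
        self.parse_t()
        self.parse_e_prime()

    def parse_e_prime(self):       # E' -> + T E' | epsilon
        if self.lookahead() == "+":    # lookahead in FIRST(+ T E') = {+}
            self.match("+")
            self.parse_t()
            self.parse_e_prime()
        # else epsilon: lookahead should be in FOLLOW(E') = {$}

    def parse_t(self):             # T -> V T'
        self.parse_v()
        self.parse_t_prime()

    def parse_t_prime(self):       # T' -> * V T' | epsilon
        if self.lookahead() == "*":    # lookahead in FIRST(* V T') = {*}
            self.match("*")
            self.parse_v()
            self.parse_t_prime()
        # else epsilon: lookahead should be in FOLLOW(T') = {+, $}

    def parse_v(self):             # V -> id
        self.match("id")

parser = Parser(tokenize("a + b * c"))
parser.parse_e()
parser.match("$")                  # accept only if all input has been consumed
print("parsed OK")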