解析中 FIRST 和 FOLLOW 集的用途是什么?

发布于 2024-09-25 11:36:12 字数 202 浏览 6 评论 0原文

什么是 FIRST 和 FOLLOW 集?它们在解析中有何用途? 它们用于自上而下还是自下而上的解析器?

任何人都可以向我解释以下一组语法规则的第一组和后续组:

E := E+T | T
T := T*V | T
V := <id>

What are FIRST and FOLLOW sets? What are they used for in parsing?
Are they used for top-down or bottom-up parsers?

Can anyone explain me FIRST and FOLLOW SETS for the following set of grammar rules:

E := E+T | T
T := T*V | T
V := <id>

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

罗罗贝儿 2024-10-02 11:36:12

它们通常用在 LL(自上而下)解析器中,以检查正在运行的解析器是否会遇到有不止一种方法可以继续解析的情况。

如果您有替代方案A | B 并且还有 FIRST(A) = {"a"}FIRST(B) = {"b", "a"} 那么你会存在 FIRST/FIRST 冲突,因为当输入中的下一个出现“a”时,您不知道是扩展 A 还是 B。(假设先行为 1)。

另一方面,如果您有一个可以为空的非终端,例如 AOpt: ("a")? 那么您必须确保 FOLLOW(AOpt) 不包含“a”,否则你不知道是否要扩展 AOpt,如下所示: S: AOpt "a" S 或 AOpt 都可能消耗“a”,这会给我们带来 FIRST/FOLLOW 冲突。

出于性能原因,也可以在解析过程中使用 FIRST 集。如果您有一个可为空的非终结符 NullableNt,您可以扩展它以查看它是否可以消耗任何内容,或者检查 FIRST(NullableNt) 是否包含下一个标记可能会更快,如果不包含,则忽略它(回溯与预测解析)。另一个性能改进是额外为词法扫描器提供当前的 FIRST 集,因此扫描器不会尝试所有可能的终端,而只会尝试上下文当前允许的终端。这与保留的终端冲突,但并不总是需要这些终端。

自下而上的解析器有不同类型的冲突,即Reduce/Reduce 和Shift/Reduce。他们还使用项目集来检测冲突,而不是先,后。

您的语法不适用于 LL 解析器,因为它包含左递归。但是 E、T 和 V 的第一个集合将是 {id} (假设您的 T := T*V | T 应该是 T := T*V | V代码>)。

They are typically used in LL (top-down) parsers to check if the running parser would encounter any situation where there is more than one way to continue parsing.

If you have the alternative A | B and also have FIRST(A) = {"a"} and FIRST(B) = {"b", "a"} then you would have a FIRST/FIRST conflict because when "a" comes next in the input you wouldn't know whether to expand A or B. (Assuming lookahead is 1).

On the other side if you have a Nonterminal that is nullable like AOpt: ("a")? then you have to make sure that FOLLOW(AOpt) doesn't contain "a" because otherwise you wouldn't know if to expand AOpt or not like here: S: AOpt "a" Either S or AOpt could consume "a" which gives us a FIRST/FOLLOW conflict.

FIRST sets can also be used during the parsing process for performance reasons. If you have a nullable nonterminal NullableNt you can expand it in order to see if it can consume anything, or it may be faster to check if FIRST(NullableNt) contains the next token and if not simply ignore it (backtracking vs predictive parsing). Another performance improvement would be to additionally provide the lexical scanner with the current FIRST set, so the scanner does not try all possible terminals but only those that are currently allowed by the context. This conflicts with reserved terminals but those are not always needed.

Bottom up parsers have different kinds of conflicts namely Reduce/Reduce and Shift/Reduce. They also use item sets to detect conflicts and not FIRST,FOLLOW.

Your grammar would't work with LL-parsers because it contains left recursion. But the FIRST sets for E, T and V would be {id} (assuming your T := T*V | T is meant to be T := T*V | V).

迷你仙 2024-10-02 11:36:12

答案:

E->E+T|T

左递归

E->TE'

E'->+TE'|eipsilon

T->T*V|T

左递归

T->VT'

T'-> ;*VT'|epsilon

中没有左递归

V->(id)

因此语法为:

E->TE'

E'->+TE'|epsilon

T->VT'

T'->* VT'|εV-

> (id)

FIRST(E)={(}

FIRST(E')={+,epsilon}

FIRST(T)={(}

FIRST(T')={*,epsilon}

FIRST(V)={(}

开始符号=FOLLOW(E)={$}

E->TE',E'->TE'|epsilon:FOLLOW(E')=FOLLOW(E)={$}

E->TE',E' ->+TE'|epsilon:FOLLOW(T)=FIRST(E')={+,$}

T->VT',T'->*VT'|epsilon:FOLLOW(T')=FOLLOW (T)={+,$}

T->VT',T->*VT'|epsilon:FOLLOW(V)=FIRST(T)={ *,epsilon}

第一组规则

If X is a terminal then First(X) is just X!
If there is a Production X → ε then add ε to first(X)
If there is a Production X → Y1Y2..Yk then add first(Y1Y2..Yk) to first(X)
First(Y1Y2..Yk) is either
    First(Y1) (if First(Y1) doesn't contain ε)
    OR (if First(Y1) does contain ε) then First (Y1Y2..Yk) is everything in First(Y1) except for ε  as well as everything in First(Y2..Yk)
    If First(Y1) First(Y2)..First(Yk) all contain ε then add ε to First(Y1Y2..Yk) as well.

后续组规则

First put $ (the end of input marker) in Follow(S) (S is the start symbol)
If there is a production A → aBb, (where a can be a whole string) then everything in FIRST(b) except for ε is placed in FOLLOW(B).
If there is a production A → aB, then everything in FOLLOW(A) is in FOLLOW(B)
If there is a production A → aBb, where FIRST(b) contains ε, then everything in FOLLOW(A) is in FOLLOW(B)

Answer :

E->E+T|T

left recursion

E->TE'

E'->+TE'|eipsilon

T->T*V|T

left recursion

T->VT'

T'->*VT'|epsilon

no left recursion in

V->(id)

Therefore the grammar is:

E->TE'

E'->+TE'|epsilon

T->VT'

T'->*VT'|epsilon

V-> (id)

FIRST(E)={(}

FIRST(E')={+,epsilon}

FIRST(T)={(}

FIRST(T')={*,epsilon}

FIRST(V)={(}

Starting Symbol=FOLLOW(E)={$}

E->TE',E'->TE'|epsilon:FOLLOW(E')=FOLLOW(E)={$}

E->TE',E'->+TE'|epsilon:FOLLOW(T)=FIRST(E')={+,$}

T->VT',T'->*VT'|epsilon:FOLLOW(T')=FOLLOW(T)={+,$}

T->VT',T->*VT'|epsilon:FOLLOW(V)=FIRST(T)={ *,epsilon}

Rules for First Sets

If X is a terminal then First(X) is just X!
If there is a Production X → ε then add ε to first(X)
If there is a Production X → Y1Y2..Yk then add first(Y1Y2..Yk) to first(X)
First(Y1Y2..Yk) is either
    First(Y1) (if First(Y1) doesn't contain ε)
    OR (if First(Y1) does contain ε) then First (Y1Y2..Yk) is everything in First(Y1) except for ε  as well as everything in First(Y2..Yk)
    If First(Y1) First(Y2)..First(Yk) all contain ε then add ε to First(Y1Y2..Yk) as well.

Rules for Follow Sets

First put $ (the end of input marker) in Follow(S) (S is the start symbol)
If there is a production A → aBb, (where a can be a whole string) then everything in FIRST(b) except for ε is placed in FOLLOW(B).
If there is a production A → aB, then everything in FOLLOW(A) is in FOLLOW(B)
If there is a production A → aBb, where FIRST(b) contains ε, then everything in FOLLOW(A) is in FOLLOW(B)
温暖的光 2024-10-02 11:36:12

维基百科是你的朋友。请参阅LL 解析器的讨论和第一个/后续集。

从根本上来说,它们被用作解析器构建的基础,例如,作为解析器生成器的一部分。您还可以使用它们来推理语法的属性,但大多数人没有太多必要这样做。

Wikipedia is your friend. See discussion of LL parsers and first/follow sets.

Fundamentally they are used as the basic for parser construction, e.g., as part of parser generators. You can also use them to reason about properties of grammars, but most people don't have much of a need to do this.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文