对于以分号分隔的列表,如何将产生式转换为 LL(1)?

发布于 2024-08-29 19:36:23 字数 1043 浏览 3 评论 0原文

我正在阅读这本关于解析的入门书(顺便说一句,这相当不错),其中一项练习是“为您最喜欢的语言构建一个解析器”。因为我今天不想死,所以我想我可以为一些相对简单的东西做一个解析器,即简化的 CSS。

注意:本书教您如何使用递归下降算法编写 LL(1) 解析器。

因此,作为子练习,我根据我对 CSS 的了解构建语法。但我陷入了无法在 LL(1) 中转换的产生式:

//EBNF
block = "{", declaration, {";", declaration}, [";"], "}"

//BNF
<block> =:: "{" <declaration> "}"
<declaration> =:: <single-declaration> <opt-end> | <single-declaration> ";" <declaration>
<opt-end> =:: "" | ";"

这描述了一个 CSS 块。有效块可以具有以下形式:

{ property : value }
{ property : value; }
{ property : value; property : value }
{ property : value; property : value; }
...

问题在于可选的“;”最后,因为它与{“;”,声明}的起始字符重叠,所以当我的解析器在这种情况下遇到分号时,它不知道该怎么做。

这本书讨论了这个问题,但在其示例中,分号是强制性的,因此可以像这样修改规则:

block = "{", declaration, ";", {declaration, ";"}, "}"

那么,是否有可能使用 LL(1) 解析器来实现我想要做的事情?

I'm reading this introductory book on parsing (which is pretty good btw) and one of the exercice is to "build a parser for your favorite language." Since I don't want to die today, I thought I could do a parser for something relatively simple, ie a simplified CSS.

Note: This book teach you how to write a LL(1) parser using the recursive-descent algorithm.

So, as a sub-exercice, I am building the grammar from what I know of CSS. But I'm stuck on a production that I can't transform in LL(1) :

//EBNF
block = "{", declaration, {";", declaration}, [";"], "}"

//BNF
<block> =:: "{" <declaration> "}"
<declaration> =:: <single-declaration> <opt-end> | <single-declaration> ";" <declaration>
<opt-end> =:: "" | ";"

This describe a CSS block. Valid block can have the form :

{ property : value }
{ property : value; }
{ property : value; property : value }
{ property : value; property : value; }
...

The problem is with the optional ";" at the end, because it overlap with the starting character of {";", declaration}, so when my parser meet a semicolon in this context, it doesn't know what to do.

The book talk about this problem, but in its example, the semicolon is obligatory, so the rule can be modified like this :

block = "{", declaration, ";", {declaration, ";"}, "}"

So, Is it possible to achieve what I'm trying to do using a LL(1) parser?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

终止放荡 2024-09-05 19:36:23

我想我明白了:

//EBNF 
block = "{", decl, "}"
decl = simple-decl, [";", [decl]]
simple-decl = ...

//BNF
<block> =:: "{" <decl> "}"
<decl> =:: <simple-decl> <decl-end>
<decl-end> =:: ";" <decl-tail> | ""
<decl-tail> =:: <decl> | e

这会产生以下代码:

private function parseBlock():void {
    accept(Token.LBRACE);
    parseDecl();
    accept(Token.RBRACE);
}


//Token.IDENTIFIER is the starting token of a declaration
private function parseDecl():void {
    accept(Token.IDENTIFIER);
    if(_currentToken.kind == Token.SEMICOLON){
        accept(Token.SEMICOLON);
        if(_currentToken.kind == Token.IDENTIFIER){
            parseDecl();
        }
    }
}

我是对的吗?

I think I got it:

//EBNF 
block = "{", decl, "}"
decl = simple-decl, [";", [decl]]
simple-decl = ...

//BNF
<block> =:: "{" <decl> "}"
<decl> =:: <simple-decl> <decl-end>
<decl-end> =:: ";" <decl-tail> | ""
<decl-tail> =:: <decl> | e

This produce the following code :

private function parseBlock():void {
    accept(Token.LBRACE);
    parseDecl();
    accept(Token.RBRACE);
}


//Token.IDENTIFIER is the starting token of a declaration
private function parseDecl():void {
    accept(Token.IDENTIFIER);
    if(_currentToken.kind == Token.SEMICOLON){
        accept(Token.SEMICOLON);
        if(_currentToken.kind == Token.IDENTIFIER){
            parseDecl();
        }
    }
}

I'm I right?

柠栀 2024-09-05 19:36:23

如果允许 null 声明,则可以消除歧义:

//EBNF
block = "{", declaration, {";", declaration}, "}"
declaration = "" | ...

//BNF
<block> =:: "{" <declaration-list> "}"
<declaration-list> =:: <declaration> | <declaration> ";" <declaration-list>
<declaration> =:: "" | ...

尽管这不是必需的:

// these don't allow empty declarations
block = "{", {declaration, ";"}, end-declaration, "}"
end-declaration = "" | declaration


<block> =:: "{" <declaration-list> "}"
<declaration-list> =:: "" | <declaration> | <declaration> ";" <declaration-list>

要处理 null,请找出 null 之后可能出现的终端,并让解析器识别这些终端。对于递归下降(在 not-quite-java 中):

/*
<block> =:: "{" <declaration-list> "}"
<declaration-list> =:: <declaration> | <declaration> ";" <declaration-list>
<declaration> =:: "" | ...
*/
Node block(tokens) {
    terminal(BLOCKBEGIN, tokens);
    Node decls = declList(tokens);
    terminal(BLOCKEND, tokens);
}
Node declList(tokens) {
    Node dcl = decl(tokens);
    if (expect(LINESEP, tokens)) {
        terminal(LINESEP, tokens);
        return new DeclarationList(dcl, declList(tokens));
    } else {
        return new DeclarationList(dcl);
    }
}
Node decl(tokens) {
    if (expect(BLOCKEND, tokens)) {
        return new Declaration();
    }
    ...
}
/*
<block> =:: "{" <declaration-list> "}"
<declaration-list> =:: "" | <declaration> | <declaration> ";" <declaration-list>
*/
Node block(tokens) {
    terminal(BLOCKBEGIN, tokens);
    Node decls = declList(tokens);
    terminal(BLOCKEND, tokens);
}
Node declList(tokens) {
    if (expect(BLOCKEND, tokens)) {
        return new DeclarationList();
    }
    Node dcl = decl(tokens);
    if (expect(LINESEP, tokens)) {
        terminal(LINESEP, tokens);
        return new DeclarationList(dcl, declList(tokens));
    } else {
        return new DeclarationList(dcl);
    } 
}

对于自上而下的解析器,该过程更加明确。构造 FOLLOWS 关系时,您可以递归地将 null 替换为生成 null 的非终结符后面的任何内容。

A → B C
B → b
C → D E
D → d | ""
E → e

FOLLOWS(B) ← FIRST(C) = FIRST(D) = {d, ""}
          += - {""} + FOLLOWS(D) = - {""} + FIRST(E) = - {""} + {e}
FOLLOWS(B) = {d, e}

然后您可以正常填写解析表。

If you allow null declarations, you can remove the ambiguity:

//EBNF
block = "{", declaration, {";", declaration}, "}"
declaration = "" | ...

//BNF
<block> =:: "{" <declaration-list> "}"
<declaration-list> =:: <declaration> | <declaration> ";" <declaration-list>
<declaration> =:: "" | ...

though this isn't a requirement:

// these don't allow empty declarations
block = "{", {declaration, ";"}, end-declaration, "}"
end-declaration = "" | declaration


<block> =:: "{" <declaration-list> "}"
<declaration-list> =:: "" | <declaration> | <declaration> ";" <declaration-list>

To handle nulls, figure out what terminals may come after the null, and have your parser recognize those. For recursive descent (in not-quite-java):

/*
<block> =:: "{" <declaration-list> "}"
<declaration-list> =:: <declaration> | <declaration> ";" <declaration-list>
<declaration> =:: "" | ...
*/
Node block(tokens) {
    terminal(BLOCKBEGIN, tokens);
    Node decls = declList(tokens);
    terminal(BLOCKEND, tokens);
}
Node declList(tokens) {
    Node dcl = decl(tokens);
    if (expect(LINESEP, tokens)) {
        terminal(LINESEP, tokens);
        return new DeclarationList(dcl, declList(tokens));
    } else {
        return new DeclarationList(dcl);
    }
}
Node decl(tokens) {
    if (expect(BLOCKEND, tokens)) {
        return new Declaration();
    }
    ...
}
/*
<block> =:: "{" <declaration-list> "}"
<declaration-list> =:: "" | <declaration> | <declaration> ";" <declaration-list>
*/
Node block(tokens) {
    terminal(BLOCKBEGIN, tokens);
    Node decls = declList(tokens);
    terminal(BLOCKEND, tokens);
}
Node declList(tokens) {
    if (expect(BLOCKEND, tokens)) {
        return new DeclarationList();
    }
    Node dcl = decl(tokens);
    if (expect(LINESEP, tokens)) {
        terminal(LINESEP, tokens);
        return new DeclarationList(dcl, declList(tokens));
    } else {
        return new DeclarationList(dcl);
    } 
}

For a top-down parser, the process is a little more explicit. When constructing the FOLLOWS relation, you recursively replace nulls with whatever follows the non-terminal that generated the null.

A → B C
B → b
C → D E
D → d | ""
E → e

FOLLOWS(B) ← FIRST(C) = FIRST(D) = {d, ""}
          += - {""} + FOLLOWS(D) = - {""} + FIRST(E) = - {""} + {e}
FOLLOWS(B) = {d, e}

You then fill out the parsing table as normal.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文