使用 Flex 编写可重入词法分析器

发布于 2024-08-29 00:03:24 字数 1226 浏览 7 评论 0原文

我是弯曲的新手。我正在尝试使用 Flex 编写一个简单的可重入词法分析器/扫描器。词法分析器定义如下。我遇到了如下所示的编译错误(yyg 问题):

reentrant.l:

/* Definitions */

digit           [0-9]
letter          [a-zA-Z]
alphanum        [a-zA-Z0-9]
identifier      [a-zA-Z_][a-zA-Z0-9_]+
integer         [0-9]+
natural         [0-9]*[1-9][0-9]*
decimal         ([0-9]+\.|\.[0-9]+|[0-9]+\.[0-9]+)

%{
    #include <stdio.h>

    #define ECHO fwrite(yytext, yyleng, 1, yyout)

    int totalNums = 0;
%}

%option reentrant
%option prefix="simpleit_"

%%

^(.*)\r?\n     printf("%d\t%s", yylineno++, yytext);

%%
/* Routines */

int yywrap(yyscan_t yyscanner)
{
    return 1;
}

int main(int argc, char* argv[])
{
    yyscan_t yyscanner;

    if(argc < 2) {
        printf("Usage: %s fileName\n", argv[0]);
        return -1;
    }

    yyin = fopen(argv[1], "rb");

    yylex(yyscanner);

    return 0;
}

编译错误:

vietlq@mylappie:~/Desktop/parsers/reentrant$ gcc lex.simpleit_.c 
reentrant.l: In function ‘main’:
reentrant.l:44: error: ‘yyg’ undeclared (first use in this function)
reentrant.l:44: error: (Each undeclared identifier is reported only once
reentrant.l:44: error: for each function it appears in.)

I'm newbie to flex. I'm trying to write a simple re-entrant lexer/scanner with flex. The lexer definition goes below. I get stuck with compilation errors as shown below (yyg issue):

reentrant.l:

/* Definitions */

digit           [0-9]
letter          [a-zA-Z]
alphanum        [a-zA-Z0-9]
identifier      [a-zA-Z_][a-zA-Z0-9_]+
integer         [0-9]+
natural         [0-9]*[1-9][0-9]*
decimal         ([0-9]+\.|\.[0-9]+|[0-9]+\.[0-9]+)

%{
    #include <stdio.h>

    #define ECHO fwrite(yytext, yyleng, 1, yyout)

    int totalNums = 0;
%}

%option reentrant
%option prefix="simpleit_"

%%

^(.*)\r?\n     printf("%d\t%s", yylineno++, yytext);

%%
/* Routines */

int yywrap(yyscan_t yyscanner)
{
    return 1;
}

int main(int argc, char* argv[])
{
    yyscan_t yyscanner;

    if(argc < 2) {
        printf("Usage: %s fileName\n", argv[0]);
        return -1;
    }

    yyin = fopen(argv[1], "rb");

    yylex(yyscanner);

    return 0;
}

Compilation errors:

vietlq@mylappie:~/Desktop/parsers/reentrant$ gcc lex.simpleit_.c 
reentrant.l: In function ‘main’:
reentrant.l:44: error: ‘yyg’ undeclared (first use in this function)
reentrant.l:44: error: (Each undeclared identifier is reported only once
reentrant.l:44: error: for each function it appears in.)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

忘你却要生生世世 2024-09-05 00:03:24

对于可重入词法分析器,所有通信都必须包含状态,它包含在扫描器中。

在程序中的任何位置(例如main内),您都可以通过将扫描器传递到的特殊函数来访问状态变量。 例如,在您原来的reentrant.l中,您可以这样做:

yyscan_t scanner;
yylex_init(&scanner);
yyset_in(fopen(argv[1], "rb"), scanner);
yylex(scanner);
yylex_destroy(scanner);

我已重命名scanner以避免与yyscanner混淆> 在行动中。与一般的 C 代码相比,您的所有操作都发生在一个名为 yylex 的巨大函数中,该函数通过名称 yyscanner 传递给您的扫描器。因此,yyscanner 可用于您的所有操作。此外,yylex 有一个名为 yyg 的局部变量,它保存整个状态,并且大多数宏方便地引用 yyg

虽然您确实可以通过定义 yyg 来使用 main 中的 yyin 宏,就像您在自己的答案中所做的那样,但不建议这样做。对于可重入词法分析器,宏仅用于操作。

要了解这是如何实现的,您可以随时查看生成的代码:


/* For convenience, these vars
   are macros in the reentrant scanner. */
#define yyin yyg->yyin_r
...

/* Holds the entire state of the reentrant scanner. */
struct yyguts_t
...

#define YY_DECL int yylex (yyscan_t yyscanner)

/** The main scanner function which does all the work.
 */
YY_DECL
{
    struct yyguts_t * yyg = (struct yyguts_t*)yyscanner;
...
}

flex 文档中还有更多关于可重入 选项的信息,其中包括一个干净的编译示例。 (Google“flex 可重入”,并查找 flex.sourceforge 链接。)与 bison 不同,flex 具有一个相当简单的可重入模型。我强烈建议将可重入 flexLemon 解析器,而不是使用 yacc/bison

For a reentrant lexer, all communication must include the state, which is contained within the scanner.

Anywhere in your program (e.g. inside main) you can access the state variables via special functions to which you will pass your scanner. E.g., in your original reentrant.l, you can do this:

yyscan_t scanner;
yylex_init(&scanner);
yyset_in(fopen(argv[1], "rb"), scanner);
yylex(scanner);
yylex_destroy(scanner);

I have renamed scanner to avoid confusion with yyscanner in the actions. In contrast with general C code, all your actions occur within a giant function called yylex, which is passed your scanner by the name yyscanner. Thus, yyscanner is available to all your actions. In addition, yylex has a local variable called yyg that holds the entire state, and most macros conveniently refer to yyg.

While it is true that you can use the yyin macro inside main by defining yyg as you did in your own Answer, that is not recommended. For a reentrant lexer, the macros are meant for actions only.

To see how this is implemented, you can always view the generated code:


/* For convenience, these vars
   are macros in the reentrant scanner. */
#define yyin yyg->yyin_r
...

/* Holds the entire state of the reentrant scanner. */
struct yyguts_t
...

#define YY_DECL int yylex (yyscan_t yyscanner)

/** The main scanner function which does all the work.
 */
YY_DECL
{
    struct yyguts_t * yyg = (struct yyguts_t*)yyscanner;
...
}

There is lots more on the reentrant option in the flex docs, which include a cleanly compiling example. (Google "flex reentrant", and look for the flex.sourceforge link.) Unlike bison, flex has a fairly straight-forward model for reentrancy. I strongly suggest using reentrant flex with Lemon Parser, rather than with yacc/bison.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文