
Bison: How to ignore a token if it doesn't fit into a rule

Posted on 2024-11-25 18:20:26

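I'm writing a program that handles comments as well as a few other things. If a comment is in a specific place, then my program does something.

Flex passes a token when it finds a comment, and Bison then checks whether that token fits into a particular rule. If it does, Bison takes the action associated with that rule.

Here's the thing: the input I'm receiving might actually have comments in the wrong places. In that case, I just want to ignore the comment rather than flag an error.

My question:
How can I use a token if it fits into a rule, but ignore it if it doesn't? Can I make a token "optional"?

(Note: the only way I can think of right now is to scatter the comment token in every possible place in every possible rule. There must be a better solution than this. Maybe some rule involving the root?)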


征﹌骨岁月お 2024-12-02 18:20:26

One solution may be to use bison's error recovery (see the Bison manual).

To summarize, bison defines the terminal token error to represent an error (say, a comment token returned in the wrong place). That way, you can (for example) close parentheses or braces after the wayward comment is found. However, this method will probably discard a certain amount of parsing, because I don't think bison can "undo" reductions. ("Flagging" the error, as with printing a message to stderr, is not related to this: you can have an error without printing an error--it depends on how you define yyerror.)
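The remark about yyerror can be made concrete with a small C sketch. It assumes you want errors counted but not necessarily printed; the names syntax_error_count and report_syntax_errors are hypothetical and are not part of Bison.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical bookkeeping, not provided by Bison. */
int syntax_error_count = 0;    /* number of syntax errors reported so far */
int report_syntax_errors = 0;  /* set to 1 to also print each message */

/* Bison calls yyerror() when it detects a syntax error. Staying quiet
   here means a recovered error leaves no trace on stderr. */
void yyerror(const char *msg) {
    assert(msg != NULL);
    syntax_error_count++;
    if (report_syntax_errors)
        fprintf(stderr, "error: %s\n", msg);
}
```

With report_syntax_errors left at 0, a misplaced comment that is recovered through the error token would only bump the counter.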

You may instead want to wrap each terminal in a special nonterminal:

term_wrap: comment TERM

This effectively does what you're scared to do (put in a comment in every single rule), but it does it in fewer places.

To force myself to eat my own dog food, I made up a silly language for myself. The only syntax is print <number> please, but if there's (at least) one comment (##) between the number and the please, it prints the number in hexadecimal, instead.

Like this:

print 1 please
1
## print 2 please
2
print ## 3 please
3
print 4 ## please
0x4
print 5 ## ## please
0x5
print 6 please ##
6

My lexer:

%{
#include <stdio.h>
#include <stdlib.h>
#include "y.tab.h"
%}

%%

print           return PRINT;
[[:digit:]]+    { yylval = atoi(yytext); return NUMBER; }
please          return PLEASE;
##              return COMMENT;

[[:space:]]+    /* ignore */
.               /* ditto */

and the parser:

%debug
%error-verbose
%verbose
%locations

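%{
#include <stdio.h>
#include <string.h>

/* Declared here because main() below uses them before Bison defines them. */
int yylex(void);
int yyparse(void);

void yyerror(const char *str) {
        fprintf(stderr, "error: %s\n", str);
}

int yywrap(void) {
        return 1;
}

extern int yydebug;
int main(void) {
    yydebug = 0;
    return yyparse();
}
%}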

%token PRINT NUMBER COMMENT PLEASE

%%

commands: /* empty */
        |
        commands command
    ;

command: print number comment please {
        if ($3) {
            printf("%#x", $2);
        } else {
            printf("%d", $2);
        }
        printf("\n");
     }
     ;

print: comment PRINT
     ;

number: comment NUMBER {
        $$ = $2;
      }
      ;

please: comment PLEASE
      ;

comment: /* empty */ {
            $$ = 0;
       }
       |
        comment COMMENT {
            $$ = 1;
        }
    ;

So, as you can see, not exactly rocket science, but it does the trick. There's a shift/reduce conflict in there, because of the empty string matching comment in multiple places. Also, there's no rule to fit comments in between the final please and EOF. But overall, I think it's a good example.
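For comparison, here is a rough hand-written C sketch of the same wrapping idea, without Flex or Bison. It is only an illustration under assumptions: the names struct lexer, next_token, expect_wrapped and parse_command are made up for this sketch, and it handles one command at a time from a string instead of reading stdin.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Token kinds mirroring the grammar's terminals; T_END marks end of input. */
enum token { T_PRINT, T_NUMBER, T_PLEASE, T_COMMENT, T_END };

struct lexer {
    const char *p;  /* cursor into the input string */
    int value;      /* value of the most recent T_NUMBER */
};

/* Return the next token, skipping whitespace and any unknown character,
   like the two catch-all rules in the Flex file. */
enum token next_token(struct lexer *lx) {
    for (;;) {
        while (isspace((unsigned char)*lx->p)) lx->p++;
        if (*lx->p == '\0') return T_END;
        if (strncmp(lx->p, "print", 5) == 0)  { lx->p += 5; return T_PRINT; }
        if (strncmp(lx->p, "please", 6) == 0) { lx->p += 6; return T_PLEASE; }
        if (strncmp(lx->p, "##", 2) == 0)     { lx->p += 2; return T_COMMENT; }
        if (isdigit((unsigned char)*lx->p)) {
            char *end;
            lx->value = (int)strtol(lx->p, &end, 10);
            lx->p = end;
            return T_NUMBER;
        }
        lx->p++;  /* ignore anything else */
    }
}

/* Counterpart of `term_wrap: comment TERM`: skip a run of comments, then
   require `want`. The number of skipped comments goes to *comments. */
int expect_wrapped(struct lexer *lx, enum token want, int *comments) {
    enum token t;
    int n = 0;
    while ((t = next_token(lx)) == T_COMMENT) n++;
    if (comments) *comments = n;
    return t == want;
}

/* Parse one `print <number> please` command and format the number into out.
   Returns 1 on success, 0 on a syntax error or at end of input. */
int parse_command(struct lexer *lx, char *out, size_t outsz) {
    int before_please = 0;
    int number;
    assert(out != NULL && outsz > 0);
    if (!expect_wrapped(lx, T_PRINT, NULL)) return 0;
    if (!expect_wrapped(lx, T_NUMBER, NULL)) return 0;
    number = lx->value;
    if (!expect_wrapped(lx, T_PLEASE, &before_please)) return 0;
    if (before_please)
        snprintf(out, outsz, "%#x", (unsigned)number);  /* comment seen: hex */
    else
        snprintf(out, outsz, "%d", number);
    return 1;
}
```

Here the comment handling lives in one helper instead of in every rule, which is the point of the wrapper nonterminals. Unlike the grammar, the helper always consumes comments greedily, so there is nothing that corresponds to the shift/reduce conflict. A trailing comment after the final please still makes the next parse_command call fail.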
