Unintentional concatenation in Bison/Yacc grammar
I am experimenting with lex and yacc and have run into a strange issue, but I think it would be best to show you my code before detailing the issue. This is my lexer:
%{
#include <stdlib.h>
#include <string.h>
#include "y.tab.h"
void yyerror(char *);
%}
%%
[a-zA-Z]+   {
                yylval.strV = yytext;
                return ID;
            }
[0-9]+      {
                yylval.intV = atoi(yytext);
                return INTEGER;
            }
[\n]        { return *yytext; }
[ \t]       ;
.           yyerror("invalid character");
%%
int yywrap(void) {
    return 1;
}
This is my parser:
%{
#include <stdio.h>
int yydebug=1;
void prompt();
void yyerror(char *);
int yylex(void);
%}
%union {
    int intV;
    char *strV;
}
%token INTEGER ID
%%
program: program statement EOF { prompt(); }
       | program EOF { prompt(); }
       | { prompt(); }
       ;
args: /* empty */
    | args ID { printf(":%s ", $<strV>2); }
    ;
statement: ID args { printf("%s", $<strV>1); }
         | INTEGER { printf("%d", $<intV>1); }
         ;
EOF: '\n'
%%
void yyerror(char *s) {
    fprintf(stderr, "%s\n", s);
}
void prompt() {
    printf("> ");
}
int main(void) {
    yyparse();
    return 0;
}
It is a very simple language, consisting of nothing more than strings and integers, plus a basic REPL. You'll note in the parser that args are output with a leading colon. The intention is that, combined with the first alternative of the statement rule, an interaction with the REPL would look something like this:
> aaa aa a
:aa :a aaa>
However, the interaction is this:
> aaa aa a
:aa :a aaa aa aa
>
Why does the token ID in the following rule
statement: ID args { printf("%s", $<strV>1); }
         | INTEGER { printf("%d", $<intV>1); }
         ;
have the semantic value of the total input string, newline included? How can my grammar be reworked so that I get the interaction I intended?
You have to preserve token strings as they are read if you want them to remain valid. I modified the statement rule to read:
Then, with your input, I get the output:
Note that at the time the initial ID is read, the token is exactly what you expected. But because you did not preserve the token, the string has been modified by the time you get back to printing it after the args have been parsed.
I think there is an associativity conflict between the args and statement productions. This is borne out by the (partial) output from the bison -v parser.output file:
Indeed, I'm having a hard time trying to figure out what your grammar is trying to accept. As a side note, I'd probably move your EOF production into the lexer as an EOL token; this will make resynchronizing on parse errors easier.
A better explanation of your intent would be helpful.