How does yacc work? Can you explain it section by section?
How does this tiny yacc program work?
What I know so far: %{ ... %} is the definitions section, %% ... %% is the rules section (but how do I interpret a rule?), and the stuff after the second %% is function definitions.
What is the section containing %token INTEGER, between %} and %%?
%{
#include <stdio.h>
#include <stdlib.h>
int yylex(void);
void yyerror(char *);
%}
%token INTEGER
%left '+' '-'
%left '*' '/'
%%
program:
      program expr '\n'   { printf("%d\n", $2); }
    |
    ;
expr:
      INTEGER             { $$ = $1; }
    | expr '*' expr       { $$ = $1 * $3; }
    | expr '/' expr       { $$ = $1 / $3; }
    | expr '+' expr       { $$ = $1 + $3; }
    | expr '-' expr       { $$ = $1 - $3; }
    ;
%%
void yyerror(char *s) {
    printf("%s\n", s);
}
int main(void) {
    yyparse();
    return 0;
}
UPDATE
What I don't understand:
program:
      program expr '\n'   { printf("%d\n", $2); }
    |
    ;
Comments (2)
The line
expr:
means an expr is one of the following alternatives, which are separated by |. If the input can be seen as an INTEGER token, the parser takes the first one. If it can be seen as an expr followed by the * character followed by an expr, it takes the second alternative, and so on. $$ is the value of the rule as a whole (its return value), $1 is the value of the first symbol on the right-hand side, $2 the second, and so on.
So if it were parsing 5 + 6, each INTEGER is first turned into an expr by the first alternative. The parser then sees expr '+' expr, takes the fourth alternative, and produces 11 as the value of that expr. In the program rule, the program parsed so far (initially empty), followed by that expr and a newline, matches program expr '\n', so the parser runs the C code, which prints 11 to the screen.
%left means the operator is left-associative, as in a + b + c = (a + b) + c. Operators on the same line have the same precedence, and operators declared on later lines have higher precedence, so * and / bind tighter than + and -.
I admittedly haven't used yacc in a while, so feel free to tell me I'm totally wrong.
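To make the effect of the two %left lines concrete, here is a minimal sketch in plain C. It is not what yacc generates (yacc emits a table-driven LALR parser); it is a hand-written precedence-climbing evaluator, a different technique that should give the same results for this grammar. The names prec, parse_int, parse_expr and eval are made up for the illustration.

```c
#include <stdlib.h>

/* Binding power of a binary operator; 0 means "not an operator".
   Mirrors the grammar: '*' and '/' are declared on the later %left
   line, so they bind tighter than '+' and '-'. */
static int prec(char op) {
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 0;
    }
}

/* Read one integer literal at *s and advance past it. */
static int parse_int(const char **s) {
    char *end;
    long v = strtol(*s, &end, 10);   /* strtol skips leading spaces */
    *s = end;
    return (int)v;
}

/* Evaluate a run of operators whose precedence is at least min_prec. */
static int parse_expr(const char **s, int min_prec) {
    int lhs = parse_int(s);
    for (;;) {
        while (**s == ' ') (*s)++;
        char op = **s;
        int p = prec(op);
        if (p == 0 || p < min_prec) return lhs;
        (*s)++;
        /* Asking for p + 1 on the right is what makes the operator
           left-associative: a - b - c groups as (a - b) - c. */
        int rhs = parse_expr(s, p + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs /= rhs; break;  /* like the grammar, no zero check */
        }
    }
}

/* Hypothetical entry point: evaluate one expression held in a string. */
int eval(const char *s) { return parse_expr(&s, 1); }
```

With this sketch, eval("2 + 3 * 4") groups as 2 + (3 * 4) and eval("10 - 4 - 3") groups as (10 - 4) - 3, which is the grouping the %left declarations ask yacc to use.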
UPDATE:
yacc generates C code, so you can put your own code directly into it. As it parses, whenever it recognizes program expr '\n', it runs the code in the { }, which yacc has copied into the generated parser.
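As a rough picture of what "copied into the generated code" means, the sketch below imitates the kind of switch a generated parser runs when it reduces a rule. It is a simplified stand-in, not the real output of any yacc implementation: the names rule, run_action, reduce_binary and demo_5_plus_6 are hypothetical, and a real parser reads $1, $2, $3 from its own value stack rather than from an array argument.

```c
/* One label per alternative of expr in the grammar (hypothetical names). */
enum rule { EXPR_INTEGER, EXPR_MUL, EXPR_DIV, EXPR_ADD, EXPR_SUB };

/* rhs[0], rhs[1], rhs[2] play the role of $1, $2, $3;
   the return value plays the role of $$. */
int run_action(enum rule r, const int *rhs) {
    int result = rhs[0];
    switch (r) {
    case EXPR_INTEGER: result = rhs[0];          break; /* { $$ = $1; }      */
    case EXPR_MUL:     result = rhs[0] * rhs[2]; break; /* { $$ = $1 * $3; } */
    case EXPR_DIV:     result = rhs[0] / rhs[2]; break; /* { $$ = $1 / $3; } */
    case EXPR_ADD:     result = rhs[0] + rhs[2]; break; /* { $$ = $1 + $3; } */
    case EXPR_SUB:     result = rhs[0] - rhs[2]; break; /* { $$ = $1 - $3; } */
    }
    return result;
}

/* Reduce "expr op expr" given the values of the two operands. */
int reduce_binary(enum rule r, int left, int right) {
    int rhs[3];
    rhs[0] = left;
    rhs[1] = 0;      /* $2 is the operator token; its value is never read */
    rhs[2] = right;
    return run_action(r, rhs);
}

/* The reductions for "5 + 6": INTEGER -> expr twice, then expr '+' expr. */
int demo_5_plus_6(void) {
    int five[] = { 5 };
    int six[]  = { 6 };
    int left  = run_action(EXPR_INTEGER, five);
    int right = run_action(EXPR_INTEGER, six);
    return reduce_binary(EXPR_ADD, left, right);
}
```

The point of the sketch is only that each { } block becomes ordinary C that runs at the moment its rule is recognized, with $$ and $n replaced by references to stored values.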
There are five phases of a compiler in total, namely:
lex and yacc (Yet Another Compiler Compiler) are both Unix utilities that generate programs. The lexer is responsible for matching the words in the given input; when it finds a match, it stores the value in yylval and returns the token to the parser. yacc is basically a parser generator: the parser it produces takes the tokens and implicitly builds a parse tree from them to check the syntax of the program. The tokens are produced by the lexer and declared in the yacc specification file, so the y.tab.h file is included in the lex program.
The program above contains one token, INTEGER, which is returned by the lexer. A yacc grammar must have a start symbol; here program is the start symbol, because it is the left-hand side of the first rule. yacc rules have the form rule { action }.
In the program above, program expr '\n' says that a program can contain an expression followed by a newline character. For example, 5+4 followed by the Enter key is an expression, which is why we wrote expr '+' expr; here each expr can be an integer, 5 and 4 in this case, so we also include the rule expr: INTEGER.
For addition or any other rule, the left-hand side is represented by $$ and the right-hand-side symbols by $1, $2, $3, and so on. Hence expr: expr '+' expr { $$ = $1 + $3; }
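The question does not show its lexer, so here is a minimal hand-written yylex in C that could play that role, as a sketch under some assumptions: the value 258 for INTEGER is a placeholder (the real number comes from the generated y.tab.h), yylval is defined here only so the sketch stands alone (in a real build the generated parser defines it), and set_input is a hypothetical helper that makes the lexer read from a string instead of standard input.

```c
#include <ctype.h>

#define INTEGER 258               /* assumption: normally from y.tab.h */

int yylval;                       /* value of the current token ($1 etc.) */
static const char *input = "";    /* hypothetical: text still to be scanned */

/* Hypothetical helper: point the lexer at a string to scan. */
void set_input(const char *text) { input = text; }

/* Return the next token code; 0 signals end of input to yyparse. */
int yylex(void) {
    while (*input == ' ' || *input == '\t')
        input++;                  /* skip blanks, but keep '\n' */
    if (*input == '\0')
        return 0;
    if (isdigit((unsigned char)*input)) {
        yylval = 0;
        while (isdigit((unsigned char)*input))
            yylval = yylval * 10 + (*input++ - '0');
        return INTEGER;           /* token code; the number is in yylval */
    }
    /* '+', '-', '*', '/' and '\n' are returned as their own character
       code, which is how the grammar refers to them ('+', '\n', ...). */
    return (unsigned char)*input++;
}
```

For the input "5+4\n" this lexer would hand the parser INTEGER (with yylval 5), '+', INTEGER (with yylval 4), '\n', and then 0.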