bison and lex: string vs. char

Posted 2024-08-07 04:33:04

I'm trying to evaluate an expression of the form

#SomeFunc[expr][expr]expr

expr can be either a string composed from certain characters or a function as above. So this could look something like

#SomeFunc[#SomeFunc[#SomeFunc[nm^2][nn]][nm]][n]...

The problem is that if I break the input into tokens like this

"#"SomeFunc    {yylval.fn=F_some; return FUNC;}
m|n|ms         {return TEXT;}
"^"            {yylval.fn=F_pow;  return FUNC;}
[1-9]+         {yylval=atoi(yytext); return NUMBER;}

I have issues building a grammar if I have something like

#SomeFunc[#SomeFunc[nm^2][nn]][n]

calc:
      | calc expr EOL { eval($2); }

expr: TEXT {$$ = add it to ast-leaf }
      | FUNC '[' expr ']' '[' expr ']' {$$ = add ast(func,$3,$6) }
      | expr expr {$$ = add to ast('*',$1,$2) }

and I'm not quite sure if the grammar is wrong or my implementation of an AST.

I think my logic is flawed: in the case of nm, expr will be expr expr, which returns the value of n*m, which is still nm. Will this cause an infinite loop? How should I parse such an expression?

Don't throw stones. Bison newbie

Later edit
I managed to clean up and test the code behind the AST and some linked lists. The only problem remains the grammar.

%union { struct ast *a; char *strval; int ival; } 
%type <a> exp fact 
%token <strval> ISU 
%token <ival> NUMBER 
%token FUNC POW 
%token EOL OP CP 

%% 

calclist: | calclist exp EOL { printf("result >",eval($2));}; 

exp: fact | exp fact {$$ = newast('*', $1,$2);} ; 

fact: FUNC OP exp CP OP exp CP { $$ = newast('/',$3,$6);}
    | ISU POW NUMBER { $$ = newnum($1, $3);}
    | ISU { $$ = newnum($1,1);};  

This grammar fails for an expr like Frac[m^2][m^4]. What I get is: node / node K m^4 node K m^4

仅此而已 2024-08-14 04:33:04

I simplified the grammar to express just the fundamental forms, and not necessarily the ways they might be combined. (I also eliminated the generated lexer for simplicity of experimentation, so in mine all functions are called 'f' and you can have any digit you want as long as it is '2'.)

This seems to work well for me on various test cases. Note that the rules that repeat (calc and expr) are left recursive, which is what you want with yacc.

cat subcity.y && yacc subcity.y && cc -w y.tab.c -ly 
%{
#include <stdio.h>   /* getchar, printf */
%}
%%
calc: | calc expr '\n';
expr: | expr form
      | expr operator form;
form: mns | '[' expr ']' | digit | 'f';
mns: 'm' | 'n' | 's';
digit: '2';
operator: '^' | '+' |  '-' | '*' | '/';
%%
/* Hand-written lexer: skip blanks and return each character as its own token. */
int yylex(void) { int c; while ((c = getchar()) == ' ') continue; return c; }
int main(int ac, char **av) { if (yyparse() != 0) printf("parse error\n"); return 0; }

It seems to work:

$ ./a.out
f[ f[ f[nm^2] [nn]] [nm]] [n]
f[f[2]] [f[f[nm^2]f]]
f[f[nm^2][nn]][n]
f[m^2][m^2] n / n 2 m^2 n 2 n^2
$ 

I couldn't quite figure out what you didn't like about your first grammar, but I'm hoping this gives you some ideas. Something more like this is certainly what I would start with. The fact that your grammar features adjacent expressions unconnected by an operator is a little odd. It's more common with terminal symbols, like the way strings are concatenated in some languages. Some people would invent an operator to eliminate this case.
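
To make the "adjacent expressions mean multiplication" idea concrete, here is a minimal sketch in plain C. It is a hand-written recursive-descent evaluator, not bison output, chosen only so it compiles on its own. Everything in it is an assumption for illustration: the names struct dim and eval_units are made up, units are limited to the single letters m, n and s, the function name after '#' is ignored, and #Name[a][b] is treated as a divided by b (as in the '/' node of the question's later grammar). A value is represented by the exponents of m, n and s.

```c
#include <ctype.h>
#include <stdlib.h>

/* Exponents of the units m, n and s; e.g. m^2 n is {2, 1, 0}. (Hypothetical type.) */
struct dim { int m, n, s; };

static const char *skip(const char *p) { while (*p == ' ') p++; return p; }

/* Consume the character c (after blanks); return 0 if it is not there. */
static int expect(const char **pp, char c)
{
    const char *p = skip(*pp);
    if (*p != c) return 0;
    *pp = p + 1;
    return 1;
}

static int parse_expr(const char **pp, struct dim *out);

/* factor: unit | unit '^' number | '#' name '[' expr ']' '[' expr ']' */
static int parse_factor(const char **pp, struct dim *out)
{
    const char *p = skip(*pp);
    out->m = out->n = out->s = 0;

    if (*p == '#') {                              /* #Name[num][den] */
        struct dim den;
        p++;
        while (isalpha((unsigned char)*p)) p++;   /* name is skipped, not checked */
        *pp = p;
        if (!expect(pp, '[') || !parse_expr(pp, out) || !expect(pp, ']')) return 0;
        if (!expect(pp, '[') || !parse_expr(pp, &den) || !expect(pp, ']')) return 0;
        out->m -= den.m; out->n -= den.n; out->s -= den.s;   /* division */
        return 1;
    }
    if (*p == 'm' || *p == 'n' || *p == 's') {
        char unit = *p++;
        int power = 1;
        p = skip(p);
        if (*p == '^') {                          /* optional exponent */
            char *end;
            p = skip(p + 1);
            if (!isdigit((unsigned char)*p)) return 0;
            power = (int)strtol(p, &end, 10);
            p = end;
        }
        if (unit == 'm') out->m = power;
        else if (unit == 'n') out->n = power;
        else out->s = power;
        *pp = p;
        return 1;
    }
    return 0;
}

/* expr: factor { factor }   adjacency means multiplication */
static int parse_expr(const char **pp, struct dim *out)
{
    struct dim next;
    if (!parse_factor(pp, out)) return 0;
    for (;;) {
        const char *p = skip(*pp);
        if (*p != '#' && *p != 'm' && *p != 'n' && *p != 's') break;
        if (!parse_factor(pp, &next)) return 0;
        out->m += next.m; out->n += next.n; out->s += next.s;
    }
    return 1;
}

/* Parse a whole line; returns 1 on success and fills *out, 0 on a syntax error. */
int eval_units(const char *text, struct dim *out)
{
    const char *p = text;
    if (!parse_expr(&p, out)) return 0;
    return *skip(p) == '\0';
}
```

Each call to parse_factor consumes at least one character, so the loop in parse_expr always terminates; "nm" is simply two factors multiplied together. The same shape maps onto a left-recursive yacc rule such as exp: fact | exp fact.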

半边脸i 2024-08-14 04:33:04

From your description, you expect "^2" to be a valid expr, but your lex rules return FUNC for '^' and NUMBER for '2'. In your grammar, the only rule that uses FUNC requires it to be followed by '[', and there is no rule for NUMBER at all. You probably want a rule "expr: NUMBER", but then you would also need a rule "expr: FUNC expr" to match "^2", so you might instead want '^' to return some other token.
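
As a rough illustration of giving '^' its own token, here is a small hand-written scanner in C. It is only a sketch standing in for the lex rules: the token names follow the question's later grammar (FUNC, ISU, POW, NUMBER, OP, CP), while next_token, struct value, and the END and ERROR codes are hypothetical names introduced for this example. It also assumes single-letter units m, n and s.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Token codes; names follow the question's second grammar, END/ERROR are assumed. */
enum token { END = 0, FUNC, ISU, POW, NUMBER, OP, CP, ERROR };

/* Semantic value of the most recent token (plays the role of yylval). */
struct value { char unit; int number; };

/* Scan one token starting at *src, advance *src past it, and fill *val. */
enum token next_token(const char **src, struct value *val)
{
    const char *p = *src;
    enum token tok;

    while (*p == ' ') p++;                        /* skip blanks */

    if (*p == '\0') {
        tok = END;
    } else if (*p == '#') {                       /* "#Name" is a function name */
        p++;
        while (isalpha((unsigned char)*p)) p++;
        tok = FUNC;
    } else if (*p == '^') {                       /* its own token, not FUNC */
        p++;
        tok = POW;
    } else if (*p == '[') {
        p++;
        tok = OP;
    } else if (*p == ']') {
        p++;
        tok = CP;
    } else if (isdigit((unsigned char)*p)) {
        char *end;
        val->number = (int)strtol(p, &end, 10);   /* decimal integer */
        p = end;
        tok = NUMBER;
    } else if (strchr("mns", *p) != NULL) {       /* single-letter units */
        val->unit = *p++;
        tok = ISU;
    } else {
        p++;                                      /* unknown character */
        tok = ERROR;
    }
    *src = p;
    return tok;
}
```

With tokens split this way, "m^2" arrives as ISU POW NUMBER, which is what a rule like fact: ISU POW NUMBER expects, and FUNC is only ever followed by OP.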
