Flex and Bison: $ variable gives an unexpected value

Posted on 2025-01-24 17:15:53

In my lexer file I set "yylval.str = yytext" for the token "name". Then in my bison file I try to read that str value to get the name as a string. However, when I read $2 I end up getting not only the token name, but also the rest of the line.

For example, a line could be "MOVE Z TO XY" where Z and XY are both names. In this instance, I would expect $2's value to be "Z" and $4's value to be "XY".
But what actually happens is $2's value is "Z TO XY", and $4's value is "XY". I imagine $4 has the same problem, but has nothing else at the end of the line so it doesn't cause any issue.

Why is $2 giving the entire remainder of the line like this? How do I just get the variable name?

(Shortened) Lexer code:

"MOVE"                  {return (MOVE);}
"TO"                    {return (TO);}
([0-9])+                {yylval.num = atoi(yytext); return (INTEGER);}
[a-z][a-z0-9\-]*        {yylval.str = _strlwr(yytext); return (NAME);}

(Shortened) Parser code:

%token MOVE
%token TO
%token <num> INTEGER
%token <str> NAME
%union{
    int num;
    char *str;
}

move:
    MOVE NAME TO NAME PERIOD { printf("<Var1: %s>, <Var2: %s>", $2, $4);}
    | MOVE INTEGER TO NAME PERIOD { printf("<Val: %d>, <Var: %s>", $2, $4); }

Input:

MOVE Z TO XY-1
MOVE 15 TO XY-1

Output:

<Var1: z TO xy-1>, <Var2: xy-1>
<Val: 15>, <Var: xy-1>

Answer from 红ご颜醉, posted 2025-01-31 17:15:53

> In my lexer file I set "yylval.str = yytext" for the token "name". Then in my bison file I try to read that str value to get the name as a string. However, when I read $2 I end up getting not only the token name, but also the rest of the line.

That's not at all surprising. yytext is a pointer into the input buffer, starting at the position of the current match. By the time the parser looks at the pointed-to data, it typically is no longer a string containing just the token, because there usually is no string terminator immediately following the token characters (but see below).

Furthermore, it is possible that by the time the parser gets around to looking at the token's semantic value, the lexer has read new data into the input buffer, yanking the original token text right out from under you.
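To make the aliasing concrete, here is a small plain-C simulation. It is only a sketch of the behaviour described above, not Flex's actual implementation: `buffer`, `begin_match`, `end_match`, `copy_text`, and `show_aliasing` are all hypothetical names invented for the illustration, and the buffer holds just the tail of one input line, already lowercased for simplicity.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the scanner's input buffer. */
static char buffer[] = "z TO xy-1";
static size_t cut_at;   /* where the temporary terminator was written */
static char cut_char;   /* the character it overwrote */

/* Start a "match" of len bytes at offset start and terminate it
   temporarily, so the match can be read as a C string. */
char *begin_match(size_t start, size_t len)
{
    cut_at = start + len;
    cut_char = buffer[cut_at];
    buffer[cut_at] = '\0';
    return buffer + start;
}

/* Leave the "action": put the overwritten character back before the
   next token is scanned. */
void end_match(void)
{
    buffer[cut_at] = cut_char;
}

/* Portable stand-in for strdup(): returns a malloc'd copy, or NULL. */
char *copy_text(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Walk through one match and print what each pointer sees. */
void show_aliasing(void)
{
    char *alias = begin_match(0, 1);  /* like yylval.str = yytext */
    char *copy = copy_text(alias);    /* like yylval.str = strdup(yytext) */

    if (copy != NULL)
        printf("inside action: alias=\"%s\" copy=\"%s\"\n", alias, copy);
    end_match();
    if (copy != NULL) {
        printf("after action:  alias=\"%s\" copy=\"%s\"\n", alias, copy);
        /* The alias now sees the rest of the buffer; the copy does not. */
        assert(strcmp(alias, copy) != 0);
        free(copy);
    }
}
```

Calling `show_aliasing()` from a `main` function prints `alias="z"` inside the simulated action and `alias="z TO xy-1"` after it, while the copy stays `"z"` throughout. That mirrors the `<Var1: z TO xy-1>` output in the question.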

> How do I just get the variable name?

To get the token text as a string that you can access later, and to make sure it doesn't get modified out from under you, you need to make a copy of it. That copy would probably need to be in dynamically allocated memory. Inside the lexer action only, you can rely on Flex to have provided a temporary string terminator, so you may use strdup() (if you have it) to make such a copy. You appear to be using Microsoft's C library, which does have strdup (its documentation prefers the spelling _strdup).

Then:

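[a-z][a-z0-9\-]*        {
                            /* Copy the token while Flex's temporary terminator is still in place. */
                            char *temp = strdup(yytext);
                            if (temp == NULL) { /* handle allocation error */ }
                            else {
                                /* Lowercase the private copy, not the scanner's buffer. */
                                yylval.str = _strlwr(temp); return (NAME);
                            }
                        }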

You will need to ensure that the dynamically allocated semantic values of your tokens are freed by the parser when they are no longer needed, before the pointers to them are lost.
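One possible way to do that, sketched against the question's grammar: free each NAME value at the end of the action that consumes it, and declare a Bison `%destructor` for the `<str>` type so that values the parser discards during error recovery are released too. This assumes every `<str>` value comes from strdup() as in the lexer rule above, that `<stdlib.h>` is included in the grammar's prologue, and that Bison (not a plain POSIX yacc) is in use, since `%destructor` is a Bison extension.

```
%destructor { free($$); } <str>

move:
    MOVE NAME TO NAME PERIOD {
        printf("<Var1: %s>, <Var2: %s>", $2, $4);
        free($2);   /* the action is the last user of both names */
        free($4);
    }
    | MOVE INTEGER TO NAME PERIOD {
        printf("<Val: %d>, <Var: %s>", $2, $4);
        free($4);
    }
    ;
```

If an action instead stores the string somewhere longer-lived (a symbol table, for example), ownership passes to that structure and the action should not free it.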
