Flex action for a quoted string returns an empty string
I am trying to get an example from the Flex manual [1] working. The example shows Flex rules for a quoted string that may contain octal escape codes.
The manual is a bit incomplete in its description of the action for the closing quote. It simply has this comment:
/* return string constant token type and
* value to parser
*/
So I created code that I thought would work, but apparently my code is incorrect.
Below is the lexer followed by the parser. When I execute the generated parser, I get this output:
The string is: ''
What I expect, and want, is this output:
The string is: 'John Doe'
My input is this: "John Doe"
What am I doing wrong, please?
Here is the lexer:
%option noyywrap
%x STR

%{
#include "parse.tab.h"
#define MAX_STR_CONST 100
%}

%%
    char string_buf[MAX_STR_CONST];
    char *string_buf_ptr;

\"  { string_buf_ptr = string_buf; BEGIN(STR); }

<STR>{
    \"  { /* closing quote - all done */
            BEGIN(INITIAL);
            *string_buf_ptr = '\0';
            yylval.strval = strdup(string_buf_ptr);
            return(STRING);
        }

    \n  { /* error - unterminated string constant */
            perror("Error - unterminated string");
            yyterminate();
        }

    \\[0-7]{1,3}  { /* octal escape sequence */
            int result;
            (void) sscanf(yytext+1, "%o", &result);
            if (result > 0xff) {
                perror("Error - octal escape is out-of-bounds");
                yyterminate();
            }
            *string_buf_ptr++ = result;
        }

    \\[0-9]+  { /* bad escape sequence */
            perror("Error - bad escape sequence");
            yyterminate();
        }

    \\n  *string_buf_ptr++ = '\n';
    \\t  *string_buf_ptr++ = '\t';
    \\r  *string_buf_ptr++ = '\r';
    \\b  *string_buf_ptr++ = '\b';
    \\f  *string_buf_ptr++ = '\f';

    \\(.|\n)  *string_buf_ptr++ = yytext[1];

    [^\\\n\"]+  {
            char *yptr = yytext;
            while (*yptr)
                *string_buf_ptr++ = *yptr++;
        }
}
%%
Here is the parser:
%{
#include <stdio.h>
#include <stdlib.h>

/* interface to the lexer */
extern int yylineno; /* from lexer */
int yylex(void);
void yyerror(const char *s, ...);
extern FILE *yyin;
int yyparse (void);
%}

%union {
    char *strval;
}

%token <strval> STRING

%%

start
    : STRING { printf("The string is: '%s'", $1);}
    ;

%%

int main(int argc, char *argv[])
{
    yyin = fopen(argv[1], "r");
    yyparse();
    fclose(yyin);
    return 0;
}

void yyerror(const char *s, ...)
{
    fprintf(stderr, "%d: %s\n", yylineno, s);
}
[1] See pages 24-25 of the Flex manual: https://epaperpress.com/lexandyacc/download/flex.pdf
Comments (1)
Your closing-quote action sets *string_buf_ptr = '\0' and then calls strdup(string_buf_ptr).
It seems pretty clear that strdup of string_buf_ptr will return a newly-allocated copy of an empty string, since you just set the character pointed to by string_buf_ptr to 0. By that point string_buf_ptr marks the end of the collected text; the start of the text is still string_buf.
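The pointer behaviour described above can be reproduced outside Flex. The following is only a minimal sketch: collect, dup_from_cursor and dup_from_start are hypothetical helper names invented for illustration, and the buffer handling merely imitates what the lexer rules in the question do. It contrasts duplicating from the write cursor (what the question's action does) with duplicating from the start of the buffer.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup under strict C modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STR_CONST 100

static char string_buf[MAX_STR_CONST];
static char *string_buf_ptr;

/* Hypothetical stand-in for the lexer rules: reset the cursor as the
 * opening-quote rule does, append the text, then terminate the string
 * as the closing-quote rule does. */
static void collect(const char *text)
{
    string_buf_ptr = string_buf;
    while (*text && string_buf_ptr < string_buf + MAX_STR_CONST - 1)
        *string_buf_ptr++ = *text++;
    *string_buf_ptr = '\0';  /* the cursor now points at the terminator */
}

/* What the question's action does: copy starting at the write cursor,
 * which points at the '\0' just stored, so the copy is empty. */
static char *dup_from_cursor(void)
{
    return strdup(string_buf_ptr);
}

/* Copy starting at the beginning of the buffer, which holds the text. */
static char *dup_from_start(void)
{
    return strdup(string_buf);
}
```

After collect("John Doe"), dup_from_cursor() yields an empty string while dup_from_start() yields the collected text. In the lexer, the corresponding change would presumably be to pass string_buf rather than string_buf_ptr to strdup in the closing-quote action.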
Also, perror is intended to present the user with an error message based on the value of errno. That's not very useful in this context; you probably want to call yyerror. (However, you'll need to declare it in the lexer, unless you arrange for its prototype to be inserted in parse.tab.h. See %code requires / %code provides in the bison manual for how to do that.)
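To illustrate why perror is a poor fit here, the sketch below builds the two kinds of message as strings so they can be compared. Both function names are hypothetical and exist only for this illustration: format_like_perror reproduces the "message: strerror(errno)" layout that perror writes, and format_lexer_error reproduces the "line: message" layout of the yyerror in the question.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the text perror(msg) would print (minus the
 * trailing newline). The suffix depends on whatever errno happens to
 * hold, which has nothing to do with a malformed string literal. */
static int format_like_perror(char *out, size_t size, const char *msg)
{
    return snprintf(out, size, "%s: %s", msg, strerror(errno));
}

/* Hypothetical helper: a diagnostic in the style of the question's
 * yyerror, built from the line number and the message only. */
static int format_lexer_error(char *out, size_t size, int lineno,
                              const char *msg)
{
    return snprintf(out, size, "%d: %s", lineno, msg);
}
```

With errno left over from an unrelated call, the perror-style message carries a system error description that has no bearing on the input, whereas the yyerror-style message reports only where the lexer was and what went wrong.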