Lex - recognizing tokens

Posted 2024-12-26 22:41:51

I am trying to learn Lex. I have a simple program that reads a file and recognizes tokens.

Right now I am getting some errors. I suspect the problem is that the file has more than one line of tokens to recognize?

Here is the file:

fd 3x00
bk
setc 100
int xy3 fd 10 rt 90

Here is the output I am trying to achieve:

Keyword: fd
Illegal: 3x00
Keyword: bk
Keyword: setc
Number: 100
Keyword: int

Here is my program:

%{

/* Comment  */

 #include <stdio.h>
 #include <stdlib.h>

%}
%%
fd                     {printf("Keyword: fd\n");}
[0-9][a-z][0-9]        {printf("Illegal: 3x00\n");}
bk                     {printf("Keyword: bk\n");}
setc[0-9]              {printf("Keyword: setc\n Number: %s\n", yytext);}
int                    {printf("Keyword: int\n");}
xy3                    {printf("ID: xy3\n");}
fd[0-9]                {printf("Keyword: fd\n Number %s\n", yytext);}
rt[0-9]                {printf("Keyword: rt \n Number %s\n", yytext);}
%%

main( argc, argv)
int argc;
char** argv;
{
    if(argc > 1)
    {

        FILE *file;
        file = fopen(argv[1], "r");
        if(!file)
        {
           fprintf(stderr, "Could not open %s \n", argv[1]);
           exit(1);
        }

        yyin = file;

    }

    yylex();

}

Here are the errors I get when I try to compile it:

 In function 'yylex':
miniStarLogo.l:11: error: expected expression before '[' token
miniStarLogo.l:11: error: 'a' undeclared (first use in this function)
miniStarLogo.l:11: error: (Each undeclared identifier is reported only once
miniStarLogo.l:11: error: for each function it appears in.)
miniStarLogo.l:11: error: expected ';' before '{' token
miniStarLogo.l:13: error: expected expression before '[' token
miniStarLogo.l:13: error: expected ';' before '{' token

Is the error in my printf statements?
Thank you.

Comments (1)

﹏雨一样淡蓝的深情 2025-01-02 22:41:51

When I compiled a copy of your code on MacOS X (10.7.2) with flex (2.5.35) and gcc (4.6.1), the only complaints I got from the C compiler were about the non-prototype definition of main() (and that only because I always compile with that warning enabled) and about yyunput() being defined but not used (which is not your fault).

Since you're learning C, you should use only the prototype notation:

int main(int argc, char **argv)
{
    ...
}

or an equivalent.

I also converted the miniStarLogo.l file to DOS format (CRLF line endings), and both flex and gcc seemed to be OK with the results - somewhat to my surprise. It might not be the case on your machine.

When I ran the code on your test data, I got:

Keyword: fd
 Illegal: 3x00
0
Keyword: bk

setc 100
Keyword: int
 ID: xy3
 Keyword: fd
 10 rt 90

So, you are not far off where you need to be by my reckoning.


Confusion reigneth over my commands.

These are the commands I used (hmmm, it was GCC 4.2.1 rather than 4.6.1):

$ flex miniStarLogo.l
$ gcc -Wall -Wextra -O3 -g -o lex.yy lex.yy.c -lfl
miniStarLogo.l:22: warning: return type defaults to ‘int’
miniStarLogo.l: In function ‘main’:
miniStarLogo.l:42: warning: control reaches end of non-void function
miniStarLogo.l: At top level:
lex.yy.c:1114: warning: ‘yyunput’ defined but not used
$ ./lex.yy <<EOF
> fd 3x00
> bk
> setc 100
> int xy3 fd 10 rt 90
> EOF
Keyword: fd
 Illegal: 3x00
0
Keyword: bk

setc 100
Keyword: int
 ID: xy3
 Keyword: fd
 10 rt 90
$

(OK - I cheated marginally: the first time around, I ran rmk lex.yy LDLIBS=-lfl, where rmk is a variant of make and the compilation rules in the directory use the command line shown. But I redid the compilations to get the error messages right, exactly as above.)

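You might need to expand your patterns to accept 'one or more' digits, with [0-9]+ in place of just [0-9]. You might also need a rule that deals with unmatched characters: by default, Lex echoes any text that no rule matches, which is where the stray '0', 'setc 100' and '10 rt 90' in the output above come from. Note too that setc[0-9] only matches a digit immediately after setc, with no space in between. And personally, I go to great pains to avoid blanks immediately before newlines, so you would need to tighten up your print formatting to meet my criteria. However, that's not germane to getting the program running.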
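
As an illustration of the 'one or more digits' and 'unmatched characters' suggestions, here is a minimal sketch rather than a definitive fix. It assumes flex rather than classic lex (the %option line is flex-specific) and a hypothetical file name, tokens.l. The token classes (Number, Illegal, ID, Unmatched) and the keyword list are guesses based on the sample output in the question.

```lex
%{
/* Sketch only: token classes and messages are guessed from the question. */
#include <stdio.h>
#include <stdlib.h>
%}

%option noyywrap nounput noinput

%%
fd|bk|rt|setc|int       { printf("Keyword: %s\n", yytext); }
[0-9]+                  { printf("Number: %s\n", yytext); }
[0-9]+[a-z][a-z0-9]*    { printf("Illegal: %s\n", yytext); }
[a-z][a-z0-9]*          { printf("ID: %s\n", yytext); }
[ \t\r\n]+              { /* skip whitespace, including stray carriage returns */ }
.                       { printf("Unmatched: %s\n", yytext); }
%%

int main(int argc, char **argv)
{
    /* Read from the named file if one is given, otherwise from stdin. */
    if (argc > 1) {
        yyin = fopen(argv[1], "r");
        if (yyin == NULL) {
            fprintf(stderr, "Could not open %s\n", argv[1]);
            return EXIT_FAILURE;
        }
    }
    yylex();
    return 0;
}
```

The sketch relies on two matching rules: the longest match wins, so 3x00 is taken whole by the Illegal rule instead of being split after the 3, and when two rules match the same length, the one listed first wins, so fd is a Keyword rather than an ID. Because noyywrap is set and main() is defined in the file, it should build with flex tokens.l followed by gcc -o tokens lex.yy.c, without -lfl. On the four-line test file from the question it should print:

Keyword: fd
Illegal: 3x00
Keyword: bk
Keyword: setc
Number: 100
Keyword: int
ID: xy3
Keyword: fd
Number: 10
Keyword: rt
Number: 90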

Also, if you need to convert your file from DOS to Unix line endings, the easiest is the dos2unix command, if you have it on your machine. Otherwise, use:

$ tr -d '\015' < miniStarLogo.l > x
$ od -c x
0000000   %   {  \r  \n  \r  \n   /   *       C   o   m   m   e   n   t
...
0001560  \n   }  \r  \n
0001564
$ mv x miniStarLogo.l
$

I carefully added the carriage returns using vim and :set fileformat=dos; it would also be possible to undo it with vim and :set fileformat=unix. This is Unix so TMTOWTDI (There's More Than One Way To Do It -- the Perl motto), and I'm not even trying to use Perl.
