Fixing some simple logic errors in lex and yacc

Posted on 2024-10-09 12:09:00

Please help me fix the two simple logic errors I am facing in my example.

Here are the details:

The Input File: (input.txt)


FirstName:James
LastName:Smith
normal text


The output File: (output.txt) - [with two logic errors]


The Name is: James
The Name is: LastName:Smith
The Name is: normal text


What I am expecting as output (instead of the above lines) - [without logical errors]


The Name is: James
The Name is: Smith
normal text


In other words, I don't want "LastName:" to be sent to the output, and I also want normal text to be matched when it is written after "FirstName:" or "LastName:".

Here is my lex File (example.l):

%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "y.tab.h"

/* prototypes */ 
void yyerror(const char*); 

/* Variables: */
char *tempString;

%}

%START sBody

%%

"FirstName:"                {        BEGIN sBody;        }
"LastName:"                 {        BEGIN sBody;        }

.?                          {        return sNormalText; } 

\n                        /* Ignore end of line */;
[ \t]+                   /* Ignore whitespace */;

<sBody>.+   {
                tempString = (char *)calloc(strlen(yytext)+1, sizeof(char));
                strcpy(tempString, yytext);
                yylval.sValue = tempString;
                return sText;
             }
%%

int main(int argc, char *argv[]) 
{
    if ( argc < 3 )
    {
        printf("Please you need two args: inputFileName and outputFileName");
    }

    else 
    {
        yyin = fopen(argv[1], "r");
        yyout = fopen(argv[2], "w");
        yyparse();
        fclose(yyin);
        fclose(yyout);
    }
    return 0;
} 

Here is my yacc file: (example.y):

%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "y.tab.h"

void yyerror(const char*);
int yywrap();

extern FILE *yyout;

%}

%union
{
    int iValue;
    char* sValue;
};

%token <sValue> sText
%token <sValue> sNormalText

%%

StartName: /* for empty */
          | sName StartName
      ;

sName:
     sText
     {
            fprintf(yyout, "The Name is: %s\n", $1);
     }
     |
     sNormalText
     {
           fprintf(yyout, "%s\n", $1);
     }
     ;
%%

void yyerror(const char *str)
{
    fprintf(stderr,"error: %s\n",str);
}

int yywrap()
{
    return 1;
}

If you can help me correct these simple logic errors, I will be grateful.

Thanks in advance for your help and for reading my post.


Comments (1)

傲世九天 2024-10-16 12:09:00

Part of the trouble is that you move into state 'sBody' but you never move back to the initial state 0.

Another problem - not yet a major one - is that you use a right-recursive grammar rule instead of the (natural for Yacc) left-recursive rule:

StartName: /* empty */
      |    sName StartName
      ;


vs

StartName: /* empty */
      |    StartName sName
      ;

Adding BEGIN 0; to the <sBody> Lex rule improves things a lot; the remaining trouble is that you get one more line 'Smith' in the output file for each single letter in the normal text. You need to review how the value is returned to your grammar.

By adding yylval.sValue = yytext; before the return in the rule that returns sNormalText, I got the 'expected' output.
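One caveat with `yylval.sValue = yytext;`: `yytext` points into the scanner's own buffer, and its contents are only valid until the next call to `yylex()`. It happened to work in the run here because each value is printed before the next token is scanned, but a grammar that holds on to the pointer would see it change. The plain-C sketch below imitates that behaviour without lex; `token_buf`, `next_token` and `copy_token` are hypothetical names made up for illustration, not part of lex or yacc.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the scanner's token buffer (plays the role of yytext). */
static char token_buf[64];

/* Hypothetical stand-in for yylex(): overwrites the shared buffer with the next token
   and returns a pointer to that buffer. */
static const char *next_token(const char *text)
{
    strncpy(token_buf, text, sizeof token_buf - 1);
    token_buf[sizeof token_buf - 1] = '\0';
    return token_buf;
}

/* Copy a token so its value survives later scans; the caller must free() the result.
   Returns NULL if the allocation fails. */
static char *copy_token(const char *text)
{
    char *copy = malloc(strlen(text) + 1);
    if (copy != NULL)
        strcpy(copy, text);
    return copy;
}
```

If a caller keeps the pointer returned by `next_token("James")` and then calls `next_token("Smith")`, the kept pointer now reads "Smith", while a value saved with `copy_token` still reads "James". That is the same reason the `<sBody>` rule copies `yytext` with `calloc` and `strcpy` before returning `sText`.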

example.l

%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "y.tab.h"

/* prototypes */
void yyerror(const char*);

/* Variables: */
char *tempString;

%}

%START sBody

%%

"FirstName:"                { puts("FN");      BEGIN sBody;        }
"LastName:"                 { puts("LN");      BEGIN sBody;        }

.?                          { printf("NT: %s\n", yytext); yylval.sValue = yytext; return sNormalText; }

\n                        /* Ignore end of line */;
[ \t]+                   /* Ignore whitespace */;

<sBody>.+   {
                tempString = (char *)calloc(strlen(yytext)+1, sizeof(char));
                strcpy(tempString, yytext);
                yylval.sValue = tempString;
                puts("SB");
                BEGIN 0;
                return sText;
             }

%%

int main(int argc, char *argv[])
{
    if ( argc < 3 )
    {
        fprintf(stderr, "Please supply two args: inputFileName and outputFileName\n");
    }
    else
    {
        yyin = fopen(argv[1], "r");
        if (yyin == 0)
        {
            fprintf(stderr, "failed to open %s for reading\n", argv[1]);
            exit(1);
        }
        yyout = fopen(argv[2], "w");
        if (yyout == 0)
        {
            fprintf(stderr, "failed to open %s for writing\n", argv[2]);
            exit(1);
        }
        yyparse();
        fclose(yyin);
        fclose(yyout);
    }
    return 0;
}

example.y

%{
#include <stdio.h>
#include "y.tab.h"

void yyerror(const char*);
int yywrap();

extern FILE *yyout;

%}

%union
{
    char* sValue;
};

%token <sValue> sText
%token <sValue> sNormalText

%%

StartName: /* for empty */
          | StartName sName
      ;

sName:
     sText
     {
            fprintf(yyout, "The Name is: %s\n", $1);
     }
     |
     sNormalText
     {
           fprintf(yyout, "The Text is: %s\n", $1);
     }
     ;
%%

void yyerror(const char *str)
{
    fprintf(stderr,"error: %s\n",str);
}

int yywrap()
{
    return 1;
}

output.txt

The Name is: James
The Name is: Smith
The Text is: n
The Text is: o
The Text is: r
The Text is: m
The Text is: a
The Text is: l
The Text is:  
The Text is: t
The Text is: e
The Text is: x
The Text is: t

It might make more sense to put yywrap() in with the lexical analyzer rather than with the grammar. I've left the terse debugging prints in the code - they helped me see what was going wrong.

FN
SB
LN
SB
NT: n
NT: o
NT: r
NT: m
NT: a
NT: l
NT:  
NT: t
NT: e
NT: x
NT: t

You'll need to play with the '.?' rule to get normal text returned in its entirety. You may also have to move it around the file - start states are slightly peculiar critters. When I changed the rule to '.+', Flex gave me the warning:

example.l:25: warning, rule cannot be matched
example.l:27: warning, rule cannot be matched

These lines referred to the blank/tab and sBody rules. Moving the unqualified '.+' after the sBody rule removed the warnings, but didn't seem to do what was wanted. Have fun...
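Because the input is line-oriented, it can help to pin down what the scanner is supposed to do with each whole line before reworking the rules. The sketch below is plain C, not lex, and only illustrates the classification that yields the output asked for in the question; `after_prefix` and `format_line` are hypothetical helper names, and treating every line as a single unit is an assumption about the input.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if line starts with prefix, return the text after it,
   otherwise return NULL. */
static const char *after_prefix(const char *line, const char *prefix)
{
    size_t n = strlen(prefix);
    return strncmp(line, prefix, n) == 0 ? line + n : NULL;
}

/* Hypothetical helper: format one input line the way the question expects.
   Writes the result into out (at most out_size bytes) and returns out. */
static char *format_line(const char *line, char *out, size_t out_size)
{
    const char *name = after_prefix(line, "FirstName:");
    if (name == NULL)
        name = after_prefix(line, "LastName:");

    if (name != NULL)
        snprintf(out, out_size, "The Name is: %s", name);  /* label stripped */
    else
        snprintf(out, out_size, "%s", line);  /* normal text passes through whole */
    return out;
}
```

Calling `format_line` on the three lines of input.txt gives the three lines the question lists as expected output. In lex terms, the keyword rules play the role of `after_prefix`, and the normal-text rule has to cover a whole line without winning the longest-match contest against the keywords on the lines that start with them.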
