How do I replace a lexer token with two new tokens?

Posted on 2025-01-09 04:13:54

In XML an empty element can be represented in either of these ways:

<foo></foo>
<foo/>

If the input contains the latter, then I want to tokenize it like the former.

That is, if the input is <foo/> then I want the lexer to generate this sequence of (token kind, token value) pairs:

('<', '<')
("foo", STAG)
('>', '>')
("</foo>", ETAG)

I tried this (where <START_TAG> is an exclusive state and st is a global variable holding the element name, which is "foo" in this example):

<START_TAG>{
   "/>"    { yytext = ">";
             return(">");
             yytext = strcat(strcat("<", st), ">");
             yyval.strval = strdup(yytext);
             yy_pop_state();
             return(ETAG); 
           }
}

but it doesn't work.

Essentially I want the lexer to replace this token "/>" with these two tokens: ">" and "</foo>". How do I do that?
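One reason the attempt cannot work is that nothing after the first return in an action is ever executed, so a single match can hand back only one token per call. In addition, strcat("<", st) writes into a string literal, which is undefined behavior, and return(">") returns a string where an integer token code is expected. A common workaround is to return '>' for "/>" and set a flag, so that the next call emits the end tag before scanning any more input. In flex, that check could go in the indented code placed before the first rule, which runs each time yylex() is entered. The sketch below shows the same idea in plain C with a tiny hand-written scanner. The names Scanner, next_token, copy_text, pending_etag and the token codes are hypothetical and only for illustration; it is not a full XML lexer.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { TOK_EOF = 0, STAG = 258, ETAG = 259 };  /* hypothetical token codes */

typedef struct {
    const char *p;      /* current position in the input */
    char st[64];        /* name of the most recent start tag */
    int pending_etag;   /* set when "/>" still owes an ETAG */
} Scanner;

/* Copy s into freshly allocated memory (plays the role of strdup). */
static char *copy_text(const char *s) {
    char *out = malloc(strlen(s) + 1);
    if (out) strcpy(out, s);
    return out;
}

/* Return the next token kind; *text receives a heap-allocated lexeme
 * (NULL at end of input). The caller frees it. */
int next_token(Scanner *sc, char **text) {
    char buf[80];

    if (sc->pending_etag) {                     /* second half of "/>" */
        sc->pending_etag = 0;
        snprintf(buf, sizeof buf, "</%s>", sc->st);
        *text = copy_text(buf);
        return ETAG;
    }
    if (*sc->p == '\0') {
        *text = NULL;
        return TOK_EOF;
    }
    if (sc->p[0] == '/' && sc->p[1] == '>') {   /* first half of "/>" */
        sc->p += 2;
        sc->pending_etag = 1;                   /* remember the owed ETAG */
        *text = copy_text(">");
        return '>';
    }
    if (sc->p[0] == '<' && sc->p[1] == '/') {   /* ordinary end tag */
        const char *end = strchr(sc->p, '>');
        size_t n = end ? (size_t)(end - sc->p) + 1 : strlen(sc->p);
        if (n >= sizeof buf) n = sizeof buf - 1;
        memcpy(buf, sc->p, n);
        buf[n] = '\0';
        sc->p += n;
        *text = copy_text(buf);
        return ETAG;
    }
    if (isalpha((unsigned char)*sc->p)) {       /* element name */
        size_t n = 0;
        while (isalnum((unsigned char)sc->p[n]) && n < sizeof sc->st - 1)
            n++;
        memcpy(sc->st, sc->p, n);
        sc->st[n] = '\0';
        sc->p += n;
        *text = copy_text(sc->st);
        return STAG;
    }
    /* Any other character, including '<' and '>', is its own token. */
    buf[0] = *sc->p++;
    buf[1] = '\0';
    *text = copy_text(buf);
    return (unsigned char)buf[0];
}
```

With a Scanner initialized as { "<foo/>", "", 0 }, successive calls to next_token yield '<', STAG with "foo", '>', and then ETAG with "</foo>", which is the sequence asked for. A flag is enough here because "/>" owes exactly one extra token; if a match ever needed to produce several, a small queue of pending tokens would be the natural generalization.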
