Can actions in Lex access individual regex groups?
(NOTE: I'm guessing not, since according to the documentation the grouping characters (parentheses) are only used to change precedence. If that is right, can you recommend an alternative C/C++ scanner generator that can do this? I'm not keen on writing my own lexical analyzer.)
Example:
Let's say I have this input: foo [tagName attribute="value"] bar
and I want to extract the tag using Lex/Flex. I could certainly write this rule:
\[[a-z]+[[:space:]]+[a-z]+=\"[a-z]+\"\] printf("matched %s", yytext);
But let's say I want to access certain parts of the match, e.g. the attribute value, without having to parse yytext again (the string has already been scanned, so it doesn't make much sense to scan part of it a second time). Something like this, using regex groups, would be preferable:
\[[a-z]+[[:space:]]+[a-z]+=\"([a-z]+)\"\] printf("matched attribute %s", $1);
Comments (2)
You can split the pattern across start conditions. Something like this:
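A minimal flex sketch of this approach, not the answerer's original code. The start-condition names TAG and VALUE are hypothetical, and the tag-name pattern is widened to [A-Za-z]+ so that the sample input's tagName matches. Flex patterns have no capture groups, so each piece of the tag gets its own rule, and yytext holds only the attribute value when the VALUE rule fires:

```lex
%{
#include <stdio.h>
%}
%option noyywrap
%x TAG VALUE

%%
"["[A-Za-z]+[[:space:]]+  { /* "[tagName " seen: expect an attribute next */ BEGIN(TAG); }
<TAG>[a-z]+=\"            { /* attribute name, '=' and opening quote */ BEGIN(VALUE); }
<VALUE>[a-z]+             { /* yytext is just the value here */ printf("matched attribute %s\n", yytext); }
<VALUE>\"\]               { /* closing quote and bracket: tag finished */ BEGIN(INITIAL); }
<TAG,VALUE>.|\n           { /* malformed tag: drop back to normal scanning */ BEGIN(INITIAL); }
.|\n                      { /* ignore everything outside tags */ }
%%

int main(void)
{
    /* Scan stdin until EOF; yylex() returns 0 there. */
    return yylex();
}
```

Assuming the file is saved as tags.l (a made-up name), it can be built with `flex tags.l && cc lex.yy.c -o tags`. One trade-off compared with the single-rule version: a malformed tag is only detected after its opening part has already been consumed, so the sketch simply abandons it rather than rescanning.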
As for an alternative to a C/C++ scanner generator: I use the Qt class QRegularExpression for this kind of thing. It makes it very easy to get the regex groups after a match.
Certainly, at least some versions of Lex do.
But the default lex/flex downloadable from sourceforge.org does not seem to list this in its documentation, and this example leaves the full string in yytext.
From IBM's LEX documentation for AIX: