How to avoid building intermediate and useless AST nodes with ANTLR3?
I wrote an ANTLR3 grammar subdivided into smaller rules to increase readability.
For example:
messageSequenceChart:
    'msc' mscHead bmsc 'endmsc' end
;

// where mscHead is shorthand for:
mscHead:
    mscName mscParameterDecl? timeOffset? end
    mscInstInterface? mscGateInterface
;
I know the built-in ANTLR AST building feature allows the user to declare intermediate AST nodes that won't be in the final AST. But what if you build the AST by hand?
messageSequenceChart returns [msc::MessageSequenceChart* n = 0]:
    'msc' mscHead bmsc 'endmsc' end
    {
        $n = new msc::MessageSequenceChart(/* mscHead subrules accessors like $mscHead.mscName.n ? */
                                           $bmsc.n);
    }
;

mscHead:
    mscName mscParameterDecl? timeOffset? end
;
The documentation does not cover this case. So it looks like I will have to create a node for every intermediate rule just to be able to access the results of its subrules.
Does anyone know a better solution?
Thank you.
Comments (1)
You can solve this by letting your sub-rule(s) return multiple values and accessing only those you're interested in.
The following demo shows how to do it. Although it is not in C, I am confident that you'll be able to adjust it so that it fits your needs:
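A minimal sketch of such a demo, assuming ANTLR 3 with the Java target. The rule names `parse` and `sub`, the labels `a`, `b`, `c`, and the `INT` token come from this discussion; the grammar name `Demo`, the `SPACE` rule, and the return-value names `first` and `third` are assumptions made for illustration:

```antlr
grammar Demo;

// Entry rule: reads only the return value it cares about.
parse
  :  sub EOF { System.out.println("second=" + $sub.second); }
  ;

// Sub-rule exposing each matched token as a separate return value.
sub returns [String first, String second, String third]
  :  a=INT b=INT c=INT
     {
       $first  = $a.text;
       $second = $b.text;
       $third  = $c.text;
     }
  ;

INT
  :  '0'..'9'+
  ;

// Spaces are kept out of the parser's token stream.
SPACE
  :  ' ' { $channel = HIDDEN; }
  ;
```

The `parse` rule never has to build anything for `sub` itself; it picks one of the declared return values and ignores the others.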
And if you parse the input "12 34 56" with the generated parser, second=34 is printed to the console.
So a shortcut from the parse rule such as $sub.INT or $sub.$a, to access one of the three INT tokens directly, is unfortunately not possible.
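Applied to the grammar in the question, the same idea might look like the sketch below. The type names `msc::Name` and `msc::ParameterDecl`, the three-argument constructor, and the assumption that `mscName` and `mscParameterDecl` each return a value named `n` are all hypothetical; only the multiple-return-value mechanism is taken from the answer above.

```antlr
messageSequenceChart returns [msc::MessageSequenceChart* n = 0]:
    'msc' mscHead bmsc 'endmsc' end
    {
        // $mscHead.name and $mscHead.params are the return values declared on mscHead below.
        $n = new msc::MessageSequenceChart($mscHead.name, $mscHead.params, $bmsc.n);
    }
;

// mscHead builds no node of its own; it only forwards what its subrules produced.
mscHead returns [msc::Name* name = 0, msc::ParameterDecl* params = 0]:
    mscName { $name = $mscName.n; }
    (mscParameterDecl { $params = $mscParameterDecl.n; })?
    timeOffset? end
;
```

With this layout the intermediate rule stays in the grammar for readability, while the parent rule still receives the values of the sub-subrules without an extra AST node class for `mscHead`.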