ANTLR: get token name?
I've got a grammar rule,
OR
: '|';
But when I print the AST using,
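public static void Preorder(ITree tree, int depth)
{
    if (tree == null)
    {
        return;
    }

    // Indent one space per level of depth.
    for (int i = 0; i < depth; i++)
    {
        Console.Write(" ");
    }

    // Prints the node's text, which for the OR token is "|".
    Console.WriteLine(tree);

    // Recurse into each child, one level deeper.
    for (int i = 0; i < tree.ChildCount; ++i)
    {
        Preorder(tree.GetChild(i), depth + 1);
    }
}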
(thanks, Bart) it displays the actual | character. Is there a way I can get it to say "OR" instead?
robert inspired this answer.
I had to do this a couple of weeks ago, but with the Python ANTLR. It doesn't help you much, but it might help somebody else searching for an answer.
With Python ANTLR, token types are integers. The token text is included in the token object. Here's the solution I used:
There's no apparent logic to the numbering of tokens (at least, with Python ANTLR), and the token names are not stored as strings except in the module __dict__, so this is the only way of getting to them. I would guess that in C# token types are in an enumeration, and I believe enumerations can be printed as strings. But that's just a guess.
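The reverse lookup described above could be sketched roughly as follows. This is not the original snippet; it is a minimal illustration that assumes the generated lexer module exposes its token types as module-level integer constants. The fake_lexer module, its token names, and its numbers are hypothetical stand-ins so the example runs without ANTLR installed.

```python
import types

# Hypothetical stand-in for an ANTLR-generated lexer module.
# A real generated module would define its own names and numbers.
fake_lexer = types.ModuleType("FakeLexer")
fake_lexer.OR = 4
fake_lexer.AND = 5
fake_lexer.ID = 6


def token_names(module):
    """Build {token type number: token name} from the module's __dict__.

    Assumption: every public integer attribute of the module is a token
    type. Dunder attributes such as __name__ are skipped.
    """
    return {
        value: name
        for name, value in vars(module).items()
        if not name.startswith("_")
        and isinstance(value, int)
        and not isinstance(value, bool)
    }


names = token_names(fake_lexer)

# Look up the name for a token type number instead of printing its text.
print(names[fake_lexer.OR])
```

If two constants in a module happen to share the same number, the later one silently wins in this map, so a real module may need extra filtering.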
Boy, I spent way too much time banging my head against a wall trying to figure this out. Mark's answer gave me the hint I needed, and it looks like the following will get the token name from a TerminalNode in Antlr 4.5:
or, in C#:
(Looks like you can actually get the vocabulary from either the parser or the lexer.)
Those vocabulary methods seem to be the preferred way to get at the token names in Antlr 4.5, and tokenNames appears to be deprecated.
It does seem needlessly complicated for what I think is a pretty basic operation, so maybe there's an easier way.
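To show the shape of the idea without depending on any ANTLR runtime, here is a small Python sketch of the question's Preorder traversal that prints a token's symbolic name next to its text. It only illustrates the lookup step: Node, SYMBOLIC_NAMES, and the type numbers are hypothetical stand-ins, not ANTLR classes or values.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical token-type numbers and names; a real lexer generates its own.
SYMBOLIC_NAMES = {4: "OR", 5: "ID"}


@dataclass
class Node:
    """Minimal stand-in for a tree node: token type, matched text, children."""
    type: int
    text: str
    children: List["Node"] = field(default_factory=list)


def preorder(node: Optional[Node], depth: int = 0,
             out: Optional[List[str]] = None) -> List[str]:
    """Return one indented line per node, showing the token name and its text."""
    if out is None:
        out = []
    if node is None:
        return out
    # Fall back to a placeholder when the type has no known name.
    name = SYMBOLIC_NAMES.get(node.type, "<unknown>")
    out.append("  " * depth + f"{name} {node.text!r}")
    for child in node.children:
        preorder(child, depth + 1, out)
    return out


tree = Node(4, "|", [Node(5, "a"), Node(5, "b")])
lines = preorder(tree)
print("\n".join(lines))
```

Printing both the name and the text keeps identifiers readable while still labelling the operator node as OR.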
I'm new to Antlr, but it seems ITree has no direct obligation to be related to Parser (in .NET). Instead there is a derived interface, IParseTree, returned from Parser (in Antlr4), and it contains a few additional methods, including an override:
It converts the whole node subtree into a text representation, which is useful in some cases. If you want to see just the name of a specific node without its children, use the static method in class Trees:
This method does basically the same as what Mark and Robert suggested, but in a more general and flexible way.
In addition to Robert's Pythonic answer (and hopefully useful for other languages):
If using the nextToken() method of your generated lexer, you can use the 'type' property of the lexer (not the token, unintuitively enough) to get the numeric code given to the token type by the lexer. In the lexer itself you can see which type got which number. Hope this is helpful.