ANTLR: Get token name?

Posted on 2024-10-06 14:06:13

I've got a grammar rule,

OR
    : '|';

But when I print the AST using,

public static void Preorder(ITree tree, int depth)
{
    if (tree == null)
    {
        return;
    }

    for (int i = 0; i < depth; i++)
    {
        Console.Write("  ");
    }

    Console.WriteLine(tree);

    for(int i=0; i<tree.ChildCount; ++i)
        Preorder(tree.GetChild(i), depth + 1);
}

(Thanks Bart) it displays the actual | character. Is there a way I can get it to say "OR" instead?

Comments (5)

最舍不得你 2024-10-13 14:06:13

Robert's answer inspired this one.

if (ExpressionParser.tokenNames[tree.Type] == tree.Text)
    Console.WriteLine(tree.Text);
else
    Console.WriteLine("{0} '{1}'", ExpressionParser.tokenNames[tree.Type], tree.Text);
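
The same check can be folded into the pre-order walk from the question. Below is a minimal Python sketch of that idea. The `Node` class and the hand-written `TOKEN_NAMES` list are hypothetical stand-ins for the AST nodes and for the generated parser's `tokenNames` array; neither is ANTLR API.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical stand-in for the generated parser's tokenNames array:
# index = token type, value = token name as declared in the grammar.
TOKEN_NAMES = ["<invalid>", "OR", "ID"]


@dataclass
class Node:
    """Hypothetical AST node; not an ANTLR class."""
    type: int
    text: str
    children: List["Node"] = field(default_factory=list)


def label(node: Node) -> str:
    """Return the token name, followed by the text when the two differ."""
    name = TOKEN_NAMES[node.type]
    return name if name == node.text else f"{name} '{node.text}'"


def preorder(node: Node, depth: int = 0) -> List[str]:
    """Collect one indented line per node, parents before children."""
    lines = ["  " * depth + label(node)]
    for child in node.children:
        lines.extend(preorder(child, depth + 1))
    return lines


tree = Node(1, "|", [Node(2, "a"), Node(2, "b")])
print("\n".join(preorder(tree)))
```

Returning the lines instead of printing them inside the recursion keeps the walk easy to test; the root of this sample tree is labelled with the name OR followed by its literal text.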

2024-10-13 14:06:13

I had to do this a couple of weeks ago, but with the Python ANTLR. It doesn't help you much, but it might help somebody else searching for an answer.

With Python ANTLR, token types are integers. The token text is included in the token object. Here's the solution I used:

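import antlrGeneratedLexer  # the lexer module ANTLR generated from your grammar

# Build a reverse map: token type number -> token name.
token_names = {}
# items() works on both Python 2 and 3 (iteritems() is Python 2 only).
for name, value in vars(antlrGeneratedLexer).items():
    # Token types are module-level ints with all-upper-case names.
    if isinstance(value, int) and name == name.upper():
        token_names[value] = name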

There's no apparent logic to the numbering of tokens (at least, with Python ANTLR), and the token names are not stored as strings except in the module __dict__, so this is the only way of getting to them.

I would guess that in C# token types are in an enumeration, and I believe enumerations can be printed as strings. But that's just a guess.
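
The reverse lookup described in this answer can be tried without a generated lexer. The sketch below is only an illustration: `fake_lexer` is a hypothetical stand-in module built by hand, and its token numbers are made up.

```python
import types

# Hypothetical stand-in for an ANTLR-generated lexer module: token types are
# module-level integer constants with upper-case names.
fake_lexer = types.ModuleType("fake_lexer")
fake_lexer.OR = 4
fake_lexer.ID = 5
fake_lexer.version = 3  # lower-case int attribute, must be skipped


def build_token_names(module):
    """Map token type number -> name for every upper-case int in the module."""
    return {
        value: name
        for name, value in vars(module).items()
        if isinstance(value, int) and name == name.upper()
    }


token_names = build_token_names(fake_lexer)
print(token_names[4])
```

Wrapping the loop in a function makes it reusable for the parser module as well, assuming it exposes its token types the same way.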

×纯※雪 2024-10-13 14:06:13

Boy, I spent way too much time banging my head against a wall trying to figure this out. Mark's answer gave me the hint I needed, and it looks like the following will get the token name from a TerminalNode in Antlr 4.5:

myLexer.getVocabulary().getSymbolicName(myTerminalNode.getSymbol().getType())

or, in C#:

myLexer.Vocabulary.GetSymbolicName(myTerminalNode.Symbol.Type)

(Looks like you can actually get the vocabulary from either the parser or the lexer.)

Those vocabulary methods seem to be the preferred way to get at the token names in Antlr 4.5, and tokenNames appears to be deprecated.

It does seem needlessly complicated for what I think is a pretty basic operation, so maybe there's an easier way.
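
For readers on the Python target: as far as I can tell, recognizers generated by ANTLR 4 for Python expose `symbolicNames` and `literalNames` lists indexed by token type instead of a vocabulary object. Treat that as an assumption and check your own generated code. The sketch below uses a hypothetical `FakeLexer` class in place of real ANTLR output, and `symbolic_name` is an illustrative helper, not a runtime function.

```python
class FakeLexer:
    """Hypothetical stand-in for a generated lexer class (not real ANTLR output)."""
    OR = 1
    ID = 2
    literalNames = ["<INVALID>", "'|'"]        # only tokens with a fixed literal
    symbolicNames = ["<INVALID>", "OR", "ID"]  # names declared in the grammar


def symbolic_name(recognizer, token_type):
    """Return the grammar name for token_type, or None when there is none."""
    names = recognizer.symbolicNames
    if 0 < token_type < len(names) and names[token_type] != "<INVALID>":
        return names[token_type]
    return None


print(symbolic_name(FakeLexer, FakeLexer.OR))
```

Returning None for unknown types mirrors the idea that not every token type number has a symbolic name.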

偏闹i 2024-10-13 14:06:13

I'm new to Antlr, but it seems ITree is not necessarily tied to a Parser (in .NET). Instead there is a derived interface, IParseTree, which the Parser returns (in Antlr4), and it has a few additional methods, including this overload:

string ToStringTree(Parser parser);

It converts the whole node subtree into a text representation, which is useful in some cases. If you want to see just the name of one particular node without its children, use this static method in the Trees class:

public static string GetNodeText(ITree t, Parser recog);

This method does basically the same thing as Mark and Robert suggested, but in a more general and flexible way.

乖乖公主 2024-10-13 14:06:13

In addition to Robert's Pythonic answer (hopefully this is useful for other languages too):

If you use the nextToken() method of your generated lexer, you can read the lexer's 'type' property (not the token's, unintuitively enough) to get the numeric code the lexer assigned to the token type. The generated lexer itself shows which type got which number. Hope this helps.
