Antlr AST 产生（可能的）疯狂

发布于 2024-09-25 04:07:15 字数 370 浏览 19 评论 0原文

以下情况可能吗？我想“反转”给予 antlr 的输入，并使每个标记成为前一个标记的子标记。

因此，对于输入（假设每个标记由“.”字符分隔）：

Stack.Overflow.Horse

我希望我的语法生成以下 AST：

Horse
  |---Overflow
         |---Stack

到目前为止，我已设法反转节点，但无法创建它们彼此的孩子：

function
 : ID PERIOD function
   -> function ID
 | ID
 ;

ID  : 'a'..'z'*
    ;

原文

Is the following even possible? I want to "reverse" the input given to antlr and make each token a child of the previous one.

So, for the input (Assume each token is separated by the '.' char) :

Stack.Overflow.Horse

I would like my grammar to produce the following AST:

Horse
  |---Overflow
         |---Stack

So far, I've managed to reverse the nodes, but I'm unable to make them children of each other:

function
 : ID PERIOD function
   -> function ID
 | ID
 ;

ID  : 'a'..'z'*
    ;

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

暮凉 2024-10-02 04:07:15

我认为没有一个简单的方法可以做到这一点。您可以像这样制定规则：

function
 : ID PERIOD function
   -> ^(function ID)
 | ID
 ;

但这只会使最后一个节点成为根节点，而所有其他节点成为其子节点。例如，以下源：

a.b.c.d.e

将生成以下树：

    e
 / / \ \
d c   b a

我看不到简单的修复方法，因为当您第一次解析 abcde 时，a 将是 ID 和 bcde 对 function 的递归调用：

a.b.c.d.e
| +-----+
|    |
|    `-----> function
|
`----------> ID

导致 bcde 将具有 a作为它的孩子。当 b 成为 ID 时，它也会作为子项添加到 a 旁边。在您的情况下，应将 a 作为子项删除，然后将其添加到 b 的子项列表中。但据我所知，这在 ANLTR 中是不可能的（至少在语法中不可能以干净的方式）。

编辑

好吧，作为一种解决方法，我想到了一些优雅的东西，但这并没有像我希望的那样起作用。因此，作为一个不太优雅的解决方案，您可以将 last 节点匹配为重写规则中的根：

function
  :  (id '.')* last=id -> ^($last)
  ;

然后在 中收集所有可能的前面节点（children） >使用 += 运算符列出：

function
  :  (children+=id '.')* last=id -> ^($last)
  ;

并在解析器中使用自定义成员方法将这些子元素“注入”到树的根中（进入在您的 List 中从右到左！）：

function
  :  (children+=id '.')* last=id {reverse($children, (CommonTree)$last.tree);} -> ^($last)
  ;

一个小演示：

grammar ReverseTree;

options {
  output=AST;
}

tokens {
  ROOT;
}

@members {
  private void reverse(List nodes, CommonTree root) {
    if(nodes == null) return;
    for(int i = nodes.size()-1; i >= 0; i--) {
      CommonTree temp = (CommonTree)nodes.get(i);
      root.addChild(temp);
      root = temp;
    }
  }
}

parse
  :  function+ EOF -> ^(ROOT function+)
  ;

function
  :  (children+=id '.')* last=id {reverse($children, (CommonTree)$last.tree);} -> ^($last)
  ;

id 
  :  ID
  ;

ID
  :  ('a'..'z' | 'A'..'Z')+
  ;

Space
  :  ' ' {skip();}
  ;

和一个小测试类：

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("a.b.c.d.e    Stack.Overflow.Horse    singleNode");
        ReverseTreeLexer lexer = new ReverseTreeLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ReverseTreeParser parser = new ReverseTreeParser(tokens);
        ReverseTreeParser.parse_return returnValue = parser.parse();
        CommonTree tree = (CommonTree)returnValue.getTree();
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
    }
}

它将生成一个如下所示的 AST：

alt text

（使用 http://graph.gafol.net)

对于输入字符串：

"a.b.c.d.e    Stack.Overflow.Horse    singleNode"

I don't think there's an easy way to do that. You could make your rule like this:

function
 : ID PERIOD function
   -> ^(function ID)
 | ID
 ;

but that only makes the last node the root and all other nodes its children. For example, the following source:

a.b.c.d.e

will result in the following tree:

    e
 / / \ \
d c   b a

I can't see an easy fix since when you first parse a.b.c.d.e, a will be the ID and b.c.d.e the recursive call to function:

a.b.c.d.e
| +-----+
|    |
|    `-----> function
|
`----------> ID

resulting in the fact that b.c.d.e will have a as its child. When then b becomes the ID, it too is added as a child next to a. In your case, a should be removed as a child and then added to the list of b's children. But AFAIK, that is not possible in ANLTR (at least, not in a clean way inside the grammar).

EDIT

Okay, as a work-around I had something elegant in mind, but that didn't work as I had hoped. So, as a less elegant solution, you could match the last node as the root in your rewrite rule:

function
  :  (id '.')* last=id -> ^($last)
  ;

and then collect all possible preceding nodes (children) in a List using the += operator:

function
  :  (children+=id '.')* last=id -> ^($last)
  ;

and use a custom member-method in the parser to "inject" these children into the root of your tree (going from right to left in your List!):

function
  :  (children+=id '.')* last=id {reverse($children, (CommonTree)$last.tree);} -> ^($last)
  ;

A little demo:

grammar ReverseTree;

options {
  output=AST;
}

tokens {
  ROOT;
}

@members {
  private void reverse(List nodes, CommonTree root) {
    if(nodes == null) return;
    for(int i = nodes.size()-1; i >= 0; i--) {
      CommonTree temp = (CommonTree)nodes.get(i);
      root.addChild(temp);
      root = temp;
    }
  }
}

parse
  :  function+ EOF -> ^(ROOT function+)
  ;

function
  :  (children+=id '.')* last=id {reverse($children, (CommonTree)$last.tree);} -> ^($last)
  ;

id 
  :  ID
  ;

ID
  :  ('a'..'z' | 'A'..'Z')+
  ;

Space
  :  ' ' {skip();}
  ;

And a little test class:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("a.b.c.d.e    Stack.Overflow.Horse    singleNode");
        ReverseTreeLexer lexer = new ReverseTreeLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ReverseTreeParser parser = new ReverseTreeParser(tokens);
        ReverseTreeParser.parse_return returnValue = parser.parse();
        CommonTree tree = (CommonTree)returnValue.getTree();
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
    }
}

which will produce an AST that looks like:

alt text

(image generated using http://graph.gafol.net)

For the input string:

"a.b.c.d.e    Stack.Overflow.Horse    singleNode"

回复收藏 0 原文

~没有更多了~

关于作者

梦旅人picnic

暂无简介

文章

27 人气

关注发私信

友情链接

文江博客

Antlr AST 产生（可能的）疯狂

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（1）

关于作者

相关话题

热门标签

推荐作者

佚名

今天

゛时过境迁

达拉崩吧

呆萌少年

孤者何惧

友情链接

Antlr AST 产生（可能的）疯狂

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（1）

关于作者

相关话题

热门标签

推荐作者

佚名

今天

゛时过境迁

达拉崩吧

呆萌少年

孤者何惧

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。