Antlr AST 产生(可能的)疯狂

发布于 2024-09-25 04:07:15 字数 370 浏览 9 评论 0原文

以下情况可能吗?我想“反转”给予 antlr 的输入,并使每个标记成为前一个标记的子标记。

因此,对于输入(假设每个标记由“.”字符分隔):

Stack.Overflow.Horse

我希望我的语法生成以下 AST:

Horse
  |---Overflow
         |---Stack

到目前为止,我已设法反转节点,但无法创建它们彼此的孩子:

function
 : ID PERIOD function
   -> function ID
 | ID
 ;

ID  : 'a'..'z'*
    ;

Is the following even possible? I want to "reverse" the input given to antlr and make each token a child of the previous one.

So, for the input (Assume each token is separated by the '.' char) :

Stack.Overflow.Horse

I would like my grammar to produce the following AST:

Horse
  |---Overflow
         |---Stack

So far, I've managed to reverse the nodes, but I'm unable to make them children of each other:

function
 : ID PERIOD function
   -> function ID
 | ID
 ;

ID  : 'a'..'z'*
    ;

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

暮凉 2024-10-02 04:07:15

我认为没有一个简单的方法可以做到这一点。您可以像这样制定规则:

function
 : ID PERIOD function
   -> ^(function ID)
 | ID
 ;

但这只会使最后一个节点成为根节点,而所有其他节点成为其子节点。例如,以下源:

a.b.c.d.e

将生成以下树:

    e
 / / \ \
d c   b a

我看不到简单的修复方法,因为当您第一次解析 abcde 时,a 将是 IDbcdefunction 的递归调用:

a.b.c.d.e
| +-----+
|    |
|    `-----> function
|
`----------> ID

导致 bcde 将具有 a作为它的孩子。当 b 成为 ID 时,它也会作为子项添加到 a 旁边。在您的情况下,应将 a 作为子项删除,然后将其添加到 b 的子项列表中。但据我所知,这在 ANLTR 中是不可能的(至少在语法中不可能以干净的方式)。


编辑

好吧,作为一种解决方法,我想到了一些优雅的东西,但这并没有像我希望的那样起作用。因此,作为一个不太优雅的解决方案,您可以将 last 节点匹配为重写规则中的根:

function
  :  (id '.')* last=id -> ^($last)
  ;

然后在 中收集所有可能的前面节点(children) >使用 += 运算符列出

function
  :  (children+=id '.')* last=id -> ^($last)
  ;

并在解析器中使用自定义成员方法将这些子元素“注入”到树的根中(进入在您的 List 中从右到左!):

function
  :  (children+=id '.')* last=id {reverse($children, (CommonTree)$last.tree);} -> ^($last)
  ;

一个小演示:

grammar ReverseTree;

options {
  output=AST;
}

tokens {
  ROOT;
}

@members {
  private void reverse(List nodes, CommonTree root) {
    if(nodes == null) return;
    for(int i = nodes.size()-1; i >= 0; i--) {
      CommonTree temp = (CommonTree)nodes.get(i);
      root.addChild(temp);
      root = temp;
    }
  }
}

parse
  :  function+ EOF -> ^(ROOT function+)
  ;

function
  :  (children+=id '.')* last=id {reverse($children, (CommonTree)$last.tree);} -> ^($last)
  ;

id 
  :  ID
  ;

ID
  :  ('a'..'z' | 'A'..'Z')+
  ;

Space
  :  ' ' {skip();}
  ;

和一个小测试类:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("a.b.c.d.e    Stack.Overflow.Horse    singleNode");
        ReverseTreeLexer lexer = new ReverseTreeLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ReverseTreeParser parser = new ReverseTreeParser(tokens);
        ReverseTreeParser.parse_return returnValue = parser.parse();
        CommonTree tree = (CommonTree)returnValue.getTree();
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
    }
}

它将生成一个如下所示的 AST:

alt text

对于输入字符串:

"a.b.c.d.e    Stack.Overflow.Horse    singleNode"

I don't think there's an easy way to do that. You could make your rule like this:

function
 : ID PERIOD function
   -> ^(function ID)
 | ID
 ;

but that only makes the last node the root and all other nodes its children. For example, the following source:

a.b.c.d.e

will result in the following tree:

    e
 / / \ \
d c   b a

I can't see an easy fix since when you first parse a.b.c.d.e, a will be the ID and b.c.d.e the recursive call to function:

a.b.c.d.e
| +-----+
|    |
|    `-----> function
|
`----------> ID

resulting in the fact that b.c.d.e will have a as its child. When then b becomes the ID, it too is added as a child next to a. In your case, a should be removed as a child and then added to the list of b's children. But AFAIK, that is not possible in ANLTR (at least, not in a clean way inside the grammar).


EDIT

Okay, as a work-around I had something elegant in mind, but that didn't work as I had hoped. So, as a less elegant solution, you could match the last node as the root in your rewrite rule:

function
  :  (id '.')* last=id -> ^($last)
  ;

and then collect all possible preceding nodes (children) in a List using the += operator:

function
  :  (children+=id '.')* last=id -> ^($last)
  ;

and use a custom member-method in the parser to "inject" these children into the root of your tree (going from right to left in your List!):

function
  :  (children+=id '.')* last=id {reverse($children, (CommonTree)$last.tree);} -> ^($last)
  ;

A little demo:

grammar ReverseTree;

options {
  output=AST;
}

tokens {
  ROOT;
}

@members {
  private void reverse(List nodes, CommonTree root) {
    if(nodes == null) return;
    for(int i = nodes.size()-1; i >= 0; i--) {
      CommonTree temp = (CommonTree)nodes.get(i);
      root.addChild(temp);
      root = temp;
    }
  }
}

parse
  :  function+ EOF -> ^(ROOT function+)
  ;

function
  :  (children+=id '.')* last=id {reverse($children, (CommonTree)$last.tree);} -> ^($last)
  ;

id 
  :  ID
  ;

ID
  :  ('a'..'z' | 'A'..'Z')+
  ;

Space
  :  ' ' {skip();}
  ;

And a little test class:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("a.b.c.d.e    Stack.Overflow.Horse    singleNode");
        ReverseTreeLexer lexer = new ReverseTreeLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ReverseTreeParser parser = new ReverseTreeParser(tokens);
        ReverseTreeParser.parse_return returnValue = parser.parse();
        CommonTree tree = (CommonTree)returnValue.getTree();
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
    }
}

which will produce an AST that looks like:

alt text

For the input string:

"a.b.c.d.e    Stack.Overflow.Horse    singleNode"
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文