Building your own C# compiler with ANTLR: the compilation unit

Posted on 2024-08-01 15:51:29

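// Create a lexer that reads from the input file passed to us
CSLexer lexer = new CSLexer(new ANTLRFileStream(f));

// Feed the lexer's tokens into a token stream
// (tokens was not declared in the original snippet; CommonTokenStream is assumed)
CommonTokenStream tokens = new CommonTokenStream();
tokens.TokenSource = lexer;

// Create a parser that reads from the token stream
CSParser parser = new CSParser(tokens);

// Start parsing at the compilation_unit rule
CSParser.compilation_unit_return x = parser.compilation_unit();
object ast = x.Tree;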

What can I do with x, which is of type compilation_unit_return, to extract its root, its classes, its methods, and so on? Do I have to get its Adaptor out first? How do I do that? Note that compilation_unit_return is defined like this in my CSParser (which is generated automatically by ANTLR):

public class compilation_unit_return : ParserRuleReturnScope
    {
        private object tree;
        override public object Tree
        {
            get { return tree; }
            set { tree = (object) value; }
        }
    };

However, the tree I get back is of type object. Running under the debugger, it appears to be of type BaseTree. But BaseTree is an interface! I don't know how the tree relates to BaseTree, and I don't know how to extract details from it.

I need to write a visitor that visits its classes, methods, variables, and so on. The ParserRuleReturnScope class extends RuleReturnScope and has start and stop objects, and I don't know what they are.

Furthermore, the TreeVisitor class provided by ANTLR looks confusing. It requires an Adaptor to be passed to its constructor (otherwise it uses the default CommonTreeAdaptor), which is why I asked earlier how to obtain the Adaptor. There are other issues too. For the API, you can refer to http://www.antlr.org/api/CSharp/annotated.html

Comments (3)

薄荷梦 2024-08-08 15:51:29

I have never worked with ANTLR from C#, but following your link to the API, BaseTree is clearly not an interface. It is a class, and it has public properties: Type to get the type of the node, Text to get (I assume) the source text corresponding to it, and Children to get the child nodes. What else do you need to walk it?
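
The walk this answer describes can be sketched as a plain recursive traversal. The sketch below is an illustration only: Node is a hypothetical stand-in that mirrors just the Type, Text and Children members named above, so that the example compiles without the ANTLR runtime. With a real parse result you would first cast x.Tree to the runtime's tree class (CommonTree if the default CommonTreeAdaptor built the tree, which is an assumption about your setup) and walk that instead.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for an ANTLR tree node. It only mirrors the
// Type, Text and Children members mentioned in the answer.
public sealed class Node
{
    public int Type { get; }
    public string Text { get; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(int type, string text, params Node[] children)
    {
        Type = type;
        Text = text;
        Children.AddRange(children);
    }
}

public static class TreeDump
{
    // Depth-first, pre-order walk that renders one line per node,
    // indented by two spaces per level of depth.
    public static string Dump(Node root)
    {
        var sb = new StringBuilder();
        Walk(root, 0, sb);
        return sb.ToString();
    }

    private static void Walk(Node node, int depth, StringBuilder sb)
    {
        sb.Append(' ', depth * 2).Append(node.Text).Append('\n');
        foreach (Node child in node.Children)
        {
            Walk(child, depth + 1, sb);
        }
    }
}
```

A visitor for classes, methods and variables follows the same shape: inside Walk, switch on node.Type against the token type constants of your generated parser and handle the node kinds you care about.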

╰沐子 2024-08-08 15:51:29

If I were going to make a C# compiler today, here's what I would try as a first attempt:

  1. Start with the ANTLR C# 3 target (of course I'm biased here; seriously, you can use either the CSharp2 or CSharp3 target).
  2. Get Visual Studio 2010 with the .NET Framework 4. The key here is .NET 4 and its sweet new expression trees.
  3. Build a basic combined parser. Put as little logic in the parser as possible. It should have few (if any) actions, and the output should be an undecorated AST that can be walked with an LL(1) walker.
  4. Build a tree grammar to walk the tree and identify all declared types. It should also keep the member_declaration sub-trees for later use.
  5. Build a tree walker that walks a single member_declaration and adds the member to the TypeBuilder. Keep track of the method bodies, but don't deep-walk them yet.
  6. Build a tree walker that walks the body of a method. Generate an Expression<TDelegate> matching the method, and use my own API (originally the CompileToMethod method; see Pavel's and my comments) to generate the IL code.

If you do things in this order, then when you finally parse the expressions (method bodies, field initializers), you can use the string-parameterized methods in the Expression class to save work resolving members.
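
As a rough illustration of the last point, here is a minimal sketch of building expression trees with the string-parameterized factory methods, so the member lookup is done by name rather than by hand. The class and method names (ExpressionSketch, BuildLengthPlusOne, BuildToUpper) are made up for this example, and it compiles the lambdas to delegates with Compile() rather than emitting into a TypeBuilder as the steps above describe.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionSketch
{
    // Builds the equivalent of: (string s) => s.Length + 1
    public static Func<string, int> BuildLengthPlusOne()
    {
        ParameterExpression s = Expression.Parameter(typeof(string), "s");

        // Expression.Property(Expression, string) resolves the property by name.
        MemberExpression length = Expression.Property(s, "Length");
        BinaryExpression body = Expression.Add(length, Expression.Constant(1));

        return Expression.Lambda<Func<string, int>>(body, s).Compile();
    }

    // Builds the equivalent of: (string s) => s.ToUpperInvariant()
    public static Func<string, string> BuildToUpper()
    {
        ParameterExpression s = Expression.Parameter(typeof(string), "s");

        // Expression.Call(Expression, string, Type[], params Expression[])
        // resolves the method by name; the type-argument array is null
        // because ToUpperInvariant is not generic.
        MethodCallExpression call = Expression.Call(s, "ToUpperInvariant", null);

        return Expression.Lambda<Func<string, string>>(call, s).Compile();
    }
}
```

In a compiler, the names passed to these factories would come from identifier nodes in the method-body tree instead of string literals.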

秋日私语 2024-08-08 15:51:29

You can set the AST tree type in the grammar options at the top of the file, like so:

tree grammar CSharpTree;
options {
    ASTLabelType = CommonTree;
}

I would build a third grammar, or work it into your existing parser grammar, that turns the tree into classes that you create. For example, assume you have a rule that matches the plus operator and its two arguments. You can define a rule matching that tree that creates an instance of a class you have written (let's call it PlusExpression), like this:

plusExpr returns [PlusExpression value]
   : ^(PLUS left=expr right=expr) { $value = new PlusExpression($left.value, $right.value); }
   ;

expr would be another rule in your grammar that matches expressions. left and right are just labels given to the matched subtrees. The part between the { } is turned into C# code pretty much verbatim, except that the variable references are replaced. The .value property on $left and $right comes from the return value declared by the rule that produced them.
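
For reference, the C# side of such a rule might look like the sketch below. Only the name PlusExpression comes from the answer; the Expr base class, the IntLiteral node and the Evaluate method are hypothetical additions that make the example self-contained.

```csharp
using System;

// Hypothetical base type for the classes that the tree grammar actions create.
public abstract class Expr
{
    public abstract int Evaluate();
}

// Hypothetical leaf node holding an integer constant.
public sealed class IntLiteral : Expr
{
    private readonly int value;

    public IntLiteral(int value)
    {
        this.value = value;
    }

    public override int Evaluate()
    {
        return value;
    }
}

// The class constructed by the plusExpr action:
// new PlusExpression($left.value, $right.value)
public sealed class PlusExpression : Expr
{
    public Expr Left { get; }
    public Expr Right { get; }

    public PlusExpression(Expr left, Expr right)
    {
        if (left == null) throw new ArgumentNullException(nameof(left));
        if (right == null) throw new ArgumentNullException(nameof(right));
        Left = left;
        Right = right;
    }

    public override int Evaluate()
    {
        return Left.Evaluate() + Right.Evaluate();
    }
}
```

For this to line up with the grammar, the expr rule would need to declare a matching return value (for example returns [Expr value]), so that $left.value and $right.value have the type the constructor expects.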
