使用ANTLR 3.3?

发布于 2024-10-06 10:56:05 字数 322 浏览 4 评论 0原文

我正在尝试开始使用 ANTLR 和 C#,但由于缺乏文档/教程,我发现它非常困难。我发现了一些针对旧版本的半心半意的教程,但此后 API 似乎发生了一些重大变化。

谁能给我一个简单的例子来说明如何创建语法并在短程序中使用它?

我终于成功地将我的语法文件编译到词法分析器和解析器中,并且我可以在 Visual Studio 中编译并运行这些文件(在必须重新编译 ANTLR 源之后,因为 C# 二进制文件似乎也已过时!--更不用说如果没有一些修复,源代码就无法编译),但我仍然不知道如何处理我的解析器/词法分析器类。据说它可以在给定一些输入的情况下生成 AST……然后我应该能够用它做一些奇特的事情。

I'm trying to get started with ANTLR and C# but I'm finding it extraordinarily difficult due to the lack of documentation/tutorials. I've found a couple half-hearted tutorials for older versions, but it seems there have been some major changes to the API since.

Can anyone give me a simple example of how to create a grammar and use it in a short program?

I've finally managed to get my grammar file compiling into a lexer and parser, and I can get those compiled and running in Visual Studio (after having to recompile the ANTLR source because the C# binaries seem to be out of date too! -- not to mention the source doesn't compile without some fixes), but I still have no idea what to do with my parser/lexer classes. Supposedly it can produce an AST given some input...and then I should be able to do something fancy with that.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

夜吻♂芭芘 2024-10-13 10:56:05

假设您想要解析由以下标记组成的简单表达式:

  • - 减法(也是一元);
  • + 加法;
  • * 乘法;
  • /除法;
  • (...) 分组(子)表达式;
  • 整数和小数。

ANTLR 语法可能如下所示:

grammar Expression;

options {
  language=CSharp2;
}

parse
  :  exp EOF 
  ;

exp
  :  addExp
  ;

addExp
  :  mulExp (('+' | '-') mulExp)*
  ;

mulExp
  :  unaryExp (('*' | '/') unaryExp)*
  ;

unaryExp
  :  '-' atom 
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')' 
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

现在要创建正确的 AST,请在 options { ... } 部分中添加 output=AST;,然后混合一些“语法中的“树运算符”定义哪些标记应该是树的根。有两种方法可以执行此操作:

  1. 在令牌后添加 ^!^ 使令牌成为根,而 ! 将令牌从 ast 中排除;
  2. 通过使用“重写规则”: ... -> ^(根孩子孩子...)

以规则 foo 为例:

foo
  :  TokenA TokenB TokenC TokenD
  ;

假设您希望 TokenB 成为根,而 TokenATokenC 成为根成为其子级,并且您希望从树中排除 TokenD。以下是使用选项 1 执行此操作的方法:

foo
  :  TokenA TokenB^ TokenC TokenD!
  ;

以下是使用选项 2 执行此操作的方法:

foo
  :  TokenA TokenB TokenC TokenD -> ^(TokenB TokenA TokenC)
  ;

因此,这是包含树运算符的语法:

grammar Expression;

options {
  language=CSharp2;
  output=AST;
}

tokens {
  ROOT;
  UNARY_MIN;
}

@parser::namespace { Demo.Antlr }
@lexer::namespace { Demo.Antlr }

parse
  :  exp EOF -> ^(ROOT exp)
  ;

exp
  :  addExp
  ;

addExp
  :  mulExp (('+' | '-')^ mulExp)*
  ;

mulExp
  :  unaryExp (('*' | '/')^ unaryExp)*
  ;

unaryExp
  :  '-' atom -> ^(UNARY_MIN atom)
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')' -> exp
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Space 
  :  (' ' | '\t' | '\r' | '\n'){Skip();}
  ;

我还添加了一个 Space 规则来忽略中的任何空格源文件并为词法分析器和解析器添加了一些额外的标记和命名空间。请注意,顺序很重要(首先是 options { ... },然后是 tokens { ... },最后是 @... {}-命名空间声明)。

就是这样。

现在从语法文件生成词法分析器和解析器:

java -cp antlr-3.2.jar org.antlr.Tool Expression.g

并将 .cs 文件与 C# 运行时 DLL

您可以使用以下类对其进行测试:

using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;
using Antlr.StringTemplate;

namespace Demo.Antlr
{
  class MainClass
  {
    public static void Preorder(ITree Tree, int Depth) 
    {
      if(Tree == null)
      {
        return;
      }

      for (int i = 0; i < Depth; i++)
      {
        Console.Write("  ");
      }

      Console.WriteLine(Tree);

      Preorder(Tree.GetChild(0), Depth + 1);
      Preorder(Tree.GetChild(1), Depth + 1);
    }

    public static void Main (string[] args)
    {
      ANTLRStringStream Input = new ANTLRStringStream("(12.5 + 56 / -7) * 0.5"); 
      ExpressionLexer Lexer = new ExpressionLexer(Input);
      CommonTokenStream Tokens = new CommonTokenStream(Lexer);
      ExpressionParser Parser = new ExpressionParser(Tokens);
      ExpressionParser.parse_return ParseReturn = Parser.parse();
      CommonTree Tree = (CommonTree)ParseReturn.Tree;
      Preorder(Tree, 0);
    }
  }
}

产生以下输出:

ROOT
  *
    +
      12.5
      /
        56
        UNARY_MIN
          7
    0.5

对应于以下 AST:

alt text

(使用 graph.gafol.net 创建的图表)

请注意,ANTLR 3.3 刚刚发布CSharp 目标处于“测试阶段”。这就是我在示例中使用 ANTLR 3.2 的原因。

对于相当简单的语言(如上面的示例),您还可以即时评估结果,而无需创建 AST。您可以通过在语法文件中嵌入纯 C# 代码并让解析器规则返回特定值来实现此目的。

这是一个示例:

grammar Expression;

options {
  language=CSharp2;
}

@parser::namespace { Demo.Antlr }
@lexer::namespace { Demo.Antlr }

parse returns [double value]
  :  exp EOF {$value = $exp.value;}
  ;

exp returns [double value]
  :  addExp {$value = $addExp.value;}
  ;

addExp returns [double value]
  :  a=mulExp       {$value = $a.value;}
     ( '+' b=mulExp {$value += $b.value;}
     | '-' b=mulExp {$value -= $b.value;}
     )*
  ;

mulExp returns [double value]
  :  a=unaryExp       {$value = $a.value;}
     ( '*' b=unaryExp {$value *= $b.value;}
     | '/' b=unaryExp {$value /= $b.value;}
     )*
  ;

unaryExp returns [double value]
  :  '-' atom {$value = -1.0 * $atom.value;}
  |  atom     {$value = $atom.value;}
  ;

atom returns [double value]
  :  Number      {$value = Double.Parse($Number.Text, CultureInfo.InvariantCulture);}
  |  '(' exp ')' {$value = $exp.value;}
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Space 
  :  (' ' | '\t' | '\r' | '\n'){Skip();}
  ;

可以使用类进行测试:

using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;
using Antlr.StringTemplate;

namespace Demo.Antlr
{
  class MainClass
  {
    public static void Main (string[] args)
    {
      string expression = "(12.5 + 56 / -7) * 0.5";
      ANTLRStringStream Input = new ANTLRStringStream(expression);  
      ExpressionLexer Lexer = new ExpressionLexer(Input);
      CommonTokenStream Tokens = new CommonTokenStream(Lexer);
      ExpressionParser Parser = new ExpressionParser(Tokens);
      Console.WriteLine(expression + " = " + Parser.parse());
    }
  }
}

并产生以下输出:

(12.5 + 56 / -7) * 0.5 = 2.25

编辑

Ralph 在评论中写道:

对于使用 Visual Studio 的用户的提示:您可以输入类似 java -cp "$(ProjectDir)antlr-3.2.jar" org.antlr.Tool "$(ProjectDir)Expression.g"在预构建事件中,您只需修改语法并运行项目,而不必担心重建词法分析器/解析器。

Let's say you want to parse simple expressions consisting of the following tokens:

  • - subtraction (also unary);
  • + addition;
  • * multiplication;
  • / division;
  • (...) grouping (sub) expressions;
  • integer and decimal numbers.

An ANTLR grammar could look like this:

grammar Expression;

options {
  language=CSharp2;
}

parse
  :  exp EOF 
  ;

exp
  :  addExp
  ;

addExp
  :  mulExp (('+' | '-') mulExp)*
  ;

mulExp
  :  unaryExp (('*' | '/') unaryExp)*
  ;

unaryExp
  :  '-' atom 
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')' 
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Now to create a proper AST, you add output=AST; in your options { ... } section, and you mix some "tree operators" in your grammar defining which tokens should be the root of a tree. There are two ways to do this:

  1. add ^ and ! after your tokens. The ^ causes the token to become a root and the ! excludes the token from the ast;
  2. by using "rewrite rules": ... -> ^(Root Child Child ...).

Take the rule foo for example:

foo
  :  TokenA TokenB TokenC TokenD
  ;

and let's say you want TokenB to become the root and TokenA and TokenC to become its children, and you want to exclude TokenD from the tree. Here's how to do that using option 1:

foo
  :  TokenA TokenB^ TokenC TokenD!
  ;

and here's how to do that using option 2:

foo
  :  TokenA TokenB TokenC TokenD -> ^(TokenB TokenA TokenC)
  ;

So, here's the grammar with the tree operators in it:

grammar Expression;

options {
  language=CSharp2;
  output=AST;
}

tokens {
  ROOT;
  UNARY_MIN;
}

@parser::namespace { Demo.Antlr }
@lexer::namespace { Demo.Antlr }

parse
  :  exp EOF -> ^(ROOT exp)
  ;

exp
  :  addExp
  ;

addExp
  :  mulExp (('+' | '-')^ mulExp)*
  ;

mulExp
  :  unaryExp (('*' | '/')^ unaryExp)*
  ;

unaryExp
  :  '-' atom -> ^(UNARY_MIN atom)
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')' -> exp
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Space 
  :  (' ' | '\t' | '\r' | '\n'){Skip();}
  ;

I also added a Space rule to ignore any white spaces in the source file and added some extra tokens and namespaces for the lexer and parser. Note that the order is important (options { ... } first, then tokens { ... } and finally the @... {}-namespace declarations).

That's it.

Now generate a lexer and parser from your grammar file:

java -cp antlr-3.2.jar org.antlr.Tool Expression.g

and put the .cs files in your project together with the C# runtime DLL's.

You can test it using the following class:

using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;
using Antlr.StringTemplate;

namespace Demo.Antlr
{
  class MainClass
  {
    public static void Preorder(ITree Tree, int Depth) 
    {
      if(Tree == null)
      {
        return;
      }

      for (int i = 0; i < Depth; i++)
      {
        Console.Write("  ");
      }

      Console.WriteLine(Tree);

      Preorder(Tree.GetChild(0), Depth + 1);
      Preorder(Tree.GetChild(1), Depth + 1);
    }

    public static void Main (string[] args)
    {
      ANTLRStringStream Input = new ANTLRStringStream("(12.5 + 56 / -7) * 0.5"); 
      ExpressionLexer Lexer = new ExpressionLexer(Input);
      CommonTokenStream Tokens = new CommonTokenStream(Lexer);
      ExpressionParser Parser = new ExpressionParser(Tokens);
      ExpressionParser.parse_return ParseReturn = Parser.parse();
      CommonTree Tree = (CommonTree)ParseReturn.Tree;
      Preorder(Tree, 0);
    }
  }
}

which produces the following output:

ROOT
  *
    +
      12.5
      /
        56
        UNARY_MIN
          7
    0.5

which corresponds to the following AST:

alt text

(diagram created using graph.gafol.net)

Note that ANTLR 3.3 has just been released and the CSharp target is "in beta". That's why I used ANTLR 3.2 in my example.

In case of rather simple languages (like my example above), you could also evaluate the result on the fly without creating an AST. You can do that by embedding plain C# code inside your grammar file, and letting your parser rules return a specific value.

Here's an example:

grammar Expression;

options {
  language=CSharp2;
}

@parser::namespace { Demo.Antlr }
@lexer::namespace { Demo.Antlr }

parse returns [double value]
  :  exp EOF {$value = $exp.value;}
  ;

exp returns [double value]
  :  addExp {$value = $addExp.value;}
  ;

addExp returns [double value]
  :  a=mulExp       {$value = $a.value;}
     ( '+' b=mulExp {$value += $b.value;}
     | '-' b=mulExp {$value -= $b.value;}
     )*
  ;

mulExp returns [double value]
  :  a=unaryExp       {$value = $a.value;}
     ( '*' b=unaryExp {$value *= $b.value;}
     | '/' b=unaryExp {$value /= $b.value;}
     )*
  ;

unaryExp returns [double value]
  :  '-' atom {$value = -1.0 * $atom.value;}
  |  atom     {$value = $atom.value;}
  ;

atom returns [double value]
  :  Number      {$value = Double.Parse($Number.Text, CultureInfo.InvariantCulture);}
  |  '(' exp ')' {$value = $exp.value;}
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Space 
  :  (' ' | '\t' | '\r' | '\n'){Skip();}
  ;

which can be tested with the class:

using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;
using Antlr.StringTemplate;

namespace Demo.Antlr
{
  class MainClass
  {
    public static void Main (string[] args)
    {
      string expression = "(12.5 + 56 / -7) * 0.5";
      ANTLRStringStream Input = new ANTLRStringStream(expression);  
      ExpressionLexer Lexer = new ExpressionLexer(Input);
      CommonTokenStream Tokens = new CommonTokenStream(Lexer);
      ExpressionParser Parser = new ExpressionParser(Tokens);
      Console.WriteLine(expression + " = " + Parser.parse());
    }
  }
}

and produces the following output:

(12.5 + 56 / -7) * 0.5 = 2.25

EDIT

In the comments, Ralph wrote:

Tip for those using Visual Studio: you can put something like java -cp "$(ProjectDir)antlr-3.2.jar" org.antlr.Tool "$(ProjectDir)Expression.g" in the pre-build events, then you can just modify your grammar and run the project without having to worry about rebuilding the lexer/parser.

千紇 2024-10-13 10:56:05

您看过Irony.net吗?它针对 .Net,因此工作得非常好,有适当的工具、适当的示例并且可以正常工作。唯一的问题是它仍然有点“alpha-ish”,所以文档和版本似乎有点变化,但如果你坚持使用一个版本,你可以做一些漂亮的事情。

ps 很抱歉这个错误的答案,当你问一个关于 X 的问题时,有人使用 Y 提出了不同的建议;^)

Have you looked at Irony.net? It's aimed at .Net and therefore works really well, has proper tooling, proper examples and just works. The only problem is that it is still a bit 'alpha-ish' so documentation and versions seem to change a bit, but if you just stick with a version, you can do nifty things.

p.s. sorry for the bad answer where you ask a problem about X and someone suggests something different using Y ;^)

一场信仰旅途 2024-10-13 10:56:05

我个人的经验是,在学习 C#/.NET 上的 ANTLR 之前,应该腾出足够的时间来学习 Java 上的 ANTLR。这为您提供了所有构建块的知识,稍后您可以在 C#/.NET 上应用。

我最近写了一些博客文章,

假设您熟悉 Java 上的 ANTLR,并准备好将语法文件迁移到 C#/.NET。

My personal experience is that before learning ANTLR on C#/.NET, you should spare enough time to learn ANTLR on Java. That gives you knowledge on all the building blocks and later you can apply on C#/.NET.

I wrote a few blog posts recently,

The assumption is that you are familiar with ANTLR on Java and is ready to migrate your grammar file to C#/.NET.

感情旳空白 2024-10-13 10:56:05

这里有一篇关于如何一起使用 antlr 和 C# 的精彩文章:

http://www. codeproject.com/KB/recipes/sota_expression_evaluator.aspx

这是 NCalc 创建者的一篇“如何完成”文章,NCalc 是 C# 的数学表达式求值器 - http://ncalc.codeplex.com

您还可以在此处下载 NCalc 的语法:
http://ncalc.codeplex.com/SourceControl/changeset/view /914d819f2865#Grammar%2fNCalc.g

NCalc 工作原理的示例:

Expression e = new Expression("Round(Pow(Pi, 2) + Pow([Pi2], 2) + X, 2)"); 

  e.Parameters["Pi2"] = new Expression("Pi * Pi"); 
  e.Parameters["X"] = 10; 

  e.EvaluateParameter += delegate(string name, ParameterArgs args) 
    { 
      if (name == "Pi") 
      args.Result = 3.14; 
    }; 

  Debug.Assert(117.07 == e.Evaluate()); 

希望有帮助

There is a great article on how to use antlr and C# together here:

http://www.codeproject.com/KB/recipes/sota_expression_evaluator.aspx

it's a "how it was done" article by the creator of NCalc which is a mathematical expression evaluator for C# - http://ncalc.codeplex.com

You can also download the grammar for NCalc here:
http://ncalc.codeplex.com/SourceControl/changeset/view/914d819f2865#Grammar%2fNCalc.g

example of how NCalc works:

Expression e = new Expression("Round(Pow(Pi, 2) + Pow([Pi2], 2) + X, 2)"); 

  e.Parameters["Pi2"] = new Expression("Pi * Pi"); 
  e.Parameters["X"] = 10; 

  e.EvaluateParameter += delegate(string name, ParameterArgs args) 
    { 
      if (name == "Pi") 
      args.Result = 3.14; 
    }; 

  Debug.Assert(117.07 == e.Evaluate()); 

hope its helpful

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文