ANTLR:有一个简单的例子吗?

发布于 2024-08-15 14:19:11 字数 185 浏览 8 评论 0 原文

我想开始使用 ANTLR,但在花了几个小时查看 antlr.org 上的示例之后网站上,我仍然无法清楚地了解Java的语法过程。

有没有一些简单的例子,比如用 ANTLR 实现的四操作计算器,遍历解析器定义并一直到 Java 源代码?

I'd like to get started with ANTLR, but after spending a few hours reviewing the examples at the antlr.org site, I still can't get a clear understanding of the grammar to Java process.

Is there some simple example, something like a four-operations calculator implemented with ANTLR going through the parser definition and all the way to the Java source code?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

甜`诱少女 2024-08-22 14:19:11

注意:此答案适用于ANTLR3!如果您正在寻找ANTLR4示例,那么这个Q&A 演示了如何使用 ANTLR4 创建简单的表达式解析器和求值器。


您首先创建一个语法。下面是一个小语法,您可以使用它来计算使用 4 个基本数学运算符构建的表达式:+、-、* 和 /。您还可以使用括号对表达式进行分组。

请注意,此语法只是一个非常基本的语法:它不处理一元运算符(-1+9 中的减号)或像 0.99 这样的小数(没有前导数字),仅举两个缺点。这只是一个示例,您可以自己操作。

以下是语法文件 Exp.g 的内容:

grammar Exp;

/* This will be the entry point of our parser. */
eval
    :    additionExp EOF
    ;

/* Addition and subtraction have the lowest precedence. */
additionExp
    :    multiplyExp 
         ( '+' multiplyExp 
         | '-' multiplyExp
         )* 
    ;

/* Multiplication and division have a higher precedence. */
multiplyExp
    :    atomExp
         ( '*' atomExp 
         | '/' atomExp
         )* 
    ;

/* An expression atom is the smallest part of an expression: a number. Or 
   when we encounter parenthesis, we're making a recursive call back to the
   rule 'additionExp'. As you can see, an 'atomExp' has the highest precedence. */
atomExp
    :    Number
    |    '(' additionExp ')'
    ;

/* A number: can be an integer value, or a decimal value */
Number
    :    ('0'..'9')+ ('.' ('0'..'9')+)?
    ;

/* We're going to ignore all white space characters */
WS  
    :   (' ' | '\t' | '\r'| '\n') {$channel=HIDDEN;}
    ;

(解析器规则以小写字母开头,词法分析器规则以大写字母开头)

创建语法后,您可以想要从中生成一个解析器和词法分析器。下载 ANTLR jar 并将其存储在与语法文件相同的目录中。

在 shell/命令提示符下执行以下命令:

java -cp antlr-3.2.jar org.antlr.Tool Exp.g

它不应产生任何错误消息,并且文件 ExpLexer.javaExpParser.javaExp.tokens 现在应该已生成。

要查看是否一切正常,请创建此测试类:

import org.antlr.runtime.*;

public class ANTLRDemo {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("12*(5-6)");
        ExpLexer lexer = new ExpLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ExpParser parser = new ExpParser(tokens);
        parser.eval();
    }
}

并编译它:

// *nix/MacOS
javac -cp .:antlr-3.2.jar ANTLRDemo.java

// Windows
javac -cp .;antlr-3.2.jar ANTLRDemo.java

然后运行它:

// *nix/MacOS
java -cp .:antlr-3.2.jar ANTLRDemo

// Windows
java -cp .;antlr-3.2.jar ANTLRDemo

如果一切顺利,则不会将任何内容打印到控制台。这意味着解析器没有发现任何错误。当你将 "12*(5-6)" 更改为 "12*(5-6" ,然后重新编译并运行它,应该打印以下内容:

line 0:-1 mismatched input '<EOF>' expecting ')'

好的,现在我们想在语法中添加一些 Java 代码,以便解析器实际上执行一些有用的操作,可以通过在语法中放置 {} 来完成添加代码。其中包含一些简单的 Java 代码。

但首先:语法文件中的所有解析器规则都应该返回一个原始的 double 值,您可以通过在每个规则后添加 returns [double value] 来做到这一点:

grammar Exp;

eval returns [double value]
    :    additionExp
    ;

additionExp returns [double value]
    :    multiplyExp 
         ( '+' multiplyExp 
         | '-' multiplyExp
         )* 
    ;

// ...

这几乎不需要。解释:每个规则都应该返回一个 double 值。现在与返回值 double value 进行“交互”(它不在普通的 Java 代码块 {...} 内) code>) 从代码块内部,您需要在 value 前面添加一个美元符号:

grammar Exp;

/* This will be the entry point of our parser. */
eval returns [double value]                                                  
    :    additionExp { /* plain code block! */ System.out.println("value equals: "+$value); }
    ;
    
// ...

这是语法,但现在添加了 Java 代码:

grammar Exp;

eval returns [double value]
    :    exp=additionExp {$value = $exp.value;}
    ;

additionExp returns [double value]
    :    m1=multiplyExp       {$value =  $m1.value;} 
         ( '+' m2=multiplyExp {$value += $m2.value;} 
         | '-' m2=multiplyExp {$value -= $m2.value;}
         )* 
    ;

multiplyExp returns [double value]
    :    a1=atomExp       {$value =  $a1.value;}
         ( '*' a2=atomExp {$value *= $a2.value;} 
         | '/' a2=atomExp {$value /= $a2.value;}
         )* 
    ;

atomExp returns [double value]
    :    n=Number                {$value = Double.parseDouble($n.text);}
    |    '(' exp=additionExp ')' {$value = $exp.value;}
    ;

Number
    :    ('0'..'9')+ ('.' ('0'..'9')+)?
    ;

WS  
    :   (' ' | '\t' | '\r'| '\n') {$channel=HIDDEN;}
    ;

并且由于我们的 eval code> 规则现在返回一个双精度值,将 ANTLRDemo.java 更改为:

import org.antlr.runtime.*;

public class ANTLRDemo {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("12*(5-6)");
        ExpLexer lexer = new ExpLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ExpParser parser = new ExpParser(tokens);
        System.out.println(parser.eval()); // print the value
    }
}

再次(重新)从语法 (1) 生成新的词法分析器和解析器,编译所有类 (2) 并运行 ANTLRDemo (3):

// *nix/MacOS
java -cp antlr-3.2.jar org.antlr.Tool Exp.g   // 1
javac -cp .:antlr-3.2.jar ANTLRDemo.java      // 2
java -cp .:antlr-3.2.jar ANTLRDemo            // 3

// Windows
java -cp antlr-3.2.jar org.antlr.Tool Exp.g   // 1
javac -cp .;antlr-3.2.jar ANTLRDemo.java      // 2
java -cp .;antlr-3.2.jar ANTLRDemo            // 3

然后您将现在查看打印到控制台的表达式 12*(5-6) 的结果!

再次强调:这是一个非常简短的解释。我鼓励您浏览 ANTLR wiki 并阅读一些教程和/或玩一下我刚刚发布的内容。

祝你好运!

编辑:

这篇文章展示了如何扩展上面的示例,以便可以提供一个 Map 来保存所提供表达式中的变量。

为了让此代码与当前版本的 Antlr(2014 年 6 月)一起使用,我需要进行一些更改。 ANTLRStringStream 需要变为 ANTLRInputStream,返回值需要从 parser.eval() 更改为 parser.eval().value ,并且我需要删除末尾的 WS 子句,因为不再允许诸如 $channel 之类的属性值出现在词法分析器操作中。

Note: this answer is for ANTLR3! If you're looking for an ANTLR4 example, then this Q&A demonstrates how to create a simple expression parser, and evaluator using ANTLR4.


You first create a grammar. Below is a small grammar that you can use to evaluate expressions that are built using the 4 basic math operators: +, -, * and /. You can also group expressions using parenthesis.

Note that this grammar is just a very basic one: it does not handle unary operators (the minus in: -1+9) or decimals like .99 (without a leading number), to name just two shortcomings. This is just an example you can work on yourself.

Here's the contents of the grammar file Exp.g:

grammar Exp;

/* This will be the entry point of our parser. */
eval
    :    additionExp EOF
    ;

/* Addition and subtraction have the lowest precedence. */
additionExp
    :    multiplyExp 
         ( '+' multiplyExp 
         | '-' multiplyExp
         )* 
    ;

/* Multiplication and division have a higher precedence. */
multiplyExp
    :    atomExp
         ( '*' atomExp 
         | '/' atomExp
         )* 
    ;

/* An expression atom is the smallest part of an expression: a number. Or 
   when we encounter parenthesis, we're making a recursive call back to the
   rule 'additionExp'. As you can see, an 'atomExp' has the highest precedence. */
atomExp
    :    Number
    |    '(' additionExp ')'
    ;

/* A number: can be an integer value, or a decimal value */
Number
    :    ('0'..'9')+ ('.' ('0'..'9')+)?
    ;

/* We're going to ignore all white space characters */
WS  
    :   (' ' | '\t' | '\r'| '\n') {$channel=HIDDEN;}
    ;

(Parser rules start with a lower case letter, and lexer rules start with a capital letter)

After creating the grammar, you'll want to generate a parser and lexer from it. Download the ANTLR jar and store it in the same directory as your grammar file.

Execute the following command on your shell/command prompt:

java -cp antlr-3.2.jar org.antlr.Tool Exp.g

It should not produce any error message, and the files ExpLexer.java, ExpParser.java and Exp.tokens should now be generated.

To see if it all works properly, create this test class:

import org.antlr.runtime.*;

public class ANTLRDemo {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("12*(5-6)");
        ExpLexer lexer = new ExpLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ExpParser parser = new ExpParser(tokens);
        parser.eval();
    }
}

and compile it:

// *nix/MacOS
javac -cp .:antlr-3.2.jar ANTLRDemo.java

// Windows
javac -cp .;antlr-3.2.jar ANTLRDemo.java

and then run it:

// *nix/MacOS
java -cp .:antlr-3.2.jar ANTLRDemo

// Windows
java -cp .;antlr-3.2.jar ANTLRDemo

If all goes well, nothing is being printed to the console. This means the parser did not find any error. When you change "12*(5-6)" into "12*(5-6" and then recompile and run it, there should be printed the following:

line 0:-1 mismatched input '<EOF>' expecting ')'

Okay, now we want to add a bit of Java code to the grammar so that the parser actually does something useful. Adding code can be done by placing { and } inside your grammar with some plain Java code inside it.

But first: all parser rules in the grammar file should return a primitive double value. You can do that by adding returns [double value] after each rule:

grammar Exp;

eval returns [double value]
    :    additionExp
    ;

additionExp returns [double value]
    :    multiplyExp 
         ( '+' multiplyExp 
         | '-' multiplyExp
         )* 
    ;

// ...

which needs little explanation: every rule is expected to return a double value. Now to "interact" with the return value double value (which is NOT inside a plain Java code block {...}) from inside a code block, you'll need to add a dollar sign in front of value:

grammar Exp;

/* This will be the entry point of our parser. */
eval returns [double value]                                                  
    :    additionExp { /* plain code block! */ System.out.println("value equals: "+$value); }
    ;
    
// ...

Here's the grammar but now with the Java code added:

grammar Exp;

eval returns [double value]
    :    exp=additionExp {$value = $exp.value;}
    ;

additionExp returns [double value]
    :    m1=multiplyExp       {$value =  $m1.value;} 
         ( '+' m2=multiplyExp {$value += $m2.value;} 
         | '-' m2=multiplyExp {$value -= $m2.value;}
         )* 
    ;

multiplyExp returns [double value]
    :    a1=atomExp       {$value =  $a1.value;}
         ( '*' a2=atomExp {$value *= $a2.value;} 
         | '/' a2=atomExp {$value /= $a2.value;}
         )* 
    ;

atomExp returns [double value]
    :    n=Number                {$value = Double.parseDouble($n.text);}
    |    '(' exp=additionExp ')' {$value = $exp.value;}
    ;

Number
    :    ('0'..'9')+ ('.' ('0'..'9')+)?
    ;

WS  
    :   (' ' | '\t' | '\r'| '\n') {$channel=HIDDEN;}
    ;

and since our eval rule now returns a double, change your ANTLRDemo.java into this:

import org.antlr.runtime.*;

public class ANTLRDemo {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("12*(5-6)");
        ExpLexer lexer = new ExpLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ExpParser parser = new ExpParser(tokens);
        System.out.println(parser.eval()); // print the value
    }
}

Again (re) generate a fresh lexer and parser from your grammar (1), compile all classes (2) and run ANTLRDemo (3):

// *nix/MacOS
java -cp antlr-3.2.jar org.antlr.Tool Exp.g   // 1
javac -cp .:antlr-3.2.jar ANTLRDemo.java      // 2
java -cp .:antlr-3.2.jar ANTLRDemo            // 3

// Windows
java -cp antlr-3.2.jar org.antlr.Tool Exp.g   // 1
javac -cp .;antlr-3.2.jar ANTLRDemo.java      // 2
java -cp .;antlr-3.2.jar ANTLRDemo            // 3

and you'll now see the outcome of the expression 12*(5-6) printed to your console!

Again: this is a very brief explanation. I encourage you to browse the ANTLR wiki and read some tutorials and/or play a bit with what I just posted.

Good luck!

EDIT:

This post shows how to extend the example above so that a Map<String, Double> can be provided that holds variables in the provided expression.

To get this code working with a current version of Antlr (June 2014) I needed to make a few changes. ANTLRStringStream needed to become ANTLRInputStream, the returned value needed to change from parser.eval() to parser.eval().value, and I needed to remove the WS clause at the end, because attribute values such as $channel are no longer allowed to appear in lexer actions.

贱贱哒 2024-08-22 14:19:11

Gabriele Tomassetti 的ANTLR mega教程非常有帮助

它有语法示例,不同语言的访问者示例( Java、JavaScript、C# 和 Python)以及许多其他东西。强烈推荐。

编辑:Gabriele Tomassetti 关于 ANTLR 的其他有用文章

ANTLR mega tutorial by Gabriele Tomassetti is very helpful

It has grammar examples, examples of visitors in different languages (Java, JavaScript, C# and Python) and many other things. Highly recommended.

EDIT: other useful articles by Gabriele Tomassetti on ANTLR

〃安静 2024-08-22 14:19:11

对于 Antlr 4,java 代码生成过程如下: -

java -cp antlr-4.5.3-complete.jar org.antlr.v4.Tool Exp.g

相应地更新类路径中的 jar 名称。

For Antlr 4 the java code generation process is below:-

java -cp antlr-4.5.3-complete.jar org.antlr.v4.Tool Exp.g

Update your jar name in classpath accordingly.

天暗了我发光 2024-08-22 14:19:11

4.7.1 版本略有不同:
对于导入:

import org.antlr.v4.runtime.*;

对于主要部分 - 请注意 CharStreams:

CharStream in = CharStreams.fromString("12*(5-6)");
ExpLexer lexer = new ExpLexer(in);
CommonTokenStream tokens = new CommonTokenStream(lexer);
ExpParser parser = new ExpParser(tokens);

version 4.7.1 was slightly different :
for import:

import org.antlr.v4.runtime.*;

for the main segment - note the CharStreams:

CharStream in = CharStreams.fromString("12*(5-6)");
ExpLexer lexer = new ExpLexer(in);
CommonTokenStream tokens = new CommonTokenStream(lexer);
ExpParser parser = new ExpParser(tokens);
薄荷梦 2024-08-22 14:19:11

https://github.com/BITPlan/com.bitplan.antlr 您'您将找到一个 ANTLR java 库,其中包含一些有用的帮助器类和一些完整的示例。如果您喜欢 Eclipse 和 Maven,它已经准备好与 Maven 一起使用。

https: //github.com/BITPlan/com.bitplan.antlr/blob/master/src/main/antlr4/com/bitplan/exp/Exp.g4

是一种简单的表达式语言,可以进行乘法和加法运算。
https: //github.com/BITPlan/com.bitplan.antlr/blob/master/src/test/java/com/bitplan/antlr/TestExpParser.java 有相应的单元测试。

https: //github.com/BITPlan/com.bitplan.antlr/blob/master/src/main/antlr4/com/bitplan/iri/IRIParser.g4 是一个 IRI 解析器,已分为三个部分:

  1. 解析器语法 词法
  2. 分析器语法
  3. 导入的 LexBasic 语法

https://github.com/BITPlan/com.bitplan.antlr/blob/master/src/test/java/com/bitplan/antlr/TestIRIParser.java
有它的单元测试。

就我个人而言,我发现这是最棘手的部分。请参阅 http://wiki.bitplan.com/index.php/ANTLR_maven_plugin

https://github.com/ BITPlan/com.bitplan.antlr/tree/master/src/main/antlr4/com/bitplan/expr

包含另外三个示例,这些示例是针对早期版本中 ANTLR4 的性能问题而创建的。与此同时,这个问题已作为测试用例 https://github.com/BITPlan/com.bitplan.antlr/blob/master/src/test/java/com/bitplan/antlr/TestIssue994.java 显示。

At https://github.com/BITPlan/com.bitplan.antlr you'll find an ANTLR java library with some useful helper classes and a few complete examples. It's ready to be used with maven and if you like eclipse and maven.

https://github.com/BITPlan/com.bitplan.antlr/blob/master/src/main/antlr4/com/bitplan/exp/Exp.g4

is a simple Expression language that can do multiply and add operations.
https://github.com/BITPlan/com.bitplan.antlr/blob/master/src/test/java/com/bitplan/antlr/TestExpParser.java has the corresponding unit tests for it.

https://github.com/BITPlan/com.bitplan.antlr/blob/master/src/main/antlr4/com/bitplan/iri/IRIParser.g4 is an IRI parser that has been split into the three parts:

  1. parser grammar
  2. lexer grammar
  3. imported LexBasic grammar

https://github.com/BITPlan/com.bitplan.antlr/blob/master/src/test/java/com/bitplan/antlr/TestIRIParser.java
has the unit tests for it.

Personally I found this the most tricky part to get right. See http://wiki.bitplan.com/index.php/ANTLR_maven_plugin

https://github.com/BITPlan/com.bitplan.antlr/tree/master/src/main/antlr4/com/bitplan/expr

contains three more examples that have been created for a performance issue of ANTLR4 in an earlier version. In the meantime this issues has been fixed as the testcase https://github.com/BITPlan/com.bitplan.antlr/blob/master/src/test/java/com/bitplan/antlr/TestIssue994.java shows.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文