Two basic ANTLR questions

Posted on 2024-12-06 10:36:57

I'm trying to use ANTLR to take a simple grammar and produce assembly output. My language of choice in ANTLR is Python.

Many tutorials seem very complicated or elaborate on things that aren't relevant to me; I only really need some very simple functionality. So I have two questions:

'Returning' values from one rule to another.

So let's say I have a rule like:

assignment: name=IDENTIFIER ASSIGNMENT expression;

I can run Python code in {}s when this rule is recognised, and I can pass args to the Python code for expression by doing something like:

assignment: name=IDENTIFIER ASSIGNMENT expression[variablesList];

and then

expression[variablesList]: blah blah

But how do I 'return' a value to my original rule? E.g. how do I calculate the value of the expression and then send it back to my assignment rule to use in Python there?

How do I write out my target language code?

So I have some Python which runs when the rules are recognised, then I calculate the assembly I want that statement to produce. But how do I say "write out this string of assembly instructions to my target file"?

Any good tutorials that are relevant to this kind of stuff (attribute grammars, compiling to something other than an AST, etc.) would be helpful too. If my questions don't make too much sense, please ask me to clarify; I'm having a hard time wrapping my head around ANTLR.

Comments (1)

疧_╮線 2024-12-13 10:36:57


Returning values from one rule to another

Let's say you want to parse simple expressions and supply, at runtime, a map of variables that these expressions may use. A simple grammar with custom Python code, a returns clause on each rule, and a parameter vars on the entry point of your grammar could look like this:

grammar T;

options {
  language=Python;
}

@members {
  variables = {}
}

parse_with [vars] returns [value]
@init{self.variables = vars}
  :  expression EOF                            {value = $expression.value}
  ;

expression returns [value]
  :  addition                                  {value = $addition.value}
  ;

addition returns [value]
  :  e1=multiplication                         {value = $e1.value}
                       ( '+' e2=multiplication {value = value + $e2.value}
                       | '-' e2=multiplication {value = value - $e2.value}
                       )*
  ;

multiplication returns [value]
  :  e1=unary                                  {value = $e1.value}
              ( '*' e2=unary                   {value = value * $e2.value}
              | '/' e2=unary                   {value = value / $e2.value}
              )*
  ;

unary returns [value]
  :  '-' atom                                  {value = -1 * $atom.value}
  |  atom                                      {value = $atom.value}
  ;

atom returns [value]
  :  Number                                    {value = float($Number.text)}
  |  ID                                        {value = self.variables[$ID.text]}
  |  '(' expression ')'                        {value = $expression.value}
  ;

Number : '0'..'9'+ ('.' '0'..'9'+)?;
ID     : ('a'..'z' | 'A'..'Z')+;
Space  : ' ' {$channel=HIDDEN};

If you now generate a parser using ANTLR v3.1.3 (no later version!):

java -cp antlr-3.1.3.jar org.antlr.Tool T.g

and run the script:

#!/usr/bin/env python
import antlr3
from antlr3 import *
from TLexer import *
from TParser import *

input = 'a + (1.0 + 2) * 3'
lexer = TLexer(antlr3.ANTLRStringStream(input))
parser = TParser(antlr3.CommonTokenStream(lexer))
print '{0} = {1}'.format(input, parser.parse_with({'a':42}))

you will see the following output being printed:

a + (1.0 + 2) * 3 = 51.0
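If it helps to see how the values flow, here is a hand-written recursive-descent sketch in plain Python that mirrors the grammar above. It is not the code ANTLR generates, and the class name MiniParser and its helper methods are made up for this illustration. The point is only that each rule behaves like a method: the returns [value] clause corresponds to the method's return value, and $e1.value corresponds to the result of calling the sub-rule.

```python
import re

# One token per match: a number, a name, or any other single character.
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z]+)|(.))")


class MiniParser:
    """Hand-written analogue of grammar T; NOT the code ANTLR generates."""

    def __init__(self, text, variables):
        self.variables = variables  # plays the role of parse_with[vars]
        self.tokens = self._tokenize(text)
        self.pos = 0

    @staticmethod
    def _tokenize(text):
        tokens = []
        for number, name, op in TOKEN.findall(text):
            if number:
                tokens.append(("Number", number))
            elif name:
                tokens.append(("ID", name))
            elif op.strip():  # skip stray whitespace
                tokens.append((op, op))
        tokens.append(("EOF", ""))
        return tokens

    def _peek(self):
        return self.tokens[self.pos][0]

    def _match(self, kind):
        actual, text = self.tokens[self.pos]
        if actual != kind:
            raise SyntaxError("expected %s but found %s" % (kind, actual))
        self.pos += 1
        return text

    def parse_with(self):
        value = self.expression()
        self._match("EOF")
        return value

    def expression(self):
        return self.addition()

    def addition(self):
        value = self.multiplication()  # like {value = $e1.value}
        while self._peek() in ("+", "-"):
            if self._match(self._peek()) == "+":
                value = value + self.multiplication()
            else:
                value = value - self.multiplication()
        return value

    def multiplication(self):
        value = self.unary()
        while self._peek() in ("*", "/"):
            if self._match(self._peek()) == "*":
                value = value * self.unary()
            else:
                value = value / self.unary()
        return value

    def unary(self):
        if self._peek() == "-":
            self._match("-")
            return -1 * self.atom()
        return self.atom()

    def atom(self):
        kind = self._peek()
        if kind == "Number":
            return float(self._match("Number"))
        if kind == "ID":
            return self.variables[self._match("ID")]
        self._match("(")
        value = self.expression()
        self._match(")")
        return value


result = MiniParser("a + (1.0 + 2) * 3", {"a": 42}).parse_with()
print(result)  # 51.0
```

The assignment rule from the question would work the same way: call the expression rule, take the value it returns, and use it in the action that follows.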

Note that a rule can return more than one value:

parse
  :  foo              {print 'a={0} b={1} c={2}'.format($foo.a, $foo.b, $foo.c)}
  ;

foo returns [a, b, c]
  :  A B C            {$a=$A.text; $b=$B.text; $c=$C.text}
  ;
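As a rough mental model (an illustration in plain Python, not ANTLR's generated code), you can think of a rule with several return values as a function that hands back one object with a named field per value. The names FooReturn, foo and parse below are hypothetical stand-ins for the two rules above, and the token list stands in for the matched A, B and C tokens.

```python
from collections import namedtuple

# Hypothetical stand-in for the object that carries a rule's named results.
FooReturn = namedtuple("FooReturn", ["a", "b", "c"])


def foo(tokens):
    """Analogue of: foo returns [a, b, c] : A B C"""
    a_tok, b_tok, c_tok = tokens
    return FooReturn(a=a_tok, b=b_tok, c=c_tok)


def parse(tokens):
    """Analogue of the parse rule: read the fields like $foo.a, $foo.b, $foo.c."""
    result = foo(tokens)
    return "a={0} b={1} c={2}".format(result.a, result.b, result.c)


print(parse(["x", "y", "z"]))  # a=x b=y c=z
```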


How to write out target language code

The easiest way to go about this is to simply put print statements inside the custom code blocks and redirect the output to a file:

parse_with [vars]
@init{self.variables = vars}
  :  expression EOF                            {print 'OUT:', $expression.value}
  ;

and then run the script like this:

./run.py > out.txt

which will create a file 'out.txt' containing: OUT: 51.0. If your grammar isn't that big, you might get away with this. However, this might become a bit messy, in which case you could set the output of your parser to template:

options {
  output=template;
  language=Python;
}

and emit custom code through your own defined templates.
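A middle ground between bare print statements and templates is to collect the generated instructions in a small helper object and write them to a file once parsing has finished. The sketch below is plain Python 3 and independent of the ANTLR runtime. The class AsmEmitter, its methods, and the stack-machine instructions (PUSH, ADD, STORE) are all made up for illustration. In a grammar you could keep such an object in @members, or pass it in as a rule argument the way vars is passed above, and call its emit method from your actions.

```python
import io


class AsmEmitter:
    """Collects emitted lines; a hypothetical helper, not part of ANTLR."""

    def __init__(self):
        self.lines = []

    def emit(self, instruction):
        # Called from grammar actions instead of print.
        self.lines.append(instruction)

    def write_to(self, stream):
        # Write every collected instruction on its own line.
        for line in self.lines:
            stream.write(line + "\n")


# What the actions might emit while parsing `x = 1 + 2`,
# using a made-up stack-machine instruction set.
emitter = AsmEmitter()
emitter.emit("PUSH 1")
emitter.emit("PUSH 2")
emitter.emit("ADD")
emitter.emit("STORE x")

# In a real script you would write to a file instead:
#     with open("out.asm", "w") as f:
#         emitter.write_to(f)
buffer = io.StringIO()
emitter.write_to(buffer)
print(buffer.getvalue(), end="")
```

Collecting the lines first also lets you reorder or post-process them before anything reaches the output file, which is harder to do once they have been printed.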

