Code generation from three-address code to JVM bytecode

Posted on 2024-12-19 22:36:19

I'm working on the bytecode compiler for Renjin (R for the JVM) and am experimenting with translating our intermediate three-address code (TAC) representation to bytecode. All the compiler textbooks I've consulted discuss register allocation during code generation, but I haven't been able to find any resources on code generation for stack-based virtual machines like the JVM.

Simple TAC instructions are trivial to translate into bytecode, but I get a bit lost when temporaries are involved. Does anyone have any pointers to resources that describe this?

Here is a complete example:

Original R code looks like this:

x + sqrt(x * y)

TAC IR:

 0:  _t2 := primitive<*>(x, y)
 1:  _t3 := primitive<sqrt>(_t2)
 2:  return primitive<+>(x, _t3)

(ignore for a second the fact that we can't always resolve function calls to primitives at compile time)

The resulting JVM byte code would look (roughly) something like this:

aload_x 
dup
aload_y
invokestatic r/primitives/Ops.multiply(Lr/lang/Vector;Lr/lang/Vector;)
invokestatic r/primitives/Ops.sqrt(Lr/lang/Vector;)
invokestatic r/primitives/Ops.plus(Lr/lang/Vector;Lr/lang/Vector;)
areturn

Basically, at the top of the program, I already need to know that local variable x will have to be at the bottom of the stack by the time I get to TAC instruction 2. I can work this out by hand, but I'm having trouble coming up with an algorithm that does it correctly. Any pointers?

Comments (1)

意中人 2024-12-26 22:36:19

Transforming a 3-address representation into stack code is easier than transforming stack code into a 3-address representation.

Your sequence should be the following:

  1. Form basic blocks
  2. Perform an SSA transformation
  3. Build expression trees within the basic blocks
  4. Perform register scheduling (and phi removal at the same time) to allocate local variables for the registers not eliminated by the previous step
  5. Emit JVM code: registers go into local variables, and expression trees are trivially expanded into stack operations
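
As a rough illustration of steps 3 and 5 for a single basic block, here is a minimal Python sketch. It is not Renjin's implementation. The `Instr` class, the `_t` naming convention for temporaries, the simplified mnemonics (`aload N`, `invokestatic name`), and the slot numbering are all assumptions made for the example. It also assumes the block is already in SSA form, that temporaries are local to the block, and that the primitives have no side effects, so a computation can safely be moved to the point where its result is used.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instr:
    """Hypothetical TAC instruction: target := op(args); target is None for a return."""
    target: Optional[str]
    op: str
    args: List[str]


def build_trees(block):
    """Step 3: fold every temporary that is used exactly once into its consumer."""
    uses = Counter(arg for ins in block for arg in ins.args)
    pending = {}  # single-use temporary -> expression tree waiting for its use
    roots = []    # (target, tree) pairs that must be emitted as statements
    for ins in block:
        children = [pending.pop(a) if a in pending else ("load", a) for a in ins.args]
        tree = ("call", ins.op, children)
        single_use_temp = (ins.target is not None
                           and ins.target.startswith("_t")
                           and uses[ins.target] == 1)
        if single_use_temp:
            pending[ins.target] = tree
        else:
            roots.append((ins.target, tree))
    return roots


def emit(tree, slots, out):
    """Step 5: a post-order walk of the tree yields the stack code."""
    if tree[0] == "load":
        out.append(f"aload {slots[tree[1]]}")
    else:
        _, op, children = tree
        for child in children:
            emit(child, slots, out)
        out.append(f"invokestatic {op}")


def compile_block(block, slots):
    """slots maps variable names to local slots, assumed contiguous from 0."""
    out = []
    for target, tree in build_trees(block):
        emit(tree, slots, out)
        if target is None:
            out.append("areturn")
        else:
            # Values used more than once keep a local variable slot.
            slots.setdefault(target, len(slots))
            out.append(f"astore {slots[target]}")
    return out


example = [
    Instr("_t2", "multiply", ["x", "y"]),
    Instr("_t3", "sqrt", ["_t2"]),
    Instr(None, "plus", ["x", "_t3"]),
]
print("\n".join(compile_block(example, {"x": 0, "y": 1})))
```

For the block in the question, both temporaries are used once, so the whole block collapses into one tree and no extra locals are needed. The sketch loads x twice rather than using dup as the hand-written sequence does; replacing a repeated load with dup would be a separate peephole step. A temporary that is used more than once is stored with astore and reloaded at each use.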