Code generation from three-address code to JVM bytecode
I'm working on the bytecode compiler for Renjin (R for the JVM) and am experimenting with translating our intermediate three-address code (TAC) representation to bytecode. All the compiler textbooks I've consulted discuss register allocation during code generation, but I haven't been able to find any resources on code generation for stack-based virtual machines like the JVM.
Simple TAC instructions are trivial to translate into bytecode, but I get a bit lost when temporaries are involved. Does anyone have pointers to resources that describe this?
Here is a complete example:
Original R code looks like this:
x + sqrt(x * y)
TAC IR:
0: _t2 := primitive<*>(x, y)
1: _t3 := primitive<sqrt>(_t2)
2: return primitive<+>(x, _t3)
(ignore for a second the fact that we can't always resolve function calls to primitives at compile time)
The resulting JVM byte code would look (roughly) something like this:
aload_x
dup
aload_y
invokestatic r/primitives/Ops.multiply(Lr/lang/Vector;Lr/lang/Vector;)
invokestatic r/primitives/Ops.sqrt(Lr/lang/Vector;)
invokestatic r/primitives/Ops.plus(Lr/lang/Vector;Lr/lang/Vector;)
areturn
Basically, at the top of the program, I already need to know that local variable x will have to be sitting at the bottom of the operand stack by the time I get to TAC instruction 2. I can work this out by hand, but I'm having trouble coming up with an algorithm that does it correctly. Any pointers?
Comments (1)
Transforming a three-address representation into stack code is easier than transforming stack code into three-address form.
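As a rough illustration of why this direction is the easy one, here is a minimal Python sketch (not Renjin's actual code generator). It assumes straight-line TAC with no branches, a hypothetical tuple encoding `(dest, op, args)`, and symbolic instructions such as `("aload", "x")` rather than real slot numbers or method descriptors. The naive pass gives every variable and temporary its own local slot, so no stack planning is needed; a small peephole pass then removes a store that is immediately followed by the only load of the same temporary.

```python
from collections import Counter

# Hypothetical TAC encoding: (dest, op, args); dest is None for "return".
TAC = [
    ("_t2", "multiply", ["x", "y"]),
    ("_t3", "sqrt", ["_t2"]),
    (None, "plus", ["x", "_t3"]),
]


def translate_naive(tac):
    """Translate TAC by giving every temporary its own local slot."""
    code = []
    for dest, op, args in tac:
        for arg in args:
            code.append(("aload", arg))      # push operands left to right
        code.append(("invokestatic", op))    # pops the args, pushes the result
        if dest is None:
            code.append(("areturn",))        # return the value on top of stack
        else:
            code.append(("astore", dest))    # spill the result to its slot
    return code


def peephole(code):
    """Drop 'astore t; aload t' pairs when t is loaded only that once.

    Only valid for straight-line code: with branches, a load count taken
    over the whole method is not enough to prove the store is dead.
    """
    loads = Counter(ins[1] for ins in code if ins[0] == "aload")
    out = []
    i = 0
    while i < len(code):
        ins = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if ins[0] == "astore" and nxt == ("aload", ins[1]) and loads[ins[1]] == 1:
            i += 2                           # value simply stays on the stack
            continue
        out.append(ins)
        i += 1
    return out


if __name__ == "__main__":
    for ins in peephole(translate_naive(TAC)):
        print(" ".join(ins))
```

For the example above, the peephole pass removes the store/load pair for `_t2`, so the result of the multiply flows straight into `sqrt`. `_t3` stays in a local because `aload x` sits between its store and its load. Getting to the hand-written `dup` version would take a further step, such as rebuilding expression trees from single-use temporaries, which this sketch does not attempt.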
Your sequence should be the following: