How does a linker relocate branch instructions in MIPS?

Posted 2025-01-12 03:20:06

Background

I'm working on a 2015 CS61C (Berkeley) course project on writing a linker to link object files generated from the following subset of the MIPS instruction set.

Add Unsigned: addu $rd, $rs, $rt
Or: or $rd, $rs, $rt
Set Less Than:  slt $rd, $rs, $rt
Set Less Than Unsigned: sltu $rd, $rs, $rt
Jump Register:  jr $rs
Shift Left Logical: sll $rd, $rt, shamt
Add Immediate Unsigned: addiu $rt, $rs, immediate
Or Immediate:   ori $rt, $rs, immediate
Load Upper Immediate:   lui $rt, immediate
Load Byte:  lb $rt, offset($rs)
Load Byte Unsigned: lbu $rt, offset($rs)
Load Word:  lw $rt, offset($rs)
Store Byte: sb $rt, offset($rs)
Store Word: sw $rt, offset($rs)
Branch on Equal:    beq $rs, $rt, label
Branch on Not Equal:    bne $rs, $rt, label
Jump:   j label
Jump and Link:  jal label
Load Immediate: li $rt, immediate
Branch on Less Than:    blt $rs, $rt, label

From this subset, I think the instructions that need relocation are j, bne, and beq (blt is a pseudo-instruction); the latter two need relocation only if the label is not in the same file.

The comments on the MIPS function that relocates an instruction read:

#------------------------------------------------------------------------------
# function relocate_inst()
#------------------------------------------------------------------------------
# Given an instruction that needs relocation, relocates the instruction based
# on the given symbol and relocation table.
#
# You should return error if 1) the addr is not in the relocation table or
# 2) the symbol name is not in the symbol table. You may assume otherwise the 
# relocation will happen successfully.
#
# Arguments:
#  $a0 = an instruction that needs relocating
#  $a1 = the byte offset of the instruction in the current file
#  $a2 = the symbol table
#  $a3 = the relocation table
#
# Returns: the relocated instruction, or -1 if error

Note that the relocation table contains addresses relative to the start of the object file being linked, while the symbol table is an aggregate of the symbol tables of all the object files being linked and contains absolute addresses.
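
The two lookups implied by this description can be sketched in C (used here for brevity instead of MIPS). This is only an illustration under assumptions: the struct table_entry layout and the helper names reloc_name_at and symbol_addr are hypothetical and are not taken from the project skeleton.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry, for illustration only. In the relocation table,
 * addr is a byte offset within the current file; in the symbol table, addr
 * is an absolute address. */
struct table_entry {
    uint32_t addr;
    const char *name;
};

/* Find the symbol name that the relocation table records for byte_offset. */
const char *reloc_name_at(const struct table_entry *reloc, size_t n,
                          uint32_t byte_offset)
{
    for (size_t i = 0; i < n; i++) {
        if (reloc[i].addr == byte_offset)
            return reloc[i].name;
    }
    return NULL; /* error case 1: addr is not in the relocation table */
}

/* Look up name in the symbol table. Returns 0 and stores the absolute
 * address in *abs_addr on success, or -1 if the name is missing. */
int symbol_addr(const struct table_entry *syms, size_t n,
                const char *name, uint32_t *abs_addr)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(syms[i].name, name) == 0) {
            *abs_addr = syms[i].addr;
            return 0;
        }
    }
    return -1; /* error case 2: symbol name is not in the symbol table */
}
```

The two failure returns correspond to the two error conditions listed in the function comment above.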

Problem

  • If the instruction to be relocated is a j instruction, $a1 holds the instruction's offset within the current file, so we look up that offset in the relocation table to find the label, and then look up the label's absolute address in the symbol table. We can then place (absolute address >> 2) in the low 26 bits of the instruction.

  • If the instruction to be relocated is bne or beq, however, I am not sure what to do: the low-order bits are supposed to be relative to PC+4, but we don't know the absolute address of the instruction being relocated, so we don't know what PC+4 is.
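
The j case from the first bullet can be sketched as follows, in C rather than MIPS for brevity. The helper name relocate_jump is hypothetical, and the sketch assumes the 26-bit target field may simply be overwritten (that is, the unrelocated instruction carries nothing useful in it).

```c
#include <stdint.h>

/* Hypothetical helper: patch the 26-bit target field of a j/jal instruction
 * with the word address derived from an absolute byte address. */
uint32_t relocate_jump(uint32_t inst, uint32_t abs_addr)
{
    uint32_t opcode = inst & 0xFC000000u;            /* keep the top 6 opcode bits */
    uint32_t target = (abs_addr >> 2) & 0x03FFFFFFu; /* word address, low 26 bits */
    return opcode | target;
}
```

For example, a j instruction with an empty target field (0x08000000) and a label at absolute address 0x00400010 becomes 0x08100004.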

Looking at various solutions online, it seems that only j relocations are handled.

Am I missing something?

EDIT: We are only considering the text segment.

Comments (1)

寄居者 2025-01-19 03:20:06

My guess is that this linker does not handle branch instructions (bne or beq) to external labels.

This precludes using beq label where label is external (global and defined in another object file), but such a branch can realistically only be written by hand in assembly.

Compiler output, for example, keeps both the branch instruction and its target within a single function, which goes into a single code chunk (modulo certain tail-call optimizations).

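Under that limitation, every bne and beq instruction has already been fixed up by the compiler or assembler using PC-relative addressing, so there is no need for a relocation table entry for these. The linker does not need to know PC+4: when the file is placed at its final address, the branch and its target move by the same amount, so the encoded offset stays valid.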
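
To illustrate why a same-file branch needs no relocation entry, here is a small C sketch of how the 16-bit offset field is derived. The helper name branch_offset_field is hypothetical, and both addresses are assumed to be word aligned.

```c
#include <stdint.h>

/* Hypothetical helper: compute the 16-bit offset field of beq/bne.
 * The offset counts instructions (words) relative to PC+4, where PC is
 * the address of the branch itself. */
uint16_t branch_offset_field(uint32_t branch_addr, uint32_t target_addr)
{
    int64_t byte_delta = (int64_t)target_addr - ((int64_t)branch_addr + 4);
    /* Dividing an exact multiple of 4 is well defined for negative values;
     * the cast keeps the low 16 bits (two's complement for backward branches). */
    return (uint16_t)(byte_delta / 4);
}
```

For a bne at offset 0x00 targeting a label at 0x10, the field is 3, and it is still 3 after the whole file is placed at 0x00400000, because only the difference between the two addresses matters.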

Further, the range of the branch instructions (beq/bne), about ±128 KB, is much shorter than that of j, so if the linker really intended to support branching to external labels, it might also have to be able to introduce branch islands to handle branches whose targets are too far away.
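
The range limit can be checked with a short C sketch. The helper name branch_in_range is hypothetical; the check assumes word-aligned addresses and the signed 16-bit word offset relative to PC+4 used by beq/bne.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a beq/bne at branch_addr can reach
 * target_addr with its signed 16-bit word offset. */
bool branch_in_range(uint32_t branch_addr, uint32_t target_addr)
{
    int64_t words = ((int64_t)target_addr - ((int64_t)branch_addr + 4)) / 4;
    return words >= INT16_MIN && words <= INT16_MAX;
}
```

With a branch at address 0, a target at 0x20000 is the farthest reachable forward address, and 0x20004 is one instruction too far. A linker that supported external branch targets would need a branch island whenever this check fails.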


To expand on your example:

if (a1 == a0)
    printf("hello");

would compile to something like

    bne $a1, $a0, endIf1
    la $a0, Lhello
    jal printf
endIf1:

Some compilers don't know which function lives in which DLL, so even if printf were in a DLL, the compiler output could still look the same.
