Does the MIPS la (load address) pseudoinstruction not always use register $1?

Posted on 2024-12-13 19:54:52

Please refer to the EDIT section below for my own explanation.

This is a bit long and difficult to illustrate, but I appreciate you taking the time to read it. Please bear with me.

Suppose I have this:

.data
    str1: .asciiz "A"
    str2: .asciiz "1"
    myInt:
          .word 42      # allocate an integer word: 42
    myChar:
          .word 'Q'     # allocate a char word

    .text    
    .align 2
    .globl main

main:
    lw      $t0, myInt          # load myInt into register $t0

    lw      $t3, str1           # load str1 into register $t3

    lw      $t4, str2           #load str2 into register $t4

    la      $a0, str1           # load address str1

    la      $a1, str2           # load address str2

Then in SPIM, the user text segment is

User Text Segment [00400000]..[00440000]
[00400000] 8fa40000  lw $4, 0($29)            ; 183: lw $a0 0($sp) # argc 
[00400004] 27a50004  addiu $5, $29, 4         ; 184: addiu $a1 $sp 4 # argv 
[00400008] 24a60004  addiu $6, $5, 4          ; 185: addiu $a2 $a1 4 # envp 
[0040000c] 00041080  sll $2, $4, 2            ; 186: sll $v0 $a0 2 
[00400010] 00c23021  addu $6, $6, $2          ; 187: addu $a2 $a2 $v0 
[00400014] 0c100009  jal 0x00400024 [main]    ; 188: jal main 
[00400018] 00000000  nop                      ; 189: nop 
[0040001c] 3402000a  ori $2, $0, 10           ; 191: li $v0 10 
[00400020] 0000000c  syscall                  ; 192: syscall # syscall 10 (exit) 
[00400024] 3c011001  lui $1, 4097             ; 23: lw $t0, myInt # load myInt into register $t0 
[00400028] 8c280004  lw $8, 4($1)             
[0040002c] 3c011001  lui $1, 4097             ; 25: lw $t3, str1 # load str1 into register $t3 
[00400030] 8c2b0000  lw $11, 0($1)            
[00400034] 3c011001  lui $1, 4097             ; 27: lw $t4, str2 #load str2 into register $t4 
[00400038] 8c2c0002  lw $12, 2($1)            
[0040003c] 3c041001  lui $4, 4097 [str1]      ; 29: la $a0, str1 # load address str1 
[00400040] 3c011001  lui $1, 4097 [str2]      ; 31: la $a1, str2 # load address str2 
[00400044] 34250002  ori $5, $1, 2 [str2]

I understand that lw with a label operand is a pseudoinstruction, so it needs to be broken down into two instructions. I understand this part. We use the base address of the data segment as a "base pointer" and access the other data (including the first item) relative to it.

I also observed that loading the addresses of str1 and str2 used two different registers: $4 and $1. $4 is $a0.
Why is that?

If I swap the last two instructions, SPIM shows

...        
[0040003c] 3c011001  lui $1, 4097 [str2]      ; 31: la $a1, str2 # load address str2 
[00400040] 34250002  ori $5, $1, 2 [str2]     
[00400044] 3c041001  lui $4, 4097 [str1]      ; 32: la $a0, str1 # load address str1

So why does load address behave so strangely? Why did str2 use $1?
How can I explain the difference between lui $1, 4097 [str2] and lui $4, 4097 [str1]?

PS: Can someone also explain why the bracketed [str2] appears in the listing?

lui $1, 4097 [str2] only loads the base address of the data segment into register $1. That is, 0x10010000.
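To check that arithmetic, here is a minimal Python sketch that decodes the two lui words from the listing and computes the value lui writes. The helper names decode_i_type and lui_result are made up for illustration; the field layout is the standard MIPS I-type format (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate).

```python
def decode_i_type(word):
    """Split a 32-bit MIPS I-type instruction word into its fields."""
    opcode = (word >> 26) & 0x3F   # top 6 bits
    rs = (word >> 21) & 0x1F       # next 5 bits
    rt = (word >> 16) & 0x1F       # next 5 bits (destination for lui)
    imm = word & 0xFFFF            # low 16 bits
    return opcode, rs, rt, imm


def lui_result(imm):
    """Value lui writes: the immediate in the upper half, zeros below."""
    return (imm << 16) & 0xFFFFFFFF


# The two lui encodings that appear in the SPIM listing.
for word in (0x3C011001, 0x3C041001):
    opcode, rs, rt, imm = decode_i_type(word)
    print(f"{word:08x}: opcode={opcode:#x} rt=${rt} imm={imm} "
          f"-> {lui_result(imm):#010x}")
```

Both words carry the same immediate (4097, which is 0x1001) and differ only in the rt field, so both leave 0x10010000 in their destination register; one writes $1 and the other writes $4.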

Thank you very much!

EDIT

I rewrote the entire script to simplify the situation.

Script: http://pastebin.com/BHh89iqt
Text Segment: http://pastebin.com/t2eDEs1f

Let me remind you that we write pseudoinstructions rather than raw MIPS machine instructions. For example, lw with a label operand, la, and li are pseudoinstructions that the assembler expands into real instructions.

For example, lw (load word) with a label is broken down into two machine instructions (look at the text segment):

lui $1, 4097             ; 23: lw $t0, myInt # load myInt into register $t0 
lw $8, 4($1)

MIPS instructions are 32 bits wide, so a full 32-bit address cannot fit inside a single instruction. Encoding the opcode, the target register, and a 32-bit address together would take about 43 bits, which is why the load is split into two parts.
A label is simply the memory address of the thing we declared.

The real lw instruction only exists in the form lw $rt, offset($rs), with a 16-bit offset. So most label-based loads follow this approach: the assembler uses $at ($1) to turn a pseudoinstruction that involves a label into real machine instructions.
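The split can be sketched in Python as follows. This is only an illustration under my own assumptions: the 43-bit figure is taken to mean a 6-bit opcode plus a 5-bit register field plus a 32-bit address, and split_for_load and rebuild are hypothetical helper names. The rounding step is needed in general because lw sign-extends its 16-bit offset.

```python
OPCODE_BITS = 6
REGISTER_BITS = 5
ADDRESS_BITS = 32
# One way to arrive at 43 bits: too many for a 32-bit instruction word.
BITS_NEEDED = OPCODE_BITS + REGISTER_BITS + ADDRESS_BITS


def split_for_load(address):
    """Split a 32-bit address into (upper, offset) for lui + lw.

    lw sign-extends its 16-bit offset, so when the low half has its top
    bit set the offset becomes negative and the upper half is rounded up.
    """
    low = address & 0xFFFF
    offset = low - 0x10000 if low & 0x8000 else low
    upper = ((address - offset) >> 16) & 0xFFFF
    return upper, offset


def rebuild(upper, offset):
    """Address the hardware computes from lui upper; lw offset($at)."""
    return ((upper << 16) + offset) & 0xFFFFFFFF


upper, offset = split_for_load(0x10010004)   # address of myInt in the listing
print(BITS_NEEDED, upper, offset)
```

For myInt at 0x10010004 this gives upper 4097 and offset 4, matching the lui $1, 4097 and lw $8, 4($1) pair in the text segment.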

For lw it's pretty easy. For la (load address) it's a bit trickier.
Pay attention to the last four load address instructions. SPIM translates them into this:

[0040003c] 3c041001  lui $4, 4097 [str1]      ; 27: la $a0, str1 # load address str2 
[00400040] 3c011001  lui $1, 4097 [str2]      ; 28: la $a0, str2 # load address str1 
[00400044] 34240002  ori $4, $1, 2 [str2]     
[00400048] 3c011001  lui $1, 4097 [str2]      ; 30: la $a0, str2 # load address str2 
[0040004c] 34240002  ori $4, $1, 2 [str2]     
[00400050] 3c041001  lui $4, 4097 [str1]      ; 31: la $a0, str1 # load address str1

$4 is $a0. Looking at the listing, the last two la instructions are the first two with their order swapped.
I did this on purpose to illustrate the strange behavior: for str1, lui writes straight into $4, but for str2 the assembler goes through $at and then applies the offset with ori.

I couldn't figure out why last night, but I have just realized the reason: the assembler is smart enough to know that str1 is the first item in the data segment, so its address is exactly 0x10010000. The lower 16 bits are zero, so a single lui into the destination register is enough and neither $at nor ori is needed.
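Here is a small Python sketch of the rule the listings appear to follow. It is inferred from the SPIM output above, not taken from SPIM's source, and expand_la is a hypothetical name.

```python
AT = 1  # $at, the assembler temporary register


def expand_la(dest, address):
    """Expand 'la $dest, address' the way the listings suggest.

    If the low 16 bits are zero, one lui into the destination suffices.
    Otherwise lui goes into $at and ori merges in the low half.
    """
    upper = (address >> 16) & 0xFFFF
    lower = address & 0xFFFF
    if lower == 0:
        return [f"lui ${dest}, {upper}"]
    return [f"lui ${AT}, {upper}", f"ori ${dest}, ${AT}, {lower}"]


print(expand_la(4, 0x10010000))  # la $a0, str1 (offset 0)
print(expand_la(5, 0x10010002))  # la $a1, str2 (offset 2)
```

Under this rule the choice between $4 and $1 depends only on whether the label's low 16 bits are zero, not on which label comes first in the source or on the order of the la instructions.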

Yet this also seems strange: how does the system know at which byte to stop printing the string (if we want to print one)?

My guess: the null character that .asciiz appends terminates the print.
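That guess can be modeled in Python. This is a sketch of the idea only: asciiz and read_string are made-up helpers that mimic what the .asciiz directive stores and how a reader would scan up to the terminator.

```python
def asciiz(text):
    """Bytes stored by .asciiz: the characters plus a trailing zero byte."""
    return text.encode("ascii") + b"\x00"


def read_string(memory, offset):
    """Read characters starting at offset until the first zero byte."""
    end = memory.index(0, offset)
    return memory[offset:end].decode("ascii")


# str1 and str2 as laid out at the start of the data segment.
data = asciiz("A") + asciiz("1")
print(len(data), read_string(data, 0), read_string(data, 2))
```

Each one-character string occupies two bytes, which is why str2 sits at offset 2, and the address alone is enough to find where each string ends.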

Anyhow, I guess this is just a convention that MIPS uses.

Second Edit

In fact, if you add a new data item above str1, you will see that my explanation is correct.

New script: http://pastebin.com/8DuzFrk0

New Text Segment: http://pastebin.com/YXbvzc4z

I only added myCharB to the top of the data segment.

[0040003c] 3c011001  lui $1, 4097 [str1]      ; 29: la $a0, str1 # load address str2
[00400040] 34240004  ori $4, $1, 4 [str1]
[00400044] 3c011001  lui $1, 4097 [str2]      ; 30: la $a0, str2 # load address str1
[00400048] 34240006  ori $4, $1, 6 [str2]
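The new offsets can be reproduced with a short Python sketch of the data layout. It assumes myCharB is a 4-byte .word (consistent with str1 moving to offset 4), that the data segment starts at 0x10010000, and that .word items are aligned to 4 bytes; layout is a hypothetical helper.

```python
def layout(items, base=0x10010000):
    """Assign addresses to (label, size_in_bytes, alignment) items in order."""
    addresses = {}
    cursor = base
    for label, size, align in items:
        cursor = (cursor + align - 1) // align * align  # round up to alignment
        addresses[label] = cursor
        cursor += size
    return addresses


original = [("str1", 2, 1), ("str2", 2, 1), ("myInt", 4, 4), ("myChar", 4, 4)]
with_extra = [("myCharB", 4, 4)] + original  # assumed: myCharB is a .word

for name, items in (("original", original), ("with myCharB", with_extra)):
    offsets = {label: addr & 0xFFFF for label, addr in layout(items).items()}
    print(name, offsets)
```

With myCharB in front, str1 no longer has zero low bits, so it gets the two-instruction lui $1 and ori form just like str2.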
