Assembly language: trying to understand a small function

Posted on 2024-08-23 12:04:27, 545 characters, 6 views, 0 comments

For my work, I need to reverse-engineer what this portion of code (ARM9) is doing. I'm a Java developer and I really don't understand this code, which all belongs to a single function.

Of course, I'm asking for help because the original source code is no longer available. Can anyone help me understand what this code does, ideally as a small algorithm in any high-level language? I have tried for many hours without results.

sub_FFFF7B38
    PUSH    {LR}
    ADDS    R2, R0, #0
    LDRB    R3, [R2]
    CMP     R3, #0
    BEQ     loc_FFFF7B52
    SUBS    R1, #1
    BCC     loc_FFFF7B52

loc_FFFF7B46:
    ADDS    R0, #1
    LDRB    R3, [R0]
    CMP     R3, #0
    BEQ     loc_FFFF7B52
    SUBS    R1, #1
    BCS     loc_FFFF7B46

loc_FFFF7B52:
    SUBS    R0, R0, R2
    POP     {R1}

Comments (5)

撕心裂肺的伤痛 2024-08-30 12:04:27

Except for the last two lines, it could be something like the following.
Please don't hit me if I am not 100% correct.

Assuming
R0 is p0 (and later p),
R1 is n,
R2 is a temporary value (edited; at first I thought: i, or the address of p0[i]),
R3 is a temporary value:

sub_FFFF7B38
          PUSH {LR}           ; save return address
          ADDS R2, R0, #0     ; move R0 to R2
          LDRB R3, [R2]       ; load *p0
          CMP R3, #0          ; if *p0==0 
          BEQ loc_FFFF7B52    ; then jump to loc_FFFF7B52 
          SUBS R1, #1         ; decrement n
          BCC loc_FFFF7B52    ; if there was a borrow (i.e. n was 0): jump to loc_FFFF7B52


loc_FFFF7B46:
          ADDS R0, #1         ; increment p
          LDRB R3, [R0]       ; load *p
          CMP R3, #0          ; if *p==0
          BEQ loc_FFFF7B52    ; jump to loc_FFFF7B52
          SUBS R1, #1         ; decrement n
          BCS loc_FFFF7B46    ; if there was no borrow (i.e. n was not 0): jump to loc_FFFF7B46


loc_FFFF7B52:
          SUBS R0, R0, R2     ; calculate p - p0
          POP {R1}            ; ??? I don't understand the purpose of this
                              ; isn't something missing here?

or in C:

int f(char *p0, unsigned int n)
{
  char *p;

  if (*p0==0 || n--==0)
    return 0;

  for(p=p0; *++p && n>0; n--)
  {
  }
  return p - p0;
}

孤者何惧 2024-08-30 12:04:27

Here are the instructions commented line by line

sub_FFFF7B38
    PUSH    {LR}          ; save LR (link register) on the stack
    ADDS    R2, R0, #0    ; R2 = R0 + 0 and set flags (could just have been MOV?)
    LDRB    R3, [R2]      ; Load R3 with a single byte from the address at R2
    CMP     R3, #0        ; Compare R3 against 0...
    BEQ     loc_FFFF7B52  ; ...branch to end if equal
    SUBS    R1, #1        ; R1 = R1 - 1 and set flags
    BCC     loc_FFFF7B52  ; branch to end if carry is clear, which after a subtraction
                          ; means a borrow occurred (R1 was 0 before the decrement)

loc_FFFF7B46:
    ADDS    R0, #1        ; R0 = R0 + 1 and set flags
    LDRB    R3, [R0]      ; Load R3 with byte from address at R0
    CMP     R3, #0        ; Compare R3 against 0...
    BEQ     loc_FFFF7B52  ; ...branch to end if equal
    SUBS    R1, #1        ; R1 = R1 - 1 and set flags
    BCS     loc_FFFF7B46  ; loop if carry is set, which after a subtraction means
                          ; no borrow occurred (R1 was at least 1 before the decrement)

loc_FFFF7B52:
    SUBS    R0, R0, R2    ; R0 = R0 - R2
    POP     {R1}          ; Load the previously saved value of LR into R1
                          ; Presumably the missing next line is MOV PC, R1
                          ; (or BX R1) to return from the function.
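
The carry behaviour after SUBS is the part that is easy to get backwards, so here is a minimal C sketch of it. The names subs_result and subs are made up for illustration; the only fact relied on is that ARM sets the carry flag after a subtraction when no borrow occurred, that is, when the first operand is greater than or equal to the second as unsigned numbers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: the result of SUBS plus the carry flag it leaves. */
struct subs_result {
    uint32_t value; /* a - b, wrapped to 32 bits */
    bool carry;     /* ARM convention: set when NO borrow occurred (a >= b, unsigned) */
};

/* Models "SUBS Rd, Rn, #imm" for the carry flag only (N, Z and V are ignored). */
static struct subs_result subs(uint32_t a, uint32_t b)
{
    struct subs_result r;
    r.value = a - b;     /* unsigned arithmetic wraps modulo 2^32 */
    r.carry = (a >= b);  /* carry clear means a borrow was needed */
    return r;
}
```

In this model subs(0, 1) leaves the carry clear, so BCC is taken exactly when R1 was 0 before the decrement, while subs(1, 1) yields 0 with the carry set, so BCS still branches.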

So in very basic C code:

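int unknown(const char* r0, int r1)
{
    const char* r2 = r0;
    char r3 = *r2;
    if (r3 == '\0')
        goto end;
    if (--r1 < 0)           /* BCC: taken only if r1 was 0 before the decrement */
        goto end;

loop:
    r3 = *++r0;
    if (r3 == '\0')
        goto end;
    if (--r1 >= 0)          /* BCS: loop while the decrement did not borrow */
        goto loop;

end:
    return (int)(r0 - r2);
}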

Adding some control structures to get rid of the gotos:

int unknown(const char* r0, int r1)
{
    const char* r2 = r0;
    char r3 = *r2;

    if (r3 != '\0')
    {
        if (--r1 >= 0)
        do
        {
             if (*++r0 == '\0')
                 break;
        } while (--r1 >= 0);
    }

    return r0 - r2;
}

Edit: Now that my confusion about the carry bit and SUBS has been cleared up this makes more sense.

Simplifying:

int unknown(const char* r0, int r1)
{
    const char* r2 = r0;

    while (*r0 != '\0' && --r1 >= 0)
        r0++;

    return r0 - r2;
}

In words: find the index of the first NUL within the first r1 chars of the string pointed to by r0, or return r1 if there is none.
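
For reference, here is the same logic as a self-contained C function with conventional types. The name bounded_len and the size_t signature are my own choices, not something recovered from the binary. One deliberate difference, stated as an assumption about what you would want in new code: this sketch tests the limit before reading each byte, so it never touches s[max_len], whereas the assembly reads the next byte first and can look one byte past the limit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns the number of bytes before the first NUL in s, but never more
 * than max_len. bounded_len is a hypothetical name; the disassembly only
 * gives us sub_FFFF7B38.
 */
size_t bounded_len(const char *s, size_t max_len)
{
    assert(s != NULL);   /* the original code performs no such check */

    size_t i = 0;
    /* Check the limit first so that s[max_len] is never read. */
    while (i < max_len && s[i] != '\0')
        i++;
    return i;
}
```

For example, bounded_len("hello", 3) is 3 and bounded_len("hello", 10) is 5, the same values the simplified loop returns for those inputs.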

浪荡不羁 2024-08-30 12:04:27

Filip has provided some pointers; you also need to read up on the ARM calling convention (that is, which registers hold the function arguments on entry and which one holds the return value).

From a quick reading I think this code is strnlen or something closely related to it.
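
To tie the calling convention to the listing, here is a rough register-by-register model in C. It assumes the usual ARM procedure call standard (first argument in R0, second in R1, result returned in R0); the function name sub_FFFF7B38_model and the meaning of the two parameters (string pointer, byte limit) are inferred from the code, not confirmed by anything in the binary.

```c
#include <stdint.h>

/*
 * Transliteration of sub_FFFF7B38, one C statement per group of
 * instructions. r1 is unsigned so that "r1-- != 0" mirrors SUBS followed
 * by a carry test: the branch on borrow fires only when r1 was 0.
 */
uint32_t sub_FFFF7B38_model(const unsigned char *r0, uint32_t r1)
{
    const unsigned char *r2 = r0;       /* ADDS R2, R0, #0 */

    if (*r2 != 0 && r1-- != 0) {        /* LDRB/CMP/BEQ, then SUBS/BCC */
        do {
            r0++;                       /* ADDS R0, #1 */
            if (*r0 == 0)               /* LDRB/CMP/BEQ */
                break;
        } while (r1-- != 0);            /* SUBS/BCS: loop while no borrow */
    }
    return (uint32_t)(r0 - r2);         /* SUBS R0, R0, R2 */
}
```

Called with a pointer to "hello" and a limit of 3 this model returns 3, and with a limit of 10 it returns 5, which is the behaviour you would expect from strnlen.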

深府石板幽径 2024-08-30 12:04:27

How about this: Instruction set for ARM

Some hints / simplified asm

  • PUSH - puts something on the "stack" (memory)
  • ADD - usually "add" as in +
  • POP - retrieves something from the "stack" (memory)
  • CMP - short for compare; it compares something with something else.

"X:" or "Whatever:" marks a label, that is, a named position in the code that jump instructions can target. It works much like a goto label in C (Java has no working goto; a labeled break or continue is the closest thing).

If you have the following (ignore whether it is correct ARM asm; it's just pseudo-code):

PUSH 1
x:     
    POP %eax

First it would put 1 on the stack and then pop it back into eax (an x86 register name, short for "extended AX", which holds 32 bits of data).

Now, what does the x: do then? Well, let's assume that there are 100 lines of asm before that as well; then you could use a "jump" instruction to navigate to x:.
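
To make the label idea concrete, a small C sketch follows. The function name count_down_with_goto and the label again are invented for the example; the point is only that a label names a position in the code and a jump transfers control to it.

```c
/*
 * Counts how many times the loop body runs, using a label and goto the
 * way assembly uses a label and a branch instruction.
 */
int count_down_with_goto(int n)
{
    int iterations = 0;

again:                      /* a label: just a named position, like "x:" */
    if (n > 0) {
        n--;
        iterations++;
        goto again;         /* a jump back to the label, like a branch */
    }
    return iterations;
}
```

The loc_FFFF7B46 and loc_FFFF7B52 lines in the question play the same role as again: here, with BEQ, BCC and BCS acting as conditional jumps to them.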

That's a little bit of introduction to asm. Simplified.

Try to understand the above code and examine the instruction-set.

祁梦 2024-08-30 12:04:27

My ASM is a bit rusty, so no rotten tomatoes please. Assuming this starts at sub_FFFF7B38:

The command PUSH {LR} preserves the link register, which is a special register which holds the return address during a subroutine call.

ADDS sets the flags (like CMN would). Also, ADDS R2, R0, #0 adds 0 to R0 and stores the result in R2, so it effectively copies R0 into R2. (Correction from Charles in comments)

LDRB R3, [R2] loads a single byte from memory, at the address held in R2, into R3. The three unused upper bytes of R3 are zeroed by the load. Basically, it fetches the first byte that the pointer in R2 refers to.

CMP R3, #0 performs a subtraction between the two operands and sets the register flags, but does not store a result. Those flags lead to...

BEQ loc_FFFF7B52, which means "If the previous comparison was equal, go to loc_FFFF7B52" or if(R3 == 0) {goto loc_FFFF7B52;}

So if R3 isn't zero, then the SUBS R1, #1 command subtracts one from R1 and sets a flag.

BCC loc_FFFF7B52 will cause execution to jump to loc_FFFF7B52 if the carry flag is clear, which after SUBS means a borrow occurred, i.e. R1 was 0 before the decrement.

( snip )

Finally, POP {R1} pops the return address that PUSH {LR} saved at the start into R1; the listing appears to be cut off before the branch that actually returns through it.

Edit - While I was in the car, Curd spelled out just about what I was thinking when I was trying to write out my answer and ran out of time.
