ARM disassembler output: when there are two arguments instead of three
I am trying to annotate a disassembly block for exam practice. Here's what I have done so far:
00000190 <mystery>:
190: 2300 movs r3, #0 // move address 190 (offset 0) into r3 ?
192: e004 b.n 19e <mystery+0xe> // if 19e then branch to mystery
194: f010 0f01 tst.w r0, #1 ; 0x1 // update flags to 1 in status register
198: bf18 it ne // if 198 not equal to ??? then ???
19a: 3301 addne r3, #1 // add to r3 if not equal to 19a offset 1?
19c: 1040 asrs r0, r0, #1 // shift r0 right one spot (leave it in r0)
19e: 2800 cmp r0, #0 // compare contents of r0 against 0 ?
1a0: d1f8 bne.n 194 <mystery+0x4> // branch to 194 if not equal to something at line 194?
1a2: 4618 mov r0, r3 // move r3 wholecloth into r0
1a4: 4770 bx lr // branch(return from the mystery function)
1a6: bf00 nop // No operation
So my comments are pretty rudimentary and likely to be massively incorrect but most of all I really don't understand what instructions such as those at 190 or 19a mean. There are only two arguments instead of three, so how do these work?
Taking as an example
19a: 3301 addne r3, #1
My interpretation of this so far is: if not equal to X, then add Y to r3? What are X and Y? Should I be using the result from the previous line? If so, which argument (of the standard three) does it take the place of?
Blah!
I am willing to accept that I have no idea what I am doing and am completely misinterpreting everything.
Please send help!
Comments (3)
1) The TST instruction is basically the same as ANDS, except that it discards the result instead of writing it to a destination register. So,
TST r0, #1
sets flags based on the result of (r0 & 1). Specifically, it will set the Z (zero) flag if the result was zero, i.e. bit 0 of r0 was not set.
2) IT stands for "If-Then". It checks the condition indicated and conditionally executes up to 4 following instructions. In your example there is only one conditional instruction, which the disassembler helpfully shows with the NE suffix taken from the IT instruction (in Thumb-2 the suffix is not encoded in the instruction itself). NE means "not equal", but in this case there was no comparison, so what gives? The trick is that the equality check tests the Z flag, so you can think of this one as "not zero". So our ADD will be executed if the Z flag was not set, i.e. r0 did have bit 0 set.
3) A similar situation happens around CMP/BNE. CMP basically subtracts operands and sets the flags based on the result. In our case, it will set Z if r0 was equal to 0. Next, BNE will test the Z flag and branch if it was not set (i.e. r0 was not equal to 0).
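To make the flag logic in the three points above concrete, here is a minimal C sketch that models only the Z flag (N, C and V are ignored). The helper names `z_after_tst`, `z_after_cmp`, `cond_ne`, `tst_it_addne` and `bne_taken_after_cmp0` are made up for illustration; they are not part of any toolchain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Z flag after "tst rn, #imm": set when (rn & imm) == 0; rn is unchanged. */
bool z_after_tst(uint32_t rn, uint32_t imm) { return (rn & imm) == 0; }

/* Z flag after "cmp rn, #imm": set when rn - imm == 0. */
bool z_after_cmp(uint32_t rn, uint32_t imm) { return (rn - imm) == 0; }

/* NE condition: true when Z is clear. */
bool cond_ne(bool z) { return !z; }

/* Models "tst.w r0, #1 ; it ne ; addne r3, #1" and returns the new r3. */
uint32_t tst_it_addne(uint32_t r0, uint32_t r3) {
    if (cond_ne(z_after_tst(r0, 1)))
        r3 = r3 + 1;   /* two-operand form: r3 is both source and destination */
    return r3;
}

/* Models "cmp r0, #0 ; bne.n 194": true when the branch is taken. */
bool bne_taken_after_cmp0(uint32_t r0) {
    return cond_ne(z_after_cmp(r0, 0));
}
```

Note that `addne r3, #1` names only two operands because the destination doubles as the first source: it means r3 = r3 + 1, executed only when NE holds.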
Converting it all to pseudo-C, we get:
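A sketch of that translation, written as compilable C with one statement per instruction. The function name `mystery_goto` and the `int32_t` types are assumptions. Because `asrs` is an arithmetic shift, a negative input would never reach zero, and right-shifting a negative signed value in C is implementation-defined, so this is only meant for non-negative inputs.

```c
#include <stdint.h>

int32_t mystery_goto(int32_t r0) {
    int32_t r3 = 0;          /* 190: movs r3, #0              */
    goto test_loop;          /* 192: b.n 19e                  */
loop:
    if (r0 & 1)              /* 194: tst.w r0, #1 ; 198: it ne */
        r3 = r3 + 1;         /* 19a: addne r3, #1             */
    r0 = r0 >> 1;            /* 19c: asrs r0, r0, #1          */
test_loop:
    if (r0 != 0)             /* 19e: cmp r0, #0               */
        goto loop;           /* 1a0: bne.n 194                */
    return r3;               /* 1a2: mov r0, r3 ; 1a4: bx lr  */
}
```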
Or, in "normal" C:
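One way to write it as a structured loop. The name `count_set_bits` is hypothetical, and the same caveat about non-negative inputs applies, since the shift is arithmetic.

```c
#include <stdint.h>

/* Counts the 1 bits in r0 by testing bit 0 and shifting right until r0 is 0. */
int32_t count_set_bits(int32_t r0) {
    int32_t r3 = 0;
    while (r0 != 0) {
        if (r0 & 1)
            r3++;
        r0 >>= 1;
    }
    return r3;
}
```

For example, `count_set_bits(0x50)` tests bits 0 through 6 in turn and finds two of them set.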
Looks like it's counting the set bits in r0.
Have a look here for the table of condition codes and what flags they check. This describes how and when the flags are set.
Edit: I just reread your question and realized one source of your confusion. In a line like this:
192: e004 b.n 19e <mystery+0xe>
there is one operand, not two. The disassembler tries to be helpful and shows not just the absolute destination address (19e) but also its representation as an offset from the nearest symbol (mystery is at 190, so 19e is mystery+0xe).
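The annotation in angle brackets is plain subtraction of the symbol's address from the branch target. A tiny sketch, where `symbol_offset` is a made-up helper name:

```c
#include <stdint.h>

/* Offset of a branch target from the start of the nearest preceding symbol. */
uint32_t symbol_offset(uint32_t target, uint32_t symbol_base) {
    return target - symbol_base;
}
```

With mystery at 0x190, `symbol_offset(0x19e, 0x190)` gives 0xe and `symbol_offset(0x194, 0x190)` gives 0x4, matching `<mystery+0xe>` and `<mystery+0x4>` in the listing.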
Another thing you need to realize is that in ARM (and many other processors), setting flags and using flags is usually done in separate instructions. That's why you first do TST or CMP (or other flag-setting instruction), and then use conditional instructions, IT, or conditional branches.
The ne suffix checks the status flags that were set earlier by the tst.w instruction.
If you look at the ARM ARM (ARM Architecture Reference Manual), it has a section near the front about the flags. Unlike many other instruction sets, in the ARM flavor in particular (not Thumb) the top four bits of every instruction are condition bits. With ARM you can conditionally execute almost any instruction, whereas most other processors only allow conditional branches. The condition codes (eq, ne, cs, cc, etc.) are listed in that early section. So an add that executes only if the zero flag is clear would be addne. Also unlike most other processors, ARM (in ARM mode) lets you choose when to write the flags. Most others always update the flags on an add, for example; ARM only does so if you append the s: add does not, adds does. It gets tricky when you combine conditional execution with these other modifiers, for example is it addsne or addnes? The older, pre-unified syntax puts the condition first (addnes), while the unified syntax puts the s first (addsne); I use combinations like that so rarely that I don't have it memorized.
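As a rough illustration of those condition bits, the sketch below pulls the condition field (bits 31..28) and the S bit (bit 20) out of a 32-bit ARM-mode data-processing instruction word. The function names are hypothetical, and the example encodings in the comments were worked out by hand from the data-processing immediate format, so treat them as assumptions to check against the manual.

```c
#include <stdint.h>

/* Condition field: the top four bits of an ARM-mode instruction word. */
uint32_t arm_cond(uint32_t insn) { return insn >> 28; }

/* S bit of a data-processing instruction: 1 means the flags are updated. */
uint32_t arm_s_bit(uint32_t insn) { return (insn >> 20) & 1u; }

/* Hand-assembled examples (assumed encodings):
 *   0xE2833001  add   r3, r3, #1   cond = 0xE (AL), S = 0
 *   0xE2933001  adds  r3, r3, #1   cond = 0xE (AL), S = 1
 *   0x12833001  addne r3, r3, #1   cond = 0x1 (NE), S = 0
 */
```

The 16-bit Thumb `adds r3, #1` in the question (encoding 3301) has no such field, which is why its condition has to come from the preceding IT instruction.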
As already mentioned, the disassembler produces output that is not meant to be reassembled; there are additional items in the output to help you decode the instruction.
It looks like you are looking at Thumb-2 code, which is a Frankenstein mixture of ARM and Thumb. So you are going to have some ARM features and some Thumb features and, at least with binutils, some annoying binutils-isms (I don't have an ARM toolchain anymore to compare). For example, we know that many Thumb instructions modify the flags without it being an option, and the disassembler shows this by printing adds instead of add. Yet you cannot write adds r1,r2 in Thumb mode, because the assembler complains; it wants add r1,r2 even though the instruction modifies the flags. ARM is working to push a unified ARM/Thumb assembly syntax, which probably already works with their toolchain, but we will have to see what happens with the GNU tools.
So I wouldn't expect to be able to take the disassembly output and reassemble it, for those two reasons. The extra information is there to help you understand the specific instruction that was encoded.