Strange behavior of ldr [pc, #value]

Posted on 2024-08-18 17:03:06

I was debugging some C++ code (WinCE 6 on an ARM platform) and found some behavior that looks strange to me:

    4277220C    mov         r3, #0x93, 30
    42772210    str         r3, [sp]
    42772214    ldr         r3, [pc, #0x69C]
    42772218    ldr         r2, [pc, #0x694]
    4277221C    mov         r1, #0
    42772220    ldr         r0, [pc, #0x688]

The line 42772214 ldr r3, [pc, #0x69C] is used to get some constant from the .DATA section, at least I think so.

What is strange is that, according to the code, r3 should be filled with memory from address pc + 0x69C = 0x42772214 + 0x69C = 0x427728B0, but according to the memory contents it is loaded from 0x427728B8 (8 bytes further on). The same thing happens for the other ldr instructions.

Is this a fault of the debugger, or is my understanding of ldr/pc wrong?
Another thing I don't get: why is access to the .data section relative to the code being executed? I find that a little strange.

And one more issue: I cannot find the syntax of the first mov instruction (could anyone point me to an optype specification for the Thumb (1C2)?)

Sorry for the layman's description; I am only just getting familiar with assembly.


Comments (1)

心是晴朗的。 2024-08-25 17:03:06

The behavior you see is correct. When the PC is read, the value is offset by 8 bytes in ARM mode and by 4 bytes in Thumb mode.

From the ARM Architecture Reference Manual (the ARM ARM):

When an instruction reads the PC, the value read depends on which instruction set it comes from:

  • For an ARM instruction, the value read is the address of the instruction plus 8 bytes. Bits [1:0] of this value are always zero, because ARM instructions are always word-aligned.
  • For a Thumb instruction, the value read is the address of the instruction plus 4 bytes. Bit [0] of this value is always zero, because Thumb instructions are always halfword-aligned.

This way of reading the PC is primarily used for quick, position-independent addressing of nearby instructions and data, including position-independent branching within a program.
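The rule quoted above can be checked against the addresses in the question. The following is a minimal sketch, not a disassembler: the function name `pc_relative_address` and the `mode` parameter are made up for illustration, and it assumes the listing is ARM-mode code (each instruction is 4 bytes apart). For Thumb it also assumes the usual word alignment of the PC value that literal loads apply.

```python
def pc_relative_address(instr_addr, offset, mode="arm"):
    """Return the address read by `ldr rX, [pc, #offset]` at instr_addr."""
    if mode == "arm":
        # An ARM instruction reads the PC as its own address plus 8.
        pc_value = instr_addr + 8
    elif mode == "thumb":
        # A Thumb instruction reads the PC as its own address plus 4;
        # literal loads additionally align that value down to a word.
        pc_value = (instr_addr + 4) & ~0x3
    else:
        raise ValueError("mode must be 'arm' or 'thumb'")
    return (pc_value + offset) & 0xFFFFFFFF


# The three pc-relative loads from the listing in the question.
loads = [
    ("r3", 0x42772214, 0x69C),
    ("r2", 0x42772218, 0x694),
    ("r0", 0x42772220, 0x688),
]

for reg, addr, offset in loads:
    print(f"{reg} <- [0x{pc_relative_address(addr, offset):08X}]")
```

Under these assumptions the load into r3 reads 0x427728B8, which matches what the debugger showed, and the three loads read three adjacent words (0x427728B0, 0x427728B4, 0x427728B8).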

There are two reasons for using pc-relative addressing.

  1. Position-independent code, which applies in your case.
  2. Loading a nearby constant that cannot be encoded in a single instruction. For example, mov r3, #0x12345678 cannot be encoded as one ARM instruction, so the compiler may place the constant in a literal pool at the end of the function (next to the code, not in the .data section) and load it with something like ldr r3, [pc, #0x50].
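To see why a constant such as 0x12345678 needs a pc-relative load, recall that a classic ARM data-processing immediate is an 8-bit value rotated right by an even amount. The sketch below tests whether a 32-bit constant fits that form; the helper names `rol32` and `fits_arm_immediate` are hypothetical, and the check only covers the ARM-mode mov immediate (not mvn, movw/movt, or Thumb-2 encodings).

```python
def rol32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def fits_arm_immediate(value):
    """True if value equals some 8-bit number rotated right by an even amount."""
    # Rotating left by `rot` undoes a rotate-right by `rot`; if the result
    # fits in 8 bits, the constant is encodable with that rotation.
    return any(rol32(value, rot) <= 0xFF for rot in range(0, 32, 2))


for constant in (0x0, 0xFF, 0x24C, 0xFF000000, 0x101, 0x12345678):
    verdict = "fits in mov" if fits_arm_immediate(constant) else "needs a literal load"
    print(f"0x{constant:08X}: {verdict}")
```

Constants that fail this test are the ones a compiler typically stores near the function and fetches with ldr rX, [pc, #offset].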

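I am not sure what mov r3, #0x93, 30 means. Most likely it is the older two-operand immediate syntax, mov r3, #imm8, rotate, in which the 8-bit value is rotated right by the given amount: 0x93 rotated right by 30 bits (equivalently, left by 2) gives 0x24C. Note that ARM has no rol operation; rotating 0x93 left by 30 bits would give 0xC0000024 instead.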
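The two candidate readings of `#0x93, 30` can be compared with a few lines of arithmetic. This is a sketch under the assumption that the disassembler prints the immediate as an 8-bit value followed by a rotate-right amount, which is how the ARM immediate operand is encoded; `ror32` is a helper written for this example.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


imm8, rotation = 0x93, 30

# Reading 1: rotate right by 30, as in the ARM immediate encoding.
as_ror = ror32(imm8, rotation)

# Reading 2: rotate left by 30, which is the same as rotating right by 2.
as_rol = ror32(imm8, 32 - rotation)

print(f"0x93 ror 30 = 0x{as_ror:08X}")
print(f"0x93 rol 30 = 0x{as_rol:08X}")
```

The rotate-right reading yields 0x0000024C and the rotate-left reading yields 0xC0000024, so the two interpretations are easy to tell apart by inspecting r3 in the debugger after the mov executes.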
