Why does GCC generate an extra ADDS instruction after LDR to load a .rodata pointer on the ARM Thumb instruction set?
This code:
const char padding[] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, };
const char myTable[] = { 1, 2, 3, 4 };
int keepPadding() {
    return (int)(&padding);
}
int foo() {
    return (int)(&myTable); // <-- this is the part I'm looking at
}
compiles to the following assembly for the Thumb instruction set (abbreviated for clarity). Note particularly the adds as the second instruction of foo:
...
foo:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	ldr	r0, .L5
	@ sp needed
	adds	r0, r0, #10
	bx	lr
.L6:
	.align	2
.L5:
	.word	.LANCHOR0
	.size	foo, .-foo
	.align	1
	.global	bar
	.syntax unified
	.code	16
	.thumb_func
	.type	bar, %function
...
myTable:
	.ascii	"\001\002\003\004"
It looks like it's loading a pointer (ldr) to the top of .rodata and then programmatically offsetting to the location of myTable (adds). But why not just load the address of the table itself directly?
Note: when I remove the const, it seems to do it without the adds instruction (with myTable in .data).
The context of the question is that I'm trying to hand-optimize some C firmware and noticed this adds instruction that seems to be superfluous, so I'm wondering if there's a way to restructure my code to get rid of it.
Note: this is all compiled for the ARM Thumb instruction set as follows (using arm-none-eabi-gcc version 11.2.1):
arm-none-eabi-gcc -Os -c -mcpu=cortex-m0 -mthumb temp.c -S
Also note: the example code here is intended to represent a snippet of a larger codebase. If myTable were the only thing compiled, it would land at offset 0 in .rodata and the adds instruction would disappear, but that is not the typical case in a real-world scenario. To represent the typical real-world scenario that produces this assembly, I added padding before the table.
See also: it is reproduced on Godbolt.
Comments (1)
The question originally contained just this:
but it did not produce the adds:
The question has since been edited to show a reproducible example, and this answer has been edited to match. I will still leave the answer working toward the same solution, though, since it may be of interest that getting to the anchor took a few components to keep the problem from being optimized out.
So from your question and this:
It is obvious why myTable is at an offset of 10.
But padding is optimized out so you still end up with the same result.
So:
The name of that function implies you know all of this already and know what it took to make a minimum example, etc.
It is generating an anchor then referencing from the anchor rather than directly to the label.
I suspect it is to allow for an optimization of the ldr. Let's try:
Yeah, so that fixed it. But what about linking it?
Nope. I was hoping that the linker would replace the pc-relative load and turn it into a mov r0,#0, saving the load, which is (or might be) an optimization for systems that are not Cortex-M (or even for Cortex-M).
Note: this also works
The anchor was not used so the address of myTable was used directly.
From my perspective the "why" is that an anchor was used, and the padding in front put myTable at an offset from the anchor. So the ldr loads the anchor address, and the adds then gets you from the anchor to the table.
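The ldr/adds pair can be imitated in plain C to make the arithmetic concrete. This is only a sketch: struct rodata_model, rodata, and table_via_anchor are hypothetical names that mimic the layout from the question (10 bytes of padding followed by the 4-byte table), not anything GCC itself emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the section layout from the question:
 * padding[10] sits first, so myTable starts 10 bytes past the anchor. */
struct rodata_model {
    char padding[10];
    char myTable[4];
};

static const struct rodata_model rodata = {
    { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
    { 1, 2, 3, 4 }
};

/* Mirrors the two instructions in foo:
 *   ldr  r0, .L5        -> load the anchor (start of the block)
 *   adds r0, r0, #10    -> step from the anchor to myTable       */
const char *table_via_anchor(void)
{
    const char *anchor = (const char *)&rodata;              /* ldr  */
    size_t offset = offsetof(struct rodata_model, myTable);  /* #10  */
    assert(offset == 10);  /* char arrays have no alignment padding */
    return anchor + offset;                                  /* adds */
}
```

If padding were removed from the model, the offset would be 0, which matches the question's observation that the adds disappears when myTable is the only object.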
Why the anchor? Exercise for the reader, or someone else.
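As a partial pointer on that exercise: GCC documents an option, -fsection-anchors, that tries to reduce the number of symbolic address calculations by using shared anchor symbols to address nearby objects, and -fno-section-anchors turns it off. The file below is a hedged sketch for experimenting with it, not necessarily what this answer originally tried. otherTable and sumAt are hypothetical additions; the idea is that a function touching two nearby objects may be able to share one loaded anchor, whereas without anchors each address would need its own literal-pool word.

```c
#include <stdint.h>

const char padding[]    = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
const char myTable[]    = { 1, 2, 3, 4 };
const char otherTable[] = { 5, 6, 7, 8 };  /* hypothetical second table */

/* Keeps padding alive, as keepPadding does in the question. */
uintptr_t keepPadding(void)
{
    return (uintptr_t)&padding;
}

/* Reads from two nearby objects. The index is a run-time value so the
 * compiler cannot fold the loads into a constant. */
int sumAt(unsigned i)
{
    return myTable[i & 3] + otherTable[i & 3];
}

/* To compare the generated assembly (same command as in the question,
 * second line adds the documented GCC flag):
 *
 *   arm-none-eabi-gcc -Os -c -mcpu=cortex-m0 -mthumb temp.c -S
 *   arm-none-eabi-gcc -Os -c -mcpu=cortex-m0 -mthumb -fno-section-anchors temp.c -S
 */
```

Whether dropping the anchor is a net win depends on the rest of the code: it may remove the adds from a function like foo, at the possible cost of extra literal-pool entries in functions that use several nearby objects, so it is worth checking the overall code size rather than a single function.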