Difference between a simple ret and the _exit function in NASM x86-64
I have been struggling for the last couple of days with x86-64 assembly (using NASM on macOS).
I'd like to show two pieces of code.
Let's say that I have an array and I want to print it. This is the code that I have:
_main:
    push rbp
    mov rbp, rsp
    lea rdi, array
    mov rsi, length
    call _printy
    pop rbp
    mov rdi, 0
    call _exit
_printy:
    push rbp
    mov rbp, rsp
    sub rsp, 16
    xor rbx, rbx
    mov r12, rdi
    mov r15, rsi
.loop:
    cmp rbx, r15
    jge .done
    lea rdi, format
    mov rsi, [r12 + rbx * 4]
    xor rax, rax
    call _printf
    add rbx, 1
    jmp .loop
.done:
    leave
    ret
Brief explanation:
- pass the array in rdi and the length in rsi
- call _printy
- create a 16-byte-aligned stack frame for the printf call, and keep the data in call-preserved registers
- loop while the condition holds, then leave
- then exit
And this is the painful part.
Consider the following code for the exit portion:
call printy(char const*)
mov eax, 0
pop rbp
ret
This is the code generated by Compiler Explorer. If I try to replicate it, printf fails miserably. Stepping with the debugger I get:
thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xe)
frame #0: 0x0000000100015501 dyld`start + 465
dyld`start:
-> 0x100015501 <+465>: mov rax, qword ptr [rbx + 0x8]
0x100015505 <+469>: mov edi, dword ptr [rax + 0x34]
0x100015508 <+472>: xor esi, esi
0x10001550a <+474>: call 0x100041a26 ; dyld3::MachOFile::isSimulatorPlatform(dyld3::Platform, dyld3::Platform*)
I personally cannot understand the reason behind this crash, nor why the version using the _exit function succeeds, so this is the first doubt I hope someone can clear up for me.
My second question is about register sizes. Is this code better than the first one?
_main:
    push rbp
    mov rbp, rsp
    lea rdi, array
    mov esi, length
    call _printy
    pop rbp
    mov rdi, 0
    call _exit
_printy:
    push rbp
    mov rbp, rsp
    sub rsp, 16
    xor ebx, ebx
    mov r12, rdi
    mov r15d, esi
.loop:
    cmp ebx, r15d
    jge .done
    lea rdi, format
    mov esi, [r12 + rbx * 4]
    xor rax, rax
    call _printf
    add ebx, 1
    jmp .loop
.done:
    leave
    ret
As you can see, I am loading into esi instead of rsi, and using ebx and r15d.
The line I am interested in is mov esi, [r12 + rbx * 4].
Is it better to use esi (I am working with an array of integers), or does using rsi not make much difference? By using esi, I am storing into the lower 4 bytes of the rsi register, while using rsi itself I am consuming all 8 bytes. Is there some kind of performance hit? At the end of the day I guess the number will be zero-extended and, whether you save 4 or 8 bytes, the rsi register cannot hold another integer.
And what about this similar operation: movsx rsi, dword [r12 + rbx * 4]? Now I am explicit (I am saying that I want to store 4 bytes, and I am sign-extending). Is this better than directly using the 4-byte register? Or is it pretty similar to using rsi, [r12 + rbx * 4]?
If you can clear up just some of my doubts I would be really happy. Thanks for your attention.
I'll answer the question in the title:
ret is used to return to the caller, while _exit is a request that the operating system terminate the program immediately, so the program executes no further instructions.
When can you use ret vs. calling exit()?
You can call exit() at any point, no matter how deep the call chain is, and it will terminate the program without giving the functions further up the call chain a chance to execute or clean up at all. This would be a somewhat forceful termination if there are more things on the call stack than just main, but it is not necessarily a logic error if that's what the program should do, and sometimes it is used as a simple approach to error handling (e.g. print an error message and halt the program).
ret can be used to return to a caller, assuming you've cleaned up the stack to the same state it was in upon function entry; in other words, the return address is the top thing on the stack. (You also have to restore the call-preserved registers, or else the caller may not work.)
main() is typically a function called by _start in crt0.o, where _start is the default program entry point.
However, _main (or any other externally visible symbol) can be set up directly as the program entry point, given the right linker options when the program is built.
The program entry point is the first instruction to execute in a new program. It is not invoked as a function by the operating system; control is merely transferred to it, which means there is no return address on the stack and there are no parameters (at least, usually not following the regular calling convention). Because the program entry point is not invoked as a real function, it also cannot "return" to its caller (there is no caller of the program entry point). Therefore the program entry point (e.g. _start, when it gets control back by main() returning) must call exit() for proper program termination. (The same applies if an alternate program entry point is selected: it cannot return, and would have to use exit() or some other way to terminate the program.)
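
To connect this back to the code in the question, here is a minimal sketch. It assumes the crash comes from _printy overwriting rbx, r12 and r15, which the x86-64 System V calling convention treats as call-preserved. The faulting instruction in the trace is mov rax, qword ptr [rbx + 0x8] with address=0xe, which is what you would see if rbx still held a final loop counter of 6 (the array is not shown in the question, so that length is an assumption). The sample data, the format string, the file name and the build commands in the comments are hypothetical. The loop uses the 32-bit load from the second listing.

```nasm
; Assumed build commands (macOS, x86-64), file name is hypothetical:
;   nasm -f macho64 printy.asm -o printy.o
;   clang -arch x86_64 printy.o -o printy

        default rel                     ; RIP-relative addressing for 64-bit Mach-O

        global  _main
        extern  _printf

        section .data
array:  dd      10, 20, 30, 40          ; hypothetical sample data, 32-bit ints
length  equ     ($ - array) / 4         ; element count
format: db      "%d", 10, 0             ; hypothetical format string

        section .text
_main:
        push    rbp
        mov     rbp, rsp                ; rsp is 16-byte aligned from here on
        lea     rdi, [array]
        mov     esi, length
        call    _printy
        xor     eax, eax                ; return value 0
        pop     rbp
        ret                             ; caller's rbx, r12, r15 are intact

_printy:
        push    rbp
        mov     rbp, rsp
        push    rbx                     ; save the call-preserved registers
        push    r12                     ; that this function overwrites
        push    r15
        sub     rsp, 8                  ; three pushes + 8 bytes = 32, keeps alignment
        xor     ebx, ebx                ; index = 0 (also clears the upper half of rbx)
        mov     r12, rdi                ; array pointer
        mov     r15d, esi               ; element count
.loop:
        cmp     ebx, r15d
        jge     .done
        lea     rdi, [format]
        mov     esi, [r12 + rbx*4]      ; load one 4-byte element, zero-extends into rsi
        xor     eax, eax                ; no vector registers used by this variadic call
        call    _printf
        add     ebx, 1
        jmp     .loop
.done:
        add     rsp, 8
        pop     r15                     ; restore in reverse order
        pop     r12
        pop     rbx
        pop     rbp
        ret
```

With the three registers saved and restored, the code that called _main gets them back unchanged, so the final ret has a chance of working. Calling _exit hides the problem because control never returns to that caller. On the register-size question: mov esi, [r12 + rbx*4] reads exactly one 4-byte element and zero-extends it into rsi, whereas mov rsi, [r12 + rbx*4] reads 8 bytes, that is, the element plus its neighbour. NASM spells the 32-to-64-bit sign extension movsxd rsi, dword [r12 + rbx*4]. For printf with %d only the low 32 bits are used, so the plain esi load is enough.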