What is the relationship between the carry flag and syscall in assembly (x64, Intel syntax, on Mac OS)?
I am new to assembly language, and I have to implement the read function in x64 assembly on macOS. So far, this is what I did:
    ;;;;;;ft_read.s;;;;;;
    global _ft_read
    section .text
    extern ___error

    _ft_read:
        mov rax, 0x2000003   ; store the syscall number of read in rax
        syscall              ; call read with rdi, rsi, rdx ==> rax = read(rdi, rsi, rdx)
        cmp rax, 103         ; compare rax with 103 by subtracting 103 from rax ==> rax - 103
        jl _ft_read_error    ; if rax is less than 103 (signed), jump to _ft_read_error
        ret                  ; else return rax, which holds the return value of the syscall

    _ft_read_error:
        push rax             ; save the error code (this also realigns the stack for the call)
        call ___error        ; rax = address of errno
        pop rcx              ; restore the error code
        mov [rax], ecx       ; errno is a 32-bit int, so store only the low 32 bits
        mov rax, -1
        ret
As you can see above, I call read with syscall, and then I compare the return value of the read syscall, which is stored in rax, with 103. I will explain why I compare it with 103, but before that, let me explain something else: errno. This is what the macOS manual page says about errno:
When a system call detects an error, it returns an integer value indicating failure (usually -1) and sets the variable errno accordingly. <This allows interpretation of the failure on receiving a -1 and to take action accordingly.> Successful calls never set errno; once set, it remains until another error occurs. It should only be examined after an error. Note that a number of system calls overload the meanings of these error numbers, and that the meanings must be interpreted according to the type and circumstances of the call.

The following is a complete list of the errors and their names as given in <sys/errno.h>.

0 Error 0. Not used.
1 EPERM Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resources.
2 ENOENT No such file or directory. A component of a specified pathname did not exist, or the pathname was an empty string.
... (I'll skip this part; I wrote this line, by the way) ...
101 ETIME STREAM ioctl() timeout. This error is reserved for future use.
102 EOPNOTSUPP Operation not supported on socket. The attempted operation is not supported for the type of socket referenced; for example, trying to accept a connection on a datagram socket.
As I understand it, after a lot of debugging with lldb, I noticed that syscall returns one of the numbers shown in the errno man page. For example, when I pass a bad file descriptor to my ft_read function from main.c, like this:

    int bad_file_des = -1337; // a file descriptor that does not exist, of course; you can change it to -42 if you like
    ft_read(bad_file_des, buff, 300);

our syscall returns 9, which is stored in rax. So I check whether rax < 103 (because errno values range from 0 to 102) and, if so, jump to _ft_read_error, because that is what it should do.
Everything worked as I planned, but then a problem came out of nowhere. When I open an existing file and pass its file descriptor to my ft_read function, as in the main.c below, the read syscall returns the number of bytes read. This is what read returns according to the manual:
On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. See also NOTES.

On error, -1 is returned, and errno is set appropriately. In this case, it is left unspecified whether the file position (if any) changes.
In my main.c, I pass my ft_read function a good file descriptor, a buffer to store the data, and 50 as the number of bytes to read, so syscall returns 50 in rax. Then the comparison does its job: rax = 50 < 103, so it jumps to _ft_read_error even though there is no error, just because 50 falls within the range of errno error numbers, although it is not an error number in this case.
Someone suggested using jc (jump if the carry flag is set) rather than jl (jump if less), as shown in the code below:
    ;;;;;;ft_read.s;;;;;;
    global _ft_read
    section .text
    extern ___error

    _ft_read:
        mov rax, 0x2000003   ; store the syscall number of read in rax
        syscall              ; call read with rdi, rsi, rdx ==> rax = read(rdi, rsi, rdx)
                             ; deleted the cmp
        jc _ft_read_error    ; if the carry flag is set, jump to _ft_read_error
        ret                  ; else return rax, which holds the return value of the syscall

    _ft_read_error:
        push rax             ; save the error code (this also realigns the stack for the call)
        call ___error        ; rax = address of errno
        pop rcx              ; restore the error code
        mov [rax], ecx       ; errno is a 32-bit int, so store only the low 32 bits
        mov rax, -1
        ret
And guess what, it works perfectly: with my ft_read, errno is 0 when there is no error, and it holds the appropriate error number when there is an error.

But the problem is that I don't know why the carry flag gets set when there is no cmp. Does syscall set the carry flag when there is an error during the call, or is something else happening in the background? I want a detailed explanation of the relation between syscall and the carry flag. I am still new to assembly and I badly want to learn it. Thanks in advance.

What is the relation between syscall and the carry flag, and how does syscall set it?

This is the main.c that I compile together with the assembly code above:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <errno.h>

    ssize_t ft_read(int fildes, void *buf, size_t nbyte);

    int main()
    {
        /*-----------------------------------------------------------------------*/
        ///////////////////////////////////////////////////////////////////////////
        /********************************ft_read**********************************/
        int fd = open("./main.c", O_RDONLY);
        char *buff = calloc(sizeof(char), 50 + 1);
        int ret = ft_read(fd, buff, 50);
        printf("ret value = %d, error value = %d : %s\n", ret, errno, strerror(errno));
        //don't forget to free ur buffer bro, this is just a test main don't be like me.
        return (0);
    }
Comments (1)
Part of the confusion is that the term "system call" is used for two things that are really different:

1. The actual request to the kernel to read from a file, as invoked by executing the syscall instruction.
2. The C function read(), provided by the userspace C library as a way for C programs to conveniently access the functionality of #1.

The man page documents how to use #2, but in assembly you are working with #1. The overall semantics are the same, but the details of how you access them are different.
In particular, the C function (#2) follows the convention that errors are indicated by returning -1 from the function and setting the variable errno. However, this is not a convenient way for #1 to indicate errors. errno is a global (or thread-local) variable located somewhere in the program's memory; the kernel doesn't know where, and it would be awkward to tell it, so the kernel can't easily write this variable directly. It's simpler for the kernel to return error codes some other way, and leave it up to the C library to set the errno variable.

The convention that BSD-based operating systems generally follow is that the kernel system call (#1) will set or clear the carry flag according to whether an error occurred. If no error occurred, rax contains the system call's return value (here, the number of bytes read); if an error did occur, eax contains the error code (it's normally a 32-bit value, since errno is an int). So if you are writing in assembly, that is what you should expect to see.

As to how the kernel manages to set/clear the carry flag: when the system call is complete, the kernel executes the sysret instruction to transfer control back to user space. One of the functions of this instruction is to restore the rflags register from r11. The kernel will have saved your process's original rflags when the system call began, so it merely has to set or clear the low-order bit (that's where the carry flag is) in this 64-bit value before or after loading it into r11 in preparation for sysret. Then when your process continues execution with the instruction following your syscall, the carry flag will be in the corresponding state.

The cmp instruction is certainly one of the ways that an x86 CPU can set the carry flag, but it's by no means the only way. And even if it were, it shouldn't surprise you not to see that code in the userspace program, since it's the kernel that determines how it is set.

In order to implement #2, the C library's read() function needs to interface between the kernel's convention (#1) and what the C programmer is expecting (#2), so it has to contain some code that checks the carry flag and populates errno if needed, much as the jc version in the question does.

There is some more info at "64-bit syscall documentation for MacOS assembly". I wish I could cite some more authoritative documentation, but I don't know where to find it. What's here seems to be "common knowledge".