A good reference for system calls
I need a good reference for system calls, ideally one with some nice examples. I need it because I am starting to write assembly code with the NASM assembler. I have this reference:
http://bluemaster.iu.hio.no/edu/dark/lin-asm/syscalls.html
which is quite nice and useful, but it has a lot of limitations because it doesn't explain what goes in the other registers. For example, if I am using the write syscall, I know I should put 1 in the EAX register, and ECX is probably a pointer to the string, but what about EBX and EDX? I would like that to be explained too: that EBX determines the input (0 for stdin, 1 for something else, etc.), that EDX is the length of the string to be entered, and so on. I hope it is clear what I want. I couldn't find any such material, which is why I am asking here.
Thanks in advance.
The standard programming language in Linux is C. Because of that, the best descriptions of the system calls show them as C functions to be called. Given their description as C functions and a knowledge of how to map them to the actual system call in assembly, you will be able to use any system call you want easily.
First, you need a reference for all the system calls as they would appear to a C programmer. The best one I know of is the Linux man-pages project, in particular the system calls section.
Let's take the write system call as an example, since it is the one in your question. As its manual page shows, the first parameter is a signed integer, which is usually a file descriptor returned by the open syscall. These file descriptors could also have been inherited from your parent process, as usually happens for the first three file descriptors (0=stdin, 1=stdout, 2=stderr). The second parameter is a pointer to a buffer, and the third parameter is the buffer's size (as an unsigned integer). Finally, the function returns a signed integer, which is the number of bytes written, or a negative number for an error.

Now, how to map this to the actual system call? There are many ways to do a system call on 32-bit x86 (which is probably what you are using, based on your register names). Be careful, because it is completely different on 64-bit x86: be sure you are assembling in 32-bit mode and linking a 32-bit executable, otherwise things can go wrong. The oldest, simplest and slowest of them on 32-bit x86 is the int $0x80 method.

For the int $0x80 method, you put the system call number in %eax, and the parameters in %ebx, %ecx, %edx, %esi, %edi, and %ebp, in that order. Then you execute int $0x80, and the return value from the system call is in %eax. Note that this return value is different from what the reference says; the reference shows how the C library will return it, but the system call itself returns -errno on error (for instance -EINVAL). The C library will move this to errno and return -1 in that case. See syscalls(2) and intro(2) for more detail.

So, in the write example, you would put the write system call number in %eax, the first parameter (file descriptor number) in %ebx, the second parameter (pointer to the string) in %ecx, and the third parameter (length of the string) in %edx. The system call will return in %eax either the number of bytes written, or the error number negated (if the return value is between -1 and -4095, it is a negated error number).

Finally, how do you find the system call numbers? They can be found at /usr/include/linux/unistd.h. On my system, this just includes /usr/include/asm/unistd.h, which finally includes /usr/include/asm/unistd_32.h, so the numbers are there (for write, you can see __NR_write is 4). The same goes for the error numbers, which come from /usr/include/linux/errno.h (on my system, after chasing the inclusion chain I find the first ones at /usr/include/asm-generic/errno-base.h and the rest at /usr/include/asm-generic/errno.h). For the system calls which use other constants or structures, their documentation tells which headers you should look at to find the corresponding definitions.

Now, as I said, int $0x80 is the oldest and slowest method. Newer processors have special system call instructions which are faster. To use them, the kernel makes available a virtual dynamic shared object (the vDSO; it is like a shared library, but in memory only) with a function you can call to do a system call using the best method available for your hardware. It also makes available special functions to get the current time without even having to do a system call, and a few other things. Of course, it is a bit harder to use if you are not using a dynamic linker.

There is also another older method, the vsyscall, which is similar to the vDSO but uses a single page at a fixed address. This method is deprecated, will result in warnings on the system log if you are using recent kernels, can be disabled on boot on even more recent kernels, and might be removed in the future. Do not use it.
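To make the return-value convention above concrete, here is a minimal C sketch, not a definitive implementation. It assumes a Linux system with glibc, and it uses the C library's generic syscall() wrapper rather than int $0x80 directly, so that it builds on any architecture. The function names raw_is_error, raw_to_libc and write_by_number are hypothetical and exist only for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>  /* SYS_write: the write number for this architecture */
#include <unistd.h>       /* syscall() */

/* Kernel convention described above: a raw return value between
 * -1 and -4095 is a negated error number; anything else is a result. */
int raw_is_error(long raw)
{
    return raw < 0 && raw >= -4095;
}

/* What the C library does with a raw kernel result:
 * store the error number in errno and return -1. */
long raw_to_libc(long raw)
{
    if (raw_is_error(raw)) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}

/* write(fd, buf, count) issued by system call number.
 * With int $0x80 on 32-bit x86, the number would go in %eax and the
 * three arguments in %ebx, %ecx and %edx, in that order.
 * Note: syscall() is itself a C library wrapper, so on failure it has
 * already converted the raw -errno into -1 plus errno. */
long write_by_number(int fd, const void *buf, size_t count)
{
    return syscall(SYS_write, fd, buf, count);
}
```

Calling write_by_number(1, "hi\n", 3) from a main function should print to standard output and return the byte count. In hand-written assembly there is no wrapper in between, so the check in raw_is_error is the one you would apply to %eax yourself.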
If you download that web page (like it suggests in the second paragraph) and download the kernel sources, you can click the links in the "Source" column, and go directly to the source file that implements the system calls. You can read their C signatures to see what each parameter is used for.
If you're just looking for a quick reference, each of those system calls has a C library interface with the same name minus the sys_ prefix. So, for example, you could check out man 2 lseek to get the information about the parameters for sys_lseek: off_t lseek(int fd, off_t offset, int whence); where, as you can see, the parameters match the ones from your HTML table.
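As a small illustration of reading parameters off the man page, the sketch below calls lseek through its C library interface, with the three arguments in the order lseek(2) documents them. It assumes a Linux or other POSIX system; the helper name file_size is hypothetical.

```c
#include <fcntl.h>      /* open(), O_RDONLY */
#include <sys/types.h>  /* off_t */
#include <unistd.h>     /* lseek(), close(), SEEK_END */

/* Returns the size in bytes of the file at `path`, or -1 on failure,
 * by seeking to the end of the file.
 * The arguments follow the documented order:
 *   fd     - file descriptor (first parameter)
 *   offset - byte offset, here 0 (second parameter)
 *   whence - what the offset is relative to, here SEEK_END (third) */
off_t file_size(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t end = lseek(fd, 0, SEEK_END);
    close(fd);
    return end;
}
```

With the int $0x80 method described in the other answer, those same three values would go in %ebx, %ecx and %edx respectively.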