Why does strace capture only one of many getpid() system calls?

Posted on 2024-11-03 04:42:47

I call getpid() many times in a program (to measure the cost of system calls), but when I trace the program with strace, only one getpid() call is captured.

The code is simple:

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

void print_usage(){
    printf("Usage: program count\n");
    exit(-1);
}

int main(int argc, char** argv){
    if(argc != 2)
        print_usage();
    int cnt = atoi(argv[1]);
    int i = 0;
    while(i++<cnt)
        getpid();
    return 0;
}

I used gdb and got this:

(gdb) disasse
Dump of assembler code for function getpid:
0xb76faac0 <getpid+0>:  mov    %gs:0x4c,%edx
0xb76faac7 <getpid+7>:  cmp    $0x0,%edx
0xb76faaca <getpid+10>: mov    %edx,%eax
0xb76faacc <getpid+12>: jle    0xb76faad0 <getpid+16>
0xb76faace <getpid+14>: repz ret 
0xb76faad0 <getpid+16>: jne    0xb76faadc <getpid+28>
0xb76faad2 <getpid+18>: mov    %gs:0x48,%eax
0xb76faad8 <getpid+24>: test   %eax,%eax
0xb76faada <getpid+26>: jne    0xb76faace <getpid+14>
0xb76faadc <getpid+28>: mov    $0x14,%eax
0xb76faae1 <getpid+33>: call   *%gs:0x10
0xb76faae8 <getpid+40>: test   %edx,%edx
0xb76faaea <getpid+42>: mov    %eax,%ecx
0xb76faaec <getpid+44>: jne    0xb76faace <getpid+14>
0xb76faaee <getpid+46>: mov    %ecx,%gs:0x48
0xb76faaf5 <getpid+53>: ret

I don't fully understand the assembly code, so a detailed explanation of it would be welcome. From what I can observe, "call *%gs:0x10" (which jumps into the vdso) is executed only for the first getpid() call, which may be why the subsequent getpid() calls are not captured. But I don't know why it is skipped.

Linux kernel: 2.6.24-29
gcc (GCC) 4.2.4
libc 2.7

Thanks!

佞臣 2024-11-10 04:42:47

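glibc caches the result, since it cannot change between calls. See, for example, the glibc source code.

So the real system call is executed only once. The other calls just read from the cache. (The code is not very simple, because it takes care of doing the right thing with threads.)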
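
The disassembly in the question can be read as the following C model. This is a sketch of my reading, not glibc's actual code: the names thread_model, model_getpid and count_kernel_entries are hypothetical, and mapping %gs:0x4c and %gs:0x48 to per-thread "pid" and "tid" fields is an assumption. The value 0x14 is 20, the getpid system call number on 32-bit x86.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for the per-thread fields read through %gs. */
struct thread_model {
    pid_t tid; /* assumed to correspond to %gs:0x48 */
    pid_t pid; /* assumed to correspond to %gs:0x4c */
};

static struct thread_model self;     /* zero-initialized: nothing cached */
static unsigned long kernel_entries; /* how often the kernel was entered */

pid_t model_getpid(void)
{
    pid_t old = self.pid;           /* mov %gs:0x4c,%edx */
    if (old > 0)                    /* cmp $0x0,%edx ; jle not taken */
        return old;                 /* repz ret: cached fast path */
    if (old == 0 && self.tid != 0)  /* mov %gs:0x48,%eax ; test ; jne */
        return self.tid;            /* second cached path */
    /* mov $0x14,%eax ; call *%gs:0x10: the only path into the kernel */
    pid_t result = (pid_t)syscall(SYS_getpid);
    kernel_entries++;
    if (old == 0)                   /* test %edx,%edx */
        self.tid = result;          /* mov %ecx,%gs:0x48: fill the cache */
    return result;
}

/* Reset the model, call model_getpid() n times, report kernel entries. */
unsigned long count_kernel_entries(unsigned long n)
{
    self.tid = 0;
    self.pid = 0;
    kernel_entries = 0;
    for (unsigned long i = 0; i < n; i++)
        (void)model_getpid();
    return kernel_entries;
}

#ifdef DEMO_MAIN
int main(void)
{
    unsigned long entries = count_kernel_entries(1000000);
    assert(model_getpid() == getpid());
    printf("kernel entered %lu time(s)\n", entries);
    return 0;
}
#endif
```

Compiling with -DDEMO_MAIN gives a standalone program. In this model only the first call reaches the syscall() line, which would match strace showing a single getpid entry.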

山有枢 2024-11-10 04:42:47

glibc caches the PID. The first time you call getpid(), it asks the kernel for the PID; on later calls it just returns the value obtained from that first getpid system call.

glibc code:

pid_t
__getpid (void)
{
#ifdef NOT_IN_libc
  INTERNAL_SYSCALL_DECL (err);
  pid_t result = INTERNAL_SYSCALL (getpid, err, 0);
#else
  pid_t result = THREAD_GETMEM (THREAD_SELF, pid);
  if (__builtin_expect (result <= 0, 0))
    result = really_getpid (result);
#endif
  return result;
}

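If you want to test the overhead of system calls, gettimeofday() is often used to do just that: the work done by the kernel is very small, and neither the compiler nor the C library can optimize away calls to it. (On systems where the vDSO implements gettimeofday(), however, the call may be served in user space without entering the kernel.)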
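
Another way to make every iteration enter the kernel is to bypass the glibc wrapper with syscall(2). The sketch below assumes Linux. The function name time_raw_getpid and the BENCH_MAIN guard are my own choices for illustration, not part of the question's program.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Issue `count` raw getpid system calls and return the elapsed time in
   nanoseconds, or -1.0 if the clock could not be read. */
double time_raw_getpid(long count)
{
    struct timespec start, end;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (long i = 0; i < count; i++)
        (void)syscall(SYS_getpid); /* no wrapper, so no PID cache */
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return (double)(end.tv_sec - start.tv_sec) * 1e9
         + (double)(end.tv_nsec - start.tv_nsec);
}

#ifdef BENCH_MAIN
int main(int argc, char **argv)
{
    long count = (argc == 2) ? strtol(argv[1], NULL, 10) : 0;
    if (count <= 0) {
        fprintf(stderr, "Usage: %s count\n", argv[0]);
        return EXIT_FAILURE;
    }

    double ns = time_raw_getpid(count);
    if (ns < 0.0) {
        perror("clock_gettime");
        return EXIT_FAILURE;
    }
    printf("%ld calls, %.1f ns per call\n", count, ns / (double)count);
    return EXIT_SUCCESS;
}
#endif
```

Built with -DBENCH_MAIN and run under strace -c, this should report roughly one getpid entry per iteration. Timings taken while strace is attached are much slower and are not representative.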

纸伞微斜 2024-11-10 04:42:47

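Nowadays, with the introduction of PID namespaces, and after numerous bugs were found when applications receive signals or create child processes by calling syscall() instead of fork(), vfork() and clone(), the PID is no longer cached in glibc. The manual points this out:

From glibc version 2.3.4 up to and including version 2.24,
the glibc wrapper function for getpid() cached PIDs, with the goal of
avoiding additional system calls when a process calls getpid()
repeatedly. Normally this caching was invisible, but its correct
operation relied on support in the wrapper functions for fork(2),
vfork(2), and clone(2): if an application bypassed the glibc
wrappers for these system calls by using syscall(2), then a call to
getpid() in the child would return the wrong value (to be
precise: it would return the PID of the parent process). In
addition, there were cases where getpid() could return the wrong
value even when invoking clone(2) via the glibc wrapper function.
(For a discussion of one such case, see BUGS in clone(2).)
Furthermore, the complexity of the caching code had been
the source of a few bugs within glibc over the years.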
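
The failure mode described in the manual can be probed with a small check. This is a sketch under stated assumptions: Linux on an architecture where the raw clone system call takes the flags as its first argument (for example x86-64 or AArch64), and the function name child_sees_stale_pid is hypothetical. With only SIGCHLD in the flags and a null stack, the raw clone behaves like fork().

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create a child with a raw clone system call (bypassing the glibc
   wrappers) and compare getpid() with the kernel's answer in the child.
   Returns 1 if the wrapper returned a stale PID, 0 if it was correct,
   and -1 on error. */
int child_sees_stale_pid(void)
{
    /* Assumption: flags-first argument order for the raw clone call. */
    long ret = syscall(SYS_clone, (unsigned long)SIGCHLD, 0UL, 0UL, 0UL, 0UL);
    if (ret == -1)
        return -1;

    if (ret == 0) {
        /* Child: the wrapper may consult a cache, the raw call cannot. */
        pid_t via_wrapper = getpid();
        pid_t via_kernel = (pid_t)syscall(SYS_getpid);
        _exit(via_wrapper == via_kernel ? 0 : 1);
    }

    /* Parent: collect the child's verdict from its exit status. */
    int status;
    if (waitpid((pid_t)ret, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Going by the manual text above, a caching glibc (2.3.4 through 2.24) would be expected to yield 1 here, and a glibc without the PID cache would yield 0.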