How to profile a continuously running server on FreeBSD
Possible Duplicate:
Saving gmon.out before killing a process
I'm trying to profile a server (I have the source code; it is written in C) in a Linux environment. The server runs continuously, like a web server. I'm trying to use gprof to profile it. If the server exits by itself, a gmon.out file is generated, and I can use gprof with gmon.out to understand the profile data. The problem is that this server runs continuously, waiting for incoming socket connections, requests, etc. If I kill the server, gmon.out is not generated. At this point I see the following options:
- Change the source code so the server profiles itself and logs this information after receiving a SIGKILL signal. This is by far the ugliest solution and may introduce noise in the measurement.
- Maybe there is a way to profile this server using gprof while the server is still running.
- Are there other tools to try?
EDIT: The server is a multi-process server running on FreeBSD 7.2.
I'm sure people have solved this kind of problem before. I failed to find useful information on SO or elsewhere.
I appreciate any thoughts/solutions people have.
Thanks a bunch.
UPDATE 1:
- gprof doesn't seem to work with a multi-process server. When I manage to get gmon.out after running my server, only the parent process is profiled, and it doesn't actually do the real work!
- oProfile doesn't support FreeBSD, which is what my server runs on. For various reasons I can't (am not allowed to) change the OS.
- The Valgrind website doesn't have a port for FreeBSD, but there are some references to a FreeBSD port. I failed to find the FreeBSD port source.
Somehow I managed to get the ports for valgrind. When I run make, I get the following errors:
=> valgrind-stable-352.tar.gz doesn't seem to exist in /usr/obj/ports/distfiles/.
=> Attempting to fetch from ftp://ftp.FreeBSD.org/pub/FreeBSD/ports/distfiles/.
fetch: ftp://ftp.FreeBSD.org/pub/FreeBSD/ports/distfiles/valgrind-stable-352.tar.gz: File unavailable (e.g., file not found, no access)
=> Attempting to fetch from http://www.rabson.org/.
fetch: http://www.rabson.org/valgrind-stable-352.tar.gz: No address record
=> Couldn't fetch it - please try to retrieve this
=> port manually into /usr/obj/ports/distfiles/ and try again.
*** Error code 1
I tried to find valgrind-stable-352.tar.gz on the web. All of the links I found are dead.
I got pstack installed on my FreeBSD box and then realised that pstack gives only a stack trace. Reference: http://sourceforge.net/projects/bsd-pstack/
My understanding is that systemtap is only for kernel-space events, instrumentation etc.
I could be wrong or have insufficient information. Please correct me and give your thoughts. I really appreciate your assistance.
UPDATE 2:
I think it will be helpful to give some details about the server that I'm trying to profile.
- It is a multi-server program. It is I/O bound; to be specific, on a MySQL database.
- No threads are involved. Each child server process handles only one request. A configurable number of processes is created when the server starts.
- I would like to find the time spent in each function and how often it is called. The functions are a mix of CPU-bound and I/O-bound code (I believe more I/O).
- It is running on FreeBSD 7.2.
- It is written in C. The number of reads from the database via this server is much greater than the number of writes.
While you certainly should take precautions when profiling critical production systems, use oprofile and/or systemtap. They're likely included with your distro already.
Even if you get gprof to serve you, there are problems.
It is blind to any system calls or I/O. It is based on the assumption that your program never blocks unnecessarily. It only looks at CPU-bound issues.
If there is any recursion, it just can't handle it.
The times it gives you are based on shaky assumptions, such as that every call to a routine takes about the same amount of time. It gives you no line-level information.
Measuring is one thing, but if you want to find "bottlenecks" that are doing unnecessary things, whether CPU or I/O, a very rough but effective tool is lsstack (which I think is on SourceForge).
Also, take a look at Zoom. It is a wall-time stack sampler for Linux. It gives line-level percentages, and I believe it can be attached to and detached from a running process.
You could just use PmcTools - FreeBSD's oProfile-like alternative.
You can override the SIGTERM handler to call exit(0), which will cause gprof to generate the usual gmon.out.
Extend your server by a method (a command sent through a socket perhaps) to quit it smoothly and there you have your gmon.out. Or am I missing something and it's totally not possible to let it exit without killing it?
If you're able to try a Fedora/RHEL Linux box for development testing, systemtap there should give you good visibility into your server processes. For example, if you wish to sample active functions in userspace programs, something relatively simple like this may help:
# stap -e 'global fns; probe timer.profile {if (user_mode()) fns[usymdata(uaddr())] <<< 1 }' -d /bin/yourserver -d /lib/yourlibrary.so -d /lib/yourotherlibrary.so
Press ^C when you're done. A report may look like:
fns["memset /lib64/libc-2.12.so+0xa7d/0xb20"] @count=0x56 @min=0x1 @max=0x1 @sum=0x56 @avg=0x1
fns["memset /lib64/libc-2.12.so+0x560/0xb20"] @count=0x12 @min=0x1 @max=0x1 @sum=0x12 @avg=0x1
fns["__GI_strlen /lib64/libc-2.12.so+0x0/0x50"] @count=0x4 @min=0x1 @max=0x1 @sum=0x4 @avg=0x1
fns["gobble_file /bin/ls+0x729/0xc70"] @count=0x1 @min=0x1 @max=0x1 @sum=0x1 @avg=0x1
fns["getuser /bin/ls+0x1c/0xa0"] @count=0x1 @min=0x1 @max=0x1 @sum=0x1 @avg=0x1
fns["getuser /bin/ls+0x23/0xa0"] @count=0x1 @min=0x1 @max=0x1 @sum=0x1 @avg=0x1
You might want to look at Dyninst : http://www.dyninst.org/
It is a ptrace()-based API for dynamically adding and removing instrumentation to running code. You can use it for debugging, profiling, etc.
Good luck.
I'm not very familiar with the matter, but can't DTrace be used to do this?
FreeBSD just improved support for that.
http://wiki.freebsd.org/DTrace/userland
This might be a case for PMP