Profiling a C shared library called by a Ruby program

Posted 2024-08-18 06:09:04

I have a program written in Ruby and C. The C portion is a shared library, which is an extension for the Ruby program. I want to profile the C shared library I wrote, using gprof. I compile the shared library like this:

gcc -I. -I/usr/lib/ruby/1.8/i486-linux -I/usr/lib/ruby/1.8/i486-linux -I. -D_FILE_OFFSET_BITS=64  -fPIC -fno-strict-aliasing -g -march=i686 -O2 -ggdb -pg -fPIC -c extension.c
gcc -shared -o extension.so extension.o -L. -L/usr/lib -L. -Wl,-Bsymbolic-functions -rdynamic -Wl,-export-dynamic  -lruby1.8  -lpthread -lrt -ldl -lcrypt -lm -lc

Then I execute the ruby program, which loads this shared library, and I expect a gmon.out file in the current directory, but for some reason the file gmon.out does not get created. How do I do this?

I googled for this but couldn't find a satisfactory answer (which worked).

P.S. - As a workaround I can have a modified version of the extension which is a pure C program (instead of being created as a shared library) that I can use to profile, but it becomes tedious to maintain two versions of the same C extension (large number of differences between the two).

I tried writing a C program which uses the shared library directly too. I immediately get a page fault in one of the ruby library functions which get called during the initialization of the shared library. I think it's really expecting to be loaded from a ruby program, which may internally be doing some magic.

(gdb) bt
#0  0x0091556e in st_lookup () from /usr/lib/libruby1.8.so.1.8
#1  0x008e87c2 in rb_intern () from /usr/lib/libruby1.8.so.1.8
#2  0x008984a5 in rb_define_module () from /usr/lib/libruby1.8.so.1.8
#3  0x08048dd0 in Init_SimilarStr () at extension.c:542
#4  0x0804933e in main (argc=2, argv=0xbffff454) at extension.c:564

Update: Never mind. I used #ifdef to compile out Ruby portions of the extension and get a profile. Closing.

Comments (3)

∞梦里开花 2024-08-25 06:09:04

In this situation I found oprofile to be a much better option than gprof for profiling; its reports are far more comprehensive. Using #ifndef PROFILE, I compiled out the Ruby portions of the C extension that were causing the segfaults (not all of them were) and replaced them with non-Ruby code. I wrote a main() routine within the extension itself to call the functions in the extension, then set up a makefile to compile the extension as a C program with PROFILE defined. Finally, I installed oprofile on Ubuntu and wrote this script:

#!/bin/bash
sudo opcontrol --reset
sudo opcontrol --start
./a.out Rome Damascus NewYork Delhi Bangalore
sudo opcontrol --shutdown
opreport -lt1

I compiled my program and ran the script above, which produces output like this from the "opreport" command:

...
...
Killing daemon.
warning: /no-vmlinux could not be found.
warning: [vdso] (tgid:10675 range:0x920000-0x921000) could not be found.
warning: [vdso] (tgid:1270 range:0xba1000-0xba2000) could not be found.
warning: [vdso] (tgid:1675 range:0x973000-0x974000) could not be found.
warning: [vdso] (tgid:1711 range:0x264000-0x265000) could not be found.
warning: [vdso] (tgid:1737 range:0x990000-0x991000) could not be found.
warning: [vdso] (tgid:2477 range:0xa53000-0xa54000) could not be found.
warning: [vdso] (tgid:5658 range:0x7ae000-0x7af000) could not be found.
CPU: Core Solo / Duo, speed 1000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Unhalted clock cycles) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        app name                 symbol name
12731    32.8949  a.out                    levenshtein
11958    30.8976  a.out                    corpora_pass2
5231     13.5161  no-vmlinux               /no-vmlinux
4021     10.3896  a.out                    corpora_pass1
1733      4.4778  libc-2.10.1.so           /lib/tls/i686/cmov/libc-2.10.1.so
542       1.4004  ld-2.10.1.so             /lib/ld-2.10.1.so
398       1.0284  a.out                    method_top_matches

There it is: the top consumer is the function levenshtein(). I followed this with another command that generates disassembled output annotated with the source code and the sample count and percentage for each line. It looks like this (counts and percentages are to the left of each executed line):

> opannotate --source --assembly ./a.out > report.as.handcoded.1
> cat report.as.handcoded.1

...
...
...
           :         __asm__ (
 2  0.0069 : 804918a:       mov    -0x50(%ebp),%ecx
 4  0.0137 : 804918d:       mov    -0x54(%ebp),%ebx
           : 8049190:       mov    -0x4c(%ebp),%eax
12  0.0412 : 8049193:       cmp    %eax,%ecx
10  0.0344 : 8049195:       cmovbe %ecx,%eax
 8  0.0275 : 8049198:       cmp    %eax,%ebx
11  0.0378 : 804919a:       cmovbe %ebx,%eax
16  0.0550 : 804919d:       mov    %eax,-0x4c(%ebp)
           :                   "cmp     %0, %2\n\t"
           :                   "cmovbe  %2, %0\n\t"
           :                  : "+r"(a) :
           :                    "%r"(b), "r"(c)
           :                  );
           :          return a;
 ...
 ...
 ...

夜声 2024-08-25 06:09:04

You can do better than gprof. Consider stackshots. You can do it using pstack, lsstack (if you can get it), or by manually pausing under the debugger. Here's a short intro to the technique.

撕心裂肺的伤痛 2024-08-25 06:09:04

You can run the ruby interpreter itself through the profiler. If that is too much, write a small C program that loads the shared library and calls its exported functions. Then profile that C program. It saves you from maintaining two versions of the library.
