I have a couple of variants of a program whose performance I want to compare. Both perform essentially the same task.
One does everything in C and in memory. The other calls an external utility and does file IO.
How do I reliably compare them?
1) Getting "time on CPU" using "time" favors the second variant, which calls system() and does IO. Even if I add "system" time to "user" time, it still won't count the time spent blocked in wait().
2) I can't just clock them, because they run on a server and can be pushed off the CPU at any time. Averaging across thousands of experiments is a soft option, since I have no idea how my server is utilized: it's a VM on a cluster, so it's kind of complicated.
3) Profilers do not help, since they give me the time spent in my own code, which again favors the version that calls system().
I need to add up all the CPU time that these programs consume, including user, kernel, IO, and children's time, recursively.
I expected this to be a common problem, but I still can't seem to find a solution.
(Solved with times(); see below. Thanks, everybody.)
Comments (5)
If I've understood correctly, typing "time myapplication" on a bash command line is not what you are looking for.
If you want accuracy, you must use a profiler. You have the source, yes?
Try something like OProfile or Valgrind, or take a look at this for a more extended list.
If you don't have the source, honestly, I don't know.
/usr/bin/time (not the "time" built into bash) can give some interesting stats.
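If you would rather read comparable figures from inside a C program, getrusage() with RUSAGE_CHILDREN is one option. The following is only a rough sketch: the helper names children_usage_after and timeval_seconds are made up for illustration, and it assumes a POSIX system where system() waits for the command it starts.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper (name invented for this sketch): run `cmd` through
 * system(), then fill *out with the resource usage accumulated so far by
 * all waited-for children of this process. The figures are cumulative,
 * so take a reading before and after if you need a per-command delta.
 * Returns 0 on success, -1 on failure. */
int children_usage_after(const char *cmd, struct rusage *out)
{
    if (system(cmd) == -1)
        return -1;
    return getrusage(RUSAGE_CHILDREN, out);
}

/* Hypothetical helper: convert a struct timeval to seconds. */
double timeval_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}
```

After a successful call, timeval_seconds(ru.ru_utime) and timeval_seconds(ru.ru_stime) give the children's user and system CPU time. Some systems fill in further fields of struct rusage, but only these two are required by POSIX.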
Run them a thousand times, measure actual time taken, then average the results. That should smooth out any variances due to other applications running on your server.
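A minimal sketch of that averaging in C, in case it helps. The function name mean_wall_seconds is hypothetical, and the sketch assumes a POSIX system with clock_gettime() and CLOCK_MONOTONIC. Note that it measures elapsed wall-clock time, so it includes any time the process spends descheduled.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper (name invented for this sketch): run `cmd` through
 * system() `runs` times and return the mean wall-clock seconds per run.
 * Returns -1.0 if runs <= 0 or if a call fails. */
double mean_wall_seconds(const char *cmd, int runs)
{
    struct timespec start, end;
    double total = 0.0;

    if (runs <= 0)
        return -1.0;

    for (int i = 0; i < runs; i++) {
        if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
            return -1.0;
        if (system(cmd) == -1)
            return -1.0;
        if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
            return -1.0;

        /* Elapsed time for this run, in seconds. */
        total += (double)(end.tv_sec - start.tv_sec)
               + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    }
    return total / runs;
}
```

For example, mean_wall_seconds("./variant_a", 1000) would time a thousand runs, where ./variant_a is a placeholder for one of your programs.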
I seem to have found it at last.
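NAME
times - get process times
SYNOPSIS
#include <sys/times.h>
clock_t times(struct tms *buf);
DESCRIPTION
times() stores the current process times in the struct tms that buf
points to. The struct tms is as defined in <sys/times.h>:
struct tms {
    clock_t tms_utime;  /* user time */
    clock_t tms_stime;  /* system time */
    clock_t tms_cutime; /* user time of children */
    clock_t tms_cstime; /* system time of children */
};
The children's times are a recursive sum of all waited-for children.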
I wonder why it hasn't been made a standard CLI utility yet. Or maybe I'm just ignorant.
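For anyone landing here later, this is roughly how times() could be used to get the total. It is a sketch under assumptions, not a definitive implementation: the helper names total_cpu_seconds and cpu_seconds_for_command are invented for illustration, and it relies on system() waiting for the command, so that the child's time is already folded into tms_cutime and tms_cstime when the second reading is taken.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/times.h>
#include <unistd.h>

/* Hypothetical helper (name invented for this sketch): total CPU seconds
 * charged so far to this process and its waited-for children, user plus
 * system. Returns -1.0 on failure. */
double total_cpu_seconds(void)
{
    struct tms t;
    long ticks = sysconf(_SC_CLK_TCK);   /* clock ticks per second */

    if (ticks <= 0 || times(&t) == (clock_t)-1)
        return -1.0;

    clock_t sum = t.tms_utime + t.tms_stime + t.tms_cutime + t.tms_cstime;
    return (double)sum / (double)ticks;
}

/* Hypothetical helper: CPU seconds consumed while running `cmd` through
 * system(), including the command's own waited-for descendants. The small
 * amount of CPU this process itself uses between the two readings is
 * included as well. Returns -1.0 on failure. */
double cpu_seconds_for_command(const char *cmd)
{
    double before = total_cpu_seconds();
    if (before < 0.0 || system(cmd) == -1)
        return -1.0;

    double after = total_cpu_seconds();
    return after < 0.0 ? -1.0 : after - before;
}
```

The resolution is one clock tick (1/sysconf(_SC_CLK_TCK) seconds), so very short commands may report 0. Time spent blocked without using the CPU is not counted, which is the point when comparing CPU cost on a busy server.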
I'd probably lean towards adding "time -o somefile" to the front of the system command, and then adding that figure to the time reported for your main program to get a total. If I had to do this lots of times, I'd find a way to take the two time outputs and add them up on screen (using awk, shell, perl, or something similar).