I have a C program that aims to be run in parallel on several processors. I need to be able to record the execution time (which could be anywhere from 1 second to several minutes). I have searched for answers, but they all seem to suggest using the clock() function, which then involves calculating the number of clocks the program took divided by the Clocks_per_second value.
I'm not sure how the Clocks_per_second value is calculated.
In Java, I just take the current time in milliseconds before and after execution.
Is there a similar thing in C? I've had a look, but I can't seem to find a way of getting anything better than a second resolution.
I'm also aware a profiler would be an option, but am looking to implement a timer myself.
Thanks
CLOCKS_PER_SEC is a constant which is declared in <time.h>. To get the CPU time (not the wall time) used by a task within a C application, use:
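For example, something along these lines (the commented section stands for the work being timed):

#include <time.h>

clock_t begin = clock();
/* ... here, do the time-consuming work ... */
clock_t end = clock();

double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;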
Note that this returns the time as a floating point type. This can be more precise than a second (e.g. you measure 4.52 seconds). Precision depends on the architecture; on modern systems you easily get 10 ms or lower, but on older Windows machines (from the Win98 era) it was closer to 60 ms.
clock() is standard C; it works "everywhere". There are system-specific functions, such as getrusage() on Unix-like systems.
Java's System.currentTimeMillis() does not measure the same thing. It is a "wall clock": it can help you measure how much time it took for the program to execute, but it does not tell you how much CPU time was used. On a multitasking system (i.e. all of them), these can be widely different.
If you are using the Unix shell for running, you can use the time command. Doing
time ./a.out
(assuming a.out is the executable) will give you the time taken to run it.
In plain vanilla C:
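A sketch of what such a program could look like, using the standard clock() function (the expensive function is a placeholder name):

#include <stdio.h>
#include <time.h>

/* placeholder for the code being measured */
void my_expensive_function(void) { /* ... */ }

int main(void)
{
    clock_t tic = clock();
    my_expensive_function();
    clock_t toc = clock();

    printf("Elapsed: %f seconds\n", (double)(toc - tic) / CLOCKS_PER_SEC);
    return 0;
}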
You functionally want this:
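A sketch along those lines, assuming a POSIX system and gettimeofday() for microsecond resolution (the timed section is a placeholder):

#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval tv1, tv2;

    gettimeofday(&tv1, NULL);
    /* ... the work you want to time ... */
    gettimeofday(&tv2, NULL);

    double elapsed = (double)(tv2.tv_sec - tv1.tv_sec)
                   + (double)(tv2.tv_usec - tv1.tv_usec) / 1000000.0;
    printf("Total time = %f seconds\n", elapsed);
    return 0;
}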
Note that this measures in microseconds, not just seconds.
(All answers here are lacking if your sysadmin changes the system time, or your timezone has differing winter and summer times. Therefore...)
On Linux use:
clock_gettime(CLOCK_MONOTONIC_RAW, &time_variable);
It's not affected if the system-admin changes the time, or you live in a country with winter-time different from summer-time, etc.
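For example, a sketch of timing a section of code with it (the timed section is a placeholder; older glibc may require linking with -lrt):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC_RAW, &start);
    /* ... the work you want to time ... */
    clock_gettime(CLOCK_MONOTONIC_RAW, &end);

    double elapsed = (double)(end.tv_sec - start.tv_sec)
                   + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    printf("Elapsed: %f seconds\n", elapsed);
    return 0;
}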
man clock_gettime
states:大多数简单程序的计算时间以毫秒为单位。所以,我想,你会发现这很有用。
Most of the simple programs have computation times in milliseconds. So, I suppose, you will find this useful:
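For instance, a sketch that reports the elapsed CPU time in milliseconds using clock() (the measured section is a placeholder):

#include <stdio.h>
#include <time.h>

int main(void)
{
    clock_t start = clock();
    /* ... code to be measured ... */
    clock_t stop = clock();

    double elapsed_ms = (double)(stop - start) * 1000.0 / CLOCKS_PER_SEC;
    printf("Time elapsed: %f ms\n", elapsed_ms);
    return 0;
}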
If you want to compute the runtime of the entire program and you are on a Unix system, run your program using the time command like this
time ./a.out
Thomas Pornin's answer as macros:
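A sketch of what such macros could look like, built on clock() and CLOCKS_PER_SEC (the names TICK and TOCK are illustrative):

#include <stdio.h>
#include <time.h>

#define TICK(X) clock_t X = clock()
#define TOCK(X) printf("time %s: %g sec.\n", #X, \
                       (double)(clock() - (X)) / CLOCKS_PER_SEC)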
Use it like this:
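For example (some_expensive_function is a placeholder):

TICK(TIME_A);
some_expensive_function();
TOCK(TIME_A);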
Output:
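The printed line would look something like this (the number is only a placeholder):

time TIME_A: 0.001234 sec.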
A lot of answers have been suggesting clock() and then CLOCKS_PER_SEC from time.h. This is probably a bad idea, because this is what my /bits/time.h file says:
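The relevant part reads roughly like this (paraphrased from a glibc bits/time.h; the exact wording varies between versions):

/* ISO C: the macro CLOCKS_PER_SEC is the number per second of the
   value returned by the clock function. */
/* POSIX (XSI): the value of CLOCKS_PER_SEC is required to be
   1 million on all XSI-conformant systems, independently of the
   actual clock tick. */
#define CLOCKS_PER_SEC  1000000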
So CLOCKS_PER_SEC might be defined as 1000000, depending on what options you use to compile, and thus it does not seem like a good solution.
ANSI C only specifies second-precision time functions. However, if you are running in a POSIX environment you can use the gettimeofday() function, which provides microsecond resolution of the time passed since the UNIX Epoch.
As a side note, I wouldn't recommend using clock(), since it is badly implemented on many (if not all?) systems and not accurate, besides the fact that it only refers to how long your program has spent on the CPU and not the total lifetime of the program, which, according to your question, is what I assume you would like to measure.
You have to take into account that measuring the time a program took to execute depends a lot on the load the machine has at that specific moment.
Knowing that, the current time in C can be obtained in different ways; an easier one is:
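For instance, one simple possibility is the standard time() function before and after the work, which gives second resolution only:

#include <stdio.h>
#include <time.h>

int main(void)
{
    time_t start = time(NULL);
    /* ... the work you want to time ... */
    time_t end = time(NULL);

    printf("Elapsed: %.0f seconds\n", difftime(end, start));
    return 0;
}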
Hope it helps.
Regards!
I've found that the usual clock(), which everyone recommends here, for some reason deviates wildly from run to run, even for static code without any side effects, like drawing to the screen or reading files. This could be because the CPU changes power consumption modes, the OS gives different priorities, etc...
So the only way to reliably get the same result every time with clock() is to run the measured code in a loop multiple times (for several minutes), taking precautions to prevent the compiler from optimizing it out, e.g. by using random input for each iteration: modern compilers can precompute code without side effects that runs in a loop and move it out of the loop.
After enough samples are collected into an array, one sorts that array and takes the middle element, called the median. The median is better than the average, because it throws away extreme deviations, like, say, an antivirus taking up all the CPU or the OS doing some update.
Here is a simple utility to measure the execution performance of C/C++ code, averaging the values near the median: https://github.com/saniv/gauge
I'm still looking myself for a more robust and faster way to measure code. One could probably try running the code in controlled conditions on bare metal without any OS, but that would give an unrealistic result, because in reality the OS does get involved.
x86 has hardware performance counters, which include the actual number of instructions executed, but they are tricky to access without OS help, hard to interpret, and have their own issues ( http://archive.gamedev.net/archive/reference/articles/article213.html ). Still, they could be helpful for investigating the nature of the bottleneck (data access or the actual computations on that data).
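A rough sketch of that sample-and-take-the-median idea (the measured function and the sample count are placeholders):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SAMPLES 101

/* placeholder for the code being benchmarked */
static void work(void) { /* ... */ }

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    double samples[SAMPLES];

    for (int i = 0; i < SAMPLES; i++) {
        clock_t begin = clock();
        work();
        samples[i] = (double)(clock() - begin) / CLOCKS_PER_SEC;
    }

    qsort(samples, SAMPLES, sizeof samples[0], cmp_double);
    printf("median: %g seconds\n", samples[SAMPLES / 2]);
    return 0;
}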
None of these solutions are working on my system. I can get it working using:
Some might find a different kind of input useful: I was given this method of measuring time as part of a university course on GPGPU programming with NVidia CUDA (course description). It combines methods seen in earlier posts, and I simply post it because the requirements give it credibility:
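A sketch of that kind of helper, assuming gettimeofday() and a helper name now_usec chosen here for illustration; it returns the current wall-clock time in microseconds:

#include <stdio.h>
#include <sys/time.h>

/* illustrative helper: current wall-clock time in microseconds */
static long long now_usec(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000000LL + tv.tv_usec;
}

int main(void)
{
    long long start = now_usec();
    /* ... the work you want to time ... */
    long long stop = now_usec();

    printf("Elapsed: %lld microseconds\n", stop - start);
    return 0;
}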
I suppose you could multiply with e.g. 1.0 / 1000.0 to get the unit of measurement that suits your needs.
If your program uses the GPU or if it uses sleep(), then the clock() diff gives you a smaller value than the actual duration. This is because clock() returns the number of CPU clock ticks. It can only be used to calculate CPU usage time (CPU load), but not the execution duration. We should not use clock() to calculate duration; we should still use gettimeofday() or clock_gettime() for duration in C.
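A small demonstration of the difference, assuming a POSIX system for sleep() and gettimeofday(): clock() barely advances across the sleep, while the wall-clock difference is about two seconds:

#include <stdio.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    struct timeval tv1, tv2;
    clock_t c1 = clock();
    gettimeofday(&tv1, NULL);

    sleep(2);               /* consumes almost no CPU time */

    clock_t c2 = clock();
    gettimeofday(&tv2, NULL);

    printf("clock():        %f s (CPU time)\n",
           (double)(c2 - c1) / CLOCKS_PER_SEC);
    printf("gettimeofday(): %f s (wall time)\n",
           (double)(tv2.tv_sec - tv1.tv_sec)
           + (double)(tv2.tv_usec - tv1.tv_usec) / 1000000.0);
    return 0;
}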
The perf tool is more accurate for collecting statistics about and profiling the running program. Use perf stat to show all information related to the program being executed.
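For example, assuming the compiled binary is a.out and perf is installed:
perf stat ./a.out
This prints the elapsed time along with counters such as task-clock, CPU cycles and instructions.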
As simple as possible, by using a function-like macro:
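For instance, a sketch of such a macro (MEASURE_SECONDS is an illustrative name) that wraps clock() around a statement and prints the CPU time it took:

#include <stdio.h>
#include <time.h>

/* illustrative macro: runs CODE once and prints the CPU time it took */
#define MEASURE_SECONDS(CODE) do {                                  \
        clock_t _begin = clock();                                   \
        CODE;                                                       \
        printf("%s: %f s\n", #CODE,                                 \
               (double)(clock() - _begin) / CLOCKS_PER_SEC);        \
    } while (0)

It could then be used as MEASURE_SECONDS(my_function()); for any statement you want to time.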
Comparison of execution time of bubble sort and selection sort
I have a program which compares the execution time of bubble sort and selection sort.
To find out the execution time of a block of code, compute the time before and after the block.
Example code:
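A condensed sketch of that kind of program; the original compares bubble sort and selection sort, while here only bubble sort is timed and the array contents are illustrative:

#include <stdio.h>
#include <time.h>

static void bubble_sort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < n - 1 - i; j++)
            if (a[j] > a[j + 1]) {
                int tmp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = tmp;
            }
}

int main(void)
{
    int data[] = { 9, 4, 7, 1, 8, 3, 6, 2, 5, 0 };
    int n = sizeof data / sizeof data[0];

    clock_t before = clock();   /* time before the block */
    bubble_sort(data, n);
    clock_t after = clock();    /* time after the block */

    printf("bubble sort took %f seconds\n",
           (double)(after - before) / CLOCKS_PER_SEC);
    return 0;
}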