How can you measure the amount of time a function will take to execute?

Posted 2024-07-04 23:41:08 · 95 words · 11 views · 0 comments

How can you measure the amount of time a function will take to execute?

This is a relatively short function and the execution time would probably be in the millisecond range.

This particular question relates to an embedded system, programmed in C or C++.

Comments (12)

转身以后 2024-07-11 23:41:08

There are three potential solutions:

Hardware Solution:

Use a free output pin on the processor and hook an oscilloscope or logic analyzer to the pin. Initialize the pin to a low state; just before calling the function you want to measure, assert the pin high, and just after returning from the function, deassert it.


    *io_pin = 1;
    myfunc();
    *io_pin = 0;

Bookworm solution:

If the function is fairly small, and you can manage the disassembled code, you can crack open the processor architecture databook and count the cycles it will take the processor to execute every instruction. This will give you the number of cycles required.
Time = # cycles / processor clock frequency

This is easier to do for smaller functions, or for code written in assembler (for a PIC microcontroller, for example).
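As a sanity check on the formula, here is the arithmetic as a tiny helper; the cycle count and clock rate below are invented purely for illustration:

```c
/* time = cycles / f_clk.  For example, 120 cycles counted from the
 * disassembly, on a 20 MHz core clock, is 120 / 20e6 = 6 microseconds.
 * (Both numbers are made up for the example.) */
static double cycles_to_seconds(double cycles, double clock_hz) {
    return cycles / clock_hz;
}
```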

Timestamp counter solution:

Some processors have a timestamp counter which increments at a rapid rate (every few processor clock ticks). Simply read the timestamp before and after the function.
This will give you the elapsed time, but beware that you might have to deal with the counter rollover.
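One convenient property: if you keep the reads in an unsigned type of the same width as the counter, plain unsigned subtraction handles a single rollover automatically. A sketch, with a hypothetical `read_tsc()` standing in for the platform's real counter read (simulated here so the wraparound arithmetic can be demonstrated):

```c
#include <stdint.h>

/* Simulated 32-bit free-running counter; on real hardware read_tsc()
 * would be a register read (e.g. a cycle counter). */
static uint32_t fake_counter;
static uint32_t read_tsc(void) { return fake_counter; }

/* Unsigned subtraction is modulo 2^32, so end - start is correct even
 * when the counter wrapped once between the two reads. */
static uint32_t elapsed_ticks(uint32_t start, uint32_t end) {
    return end - start;
}
```

This only covers a single wrap between the two reads; for measurements longer than one full counter period you still need an overflow interrupt and a carry variable.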

哑剧 2024-07-11 23:41:08

The best way to do that on an embedded system is to set an external hardware pin when you enter the function and clear it when you leave the function. This is done preferably with a little assembly instruction so you don't skew your results too much.

Edit: One of the benefits is that you can do it in your actual application and you don't need any special test code. External debug pins like that are (should be!) standard practice for every embedded system.
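The usual shape of those debug-pin macros looks something like the sketch below. `GPIO_OUT` stands in for a real memory-mapped output register (on actual hardware it would be something like `*(volatile uint32_t *)0x40020014`); here it is an ordinary variable so the logic can be exercised off-target, and the pin number is an assumption:

```c
#include <stdint.h>

static volatile uint32_t GPIO_OUT;            /* stand-in for the real register */
#define DEBUG_PIN_MASK    (1u << 5)           /* assumed free pin */
#define DEBUG_PIN_SET()   (GPIO_OUT |=  DEBUG_PIN_MASK)
#define DEBUG_PIN_CLEAR() (GPIO_OUT &= ~DEBUG_PIN_MASK)

static void function_under_test(void) { /* work being measured */ }

static void timed_call(void) {
    DEBUG_PIN_SET();        /* rising edge: scope/analyzer trigger */
    function_under_test();
    DEBUG_PIN_CLEAR();      /* falling edge ends the pulse */
}
```

The pulse width on the scope is then the function's execution time, including the (small, constant) cost of the two register writes.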

深居我梦 2024-07-11 23:41:08
start_time = timer
function()
exec_time = timer - start_time
花间憩 2024-07-11 23:41:08

I repeat the function call a lot of times (millions) but also employ the following method to discount the loop overhead:

start = getTicks();

repeat n times {
    myFunction();
    myFunction();
}

lap = getTicks();

repeat n times {
    myFunction();
}

finish = getTicks();

// overhead + function + function
elapsed1 = lap - start;

// overhead + function
elapsed2 = finish - lap;

// overhead + function + function - overhead - function = function
ntimes = elapsed1 - elapsed2;

once = ntimes / n; // Average time it took for one function call, sans loop overhead

Instead of calling function() twice in the first loop and once in the second loop, you could just call it once in the first loop and don't call it at all (i.e. empty loop) in the second, however the empty loop could be optimized out by the compiler, giving you negative timing results :)
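A minimal concrete version of the harness above, using the standard `clock()` as the tick source (any monotonic counter works) and a placeholder workload, both assumptions for the sketch:

```c
#include <time.h>

/* Placeholder workload; volatile sink keeps the loop from being
 * optimized away. */
static volatile long sink;
static void my_function(void) {
    for (int i = 0; i < 1000; i++) sink += i;
}

/* Time one call, cancelling loop overhead:
 * (overhead + 2f) - (overhead + f) = f */
static double time_one_call(long n) {
    clock_t start = clock();
    for (long i = 0; i < n; i++) { my_function(); my_function(); }
    clock_t lap = clock();
    for (long i = 0; i < n; i++) { my_function(); }
    clock_t finish = clock();

    double ntimes = (double)(lap - start) - (double)(finish - lap);
    return ntimes / (double)n / CLOCKS_PER_SEC;  /* seconds per call */
}
```

Note the result can come out slightly negative for very cheap functions, since it is the difference of two noisy measurements.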

世态炎凉 2024-07-11 23:41:08

If you're using Linux, you can time a program's runtime by typing in the command line:

time [function_name]

If you run only the function in main() (assuming C++), the rest of the app's time should be negligible.

空城之時有危險 2024-07-11 23:41:08

Invoke it in a loop with a ton of invocations, then divide by the number of invocations to get the average time.

so:

// begin timing
for (int i = 0; i < 10000; i++) {
    invokeFunction();
}
// end time
// divide by 10000 to get actual time.
浅笑依然 2024-07-11 23:41:08

Windows XP/NT Embedded or Windows CE/Mobile

You can use QueryPerformanceCounter() to get the value of a VERY FAST counter before and after your function. Then you subtract those 64-bit values and get a delta in "ticks". Using QueryPerformanceFrequency() you can convert the delta ticks to an actual time unit. You can refer to the MSDN documentation on those WIN32 calls.

Other embedded systems

Without an operating system, or with only a basic OS, you will have to:

  • program one of the internal CPU timers to run and count freely.
  • configure it to generate an interrupt when the timer overflows, and in this interrupt routine increment a "carry" variable (this is so you can actually measure time longer than the resolution of the chosen timer).
  • before your function, save BOTH the "carry" value and the value of the CPU register holding the running ticks for the counting timer you configured.
  • do the same just after your function.
  • subtract them to get the delta in counter ticks.
  • from there it is just a matter of knowing how long one tick means on your CPU/hardware, given the external clock and the prescaler (divider) you configured while setting up your timer. You multiply that "tick length" by the "delta ticks" you just got.

VERY IMPORTANT Do not forget to disable interrupts before and restore them after reading those timer values (both the carry and the register value), otherwise you risk saving inconsistent values.

NOTES

  • This is very fast because it is only a few assembly instructions to disable interrupts, save two integer values and re-enable interrupts. The actual subtraction and conversion to real time units occurs OUTSIDE the zone of time measurement, that is, AFTER your function.
  • You may wish to put that code into a function so you can reuse it, but the call may slow things a bit because of pushing registers and parameters to the stack and popping them again. In an embedded system this may be significant. It may be better to use macros in C, or to write your own assembly routine that saves/restores only the relevant registers.
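The carry-plus-timer bookkeeping from the steps above can be sketched like this. The 16-bit timer register is simulated in software here so the rollover arithmetic can be checked off-target; on a real part `timer_reg` would be a hardware register and `carry` would be incremented from the overflow ISR:

```c
#include <stdint.h>

static uint16_t timer_reg;   /* free-running 16-bit up-counter (simulated) */
static uint32_t carry;       /* overflow count, bumped by the overflow ISR */

/* Combine carry and timer into one monotonic 48-bit tick value.
 * On hardware, disable interrupts around the two reads so the ISR
 * cannot bump `carry` between them. */
static uint64_t read_ticks(void) {
    return ((uint64_t)carry << 16) | timer_reg;
}
```

Taking `read_ticks()` before and after the function and subtracting gives the delta in ticks even across timer overflows, exactly as the bullet list describes.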
︶ ̄淡然 2024-07-11 23:41:08

In OS X terminal (and probably Unix, too), use "time":

time python function.py
花开浅夏 2024-07-11 23:41:08

I always implement an interrupt driven ticker routine. This then updates a counter that counts the number of milliseconds since start up. This counter is then accessed with a GetTickCount() function.

Example:

#define TICK_INTERVAL 1    // milliseconds between ticker interrupts
static unsigned long tickCounter;

interrupt ticker (void)  
{
    tickCounter += TICK_INTERVAL;
    ...
}

unsigned long GetTickCount(void)
{
    return tickCounter;
}

In your code you would time the code as follows:

void function(void)
{
    unsigned long start = GetTickCount();

    /* do something ... */

    printf("Time is %lu", GetTickCount() - start);
}
云淡月浅 2024-07-11 23:41:08

Depends on your embedded platform and what type of timing you are looking for. For embedded Linux, there are several ways you can accomplish this. If you wish to measure the amount of CPU time used by your function, you can do the following:

#include <time.h>
#include <stdio.h>
#include <stdlib.h>

#define SEC_TO_NSEC(s) ((s) * 1000LL * 1000 * 1000)

int work_function(int c) {
    // do some work here
    int i, j;
    int foo = 0;
    for (i = 0; i < 1000; i++) {
        for (j = 0; j < 1000; j++) {
            foo ^= i + j;
        }
    }
    return foo;
}

int main(int argc, char *argv[]) {
    struct timespec pre;
    struct timespec post;
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &pre);
    work_function(0);
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &post);

    printf("time %lld\n",
        (long long)((SEC_TO_NSEC(post.tv_sec) + post.tv_nsec) -
                    (SEC_TO_NSEC(pre.tv_sec) + pre.tv_nsec)));
    return 0;
}

You will need to link this with the realtime library, just use the following to compile your code:

gcc -o test test.c -lrt

You may also want to read the man page for clock_gettime; there are some issues with running this code on SMP-based systems that could invalidate your testing. You could use something like sched_setaffinity() or the command line cpuset to force the code onto only one core.

If you are looking to measure user and system time, then you could use times(NULL), which returns something like jiffies. Or you can change the parameter for clock_gettime() from CLOCK_THREAD_CPUTIME_ID to CLOCK_MONOTONIC... but be careful of wrap-around with CLOCK_MONOTONIC.

For other platforms, you are on your own.

Drew

Smile简单爱 2024-07-11 23:41:08

If the code is .NET, use the Stopwatch class (.NET 2.0+), NOT DateTime.Now. DateTime.Now isn't updated accurately enough and will give you crazy results.

亢潮 2024-07-11 23:41:08

If you're looking for sub-millisecond resolution, try one of these timing methods. They'll all get you resolution in at least the tens or hundreds of microseconds:

If it's embedded Linux, look at Linux timers:

http://linux.die.net/man/3/clock_gettime

For embedded Java, look at nanoTime(), though I'm not sure this is in the embedded edition:

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#nanoTime()

If you want to get at the hardware counters, try PAPI:

http://icl.cs.utk.edu/papi/

Otherwise you can always go to assembler. You could look at the PAPI source for your architecture if you need some help with this.
