How to measure operation time with microsecond precision

Posted on 2024-09-01 09:36:14

I want to measure the performance of a function with microsecond precision on Windows.

Windows itself has millisecond granularity, so how can I achieve this?

I tried the following sample, but I am not getting correct results.

LARGE_INTEGER ticksPerSecond = {0};
LARGE_INTEGER tick_1 = {0};
LARGE_INTEGER tick_2 = {0};
double uSec = 1000000;

// Get the counter frequency (ticks per second)
QueryPerformanceFrequency(&ticksPerSecond);

// Calculate the frequency in ticks per microsecond
double uFreq = ticksPerSecond.QuadPart/uSec;

// Get the counter before the operation starts
QueryPerformanceCounter(&tick_1);

// The operation itself
Sleep(10);

// Get the counter after the operation has finished
QueryPerformanceCounter(&tick_2);

// And now the operation time in microseconds
double diff = (tick_2.QuadPart/uFreq) - (tick_1.QuadPart/uFreq);

Comments (8)

夜空下最亮的亮点 2024-09-08 09:36:14

Run the operation in a loop a million times or so and divide the result by that number. That way you'll get the average execution time over that many executions. Timing one (or even a hundred) executions of a very fast operation is very unreliable, due to multitasking and whatnot.
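A minimal sketch of that averaging loop, in case it helps. `average_microseconds` is a hypothetical helper name, and `std::chrono::steady_clock` is swapped in for QueryPerformanceCounter only so the sketch stays portable; neither comes from the answer above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `op` `iterations` times and returns the mean
// time per call in microseconds.
template <typename Op>
double average_microseconds(Op&& op, std::size_t iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;  // monotonic clock

    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        op();
    }
    const auto stop = clock::now();

    // Convert the total to fractional microseconds, then divide by the count.
    const std::chrono::duration<double, std::micro> total = stop - start;
    return total.count() / static_cast<double>(iterations);
}
```

It could be called as `average_microseconds([] { my_function(); }, 1000000)`, where `my_function` is a placeholder for the code under test. An optimizing compiler may remove an operation whose result is never used, so the result should be consumed somewhere (for example, written to a `volatile` variable).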

橘虞初梦 2024-09-08 09:36:14
  • compile it
  • look at the assembler output
  • count the number of each instruction in your function
  • apply the cycles per instruction on your target processor
  • end up with a cycle count
  • divide by the clock speed you are running at to get a time
  • apply arbitrary scaling factors to account for cache misses and branch mis-predictions lol

(man I am so going to get down-voted for this answer)

不美如何 2024-09-08 09:36:14

No, you are probably getting an accurate result; QueryPerformanceCounter() works well for timing short intervals. What's wrong is your expectation of the accuracy of Sleep(). It has a resolution of 1 millisecond, but its accuracy is far worse: no better than about 15.625 milliseconds on most Windows machines.

To get it anywhere close to 1 millisecond, you'll have to call timeBeginPeriod(1) first. That will probably improve the match, ignoring the jitter you'll get from Windows being a multitasking operating system.
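A rough sketch of that experiment, under some assumptions: `measure_sleep_microseconds` is a made-up name, `std::chrono::steady_clock` is used for the timing instead of QueryPerformanceCounter to keep the code short, and the non-Windows branch exists only so the sketch also builds elsewhere. The `#pragma comment` line is the MSVC-specific way to link winmm; other toolchains need the library passed to the linker.

```cpp
#include <chrono>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>               // timeBeginPeriod / timeEndPeriod
#pragma comment(lib, "winmm.lib")   // MSVC-specific: link the multimedia timer library
#else
#include <thread>                   // fallback so the sketch builds off Windows
#endif

// Sleeps for `ms` milliseconds and returns the measured duration in microseconds.
std::int64_t measure_sleep_microseconds(unsigned ms) {
#ifdef _WIN32
    // Request 1 ms timer resolution for the duration of the measurement.
    const bool raised = (timeBeginPeriod(1) == TIMERR_NOERROR);
#endif

    const auto start = std::chrono::steady_clock::now();
#ifdef _WIN32
    Sleep(ms);
#else
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
#endif
    const auto stop = std::chrono::steady_clock::now();

#ifdef _WIN32
    // Every successful timeBeginPeriod call must be matched by timeEndPeriod.
    if (raised) {
        timeEndPeriod(1);
    }
#endif

    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Comparing `measure_sleep_microseconds(10)` with and without the timeBeginPeriod(1) call should show how much of the mismatch comes from Sleep() rather than from the counter.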

夜清冷一曲。 2024-09-08 09:36:14

If you're doing this for offline profiling, a very simple way is to run the function 1000 times, measure to the closest millisecond and divide by 1000.

梦毁影碎の 2024-09-08 09:36:14

To get finer resolution than 1 ms, you will have to consult your OS documentation. There may be APIs that provide a timer with microsecond resolution. If so, run your application many times and take the average.

不回头走下去 2024-09-08 09:36:14

I like Matti Virkkunen's answer. Check the time, call the function a large number of times, check the time when you finish, and divide by the number of times you called the function. He did mention you might be off due to OS interrupts. You might vary the number of calls and see a difference. Can you raise the priority of the process? Can you arrange for all the calls to fit within a single OS time slice?

Since you don't know when the OS might swap you out, you can put this all inside a larger loop to do the whole measurement a large number of times, and save the smallest number as that is the one that had the fewest OS interrupts. This still may be greater than the actual time for the function to execute because it may still contain some OS interrupts.
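The "repeat the whole measurement and keep the smallest number" idea could be sketched like this. `best_average_microseconds` and its parameters are hypothetical names, and `std::chrono::steady_clock` again stands in for QueryPerformanceCounter so the code is not tied to one platform.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <limits>

// Hypothetical helper: times `batches` batches of `calls_per_batch` calls each
// and returns the smallest per-call average (in microseconds) seen in any batch.
template <typename Op>
double best_average_microseconds(Op&& op, std::size_t batches,
                                 std::size_t calls_per_batch) {
    assert(batches > 0 && calls_per_batch > 0);
    using clock = std::chrono::steady_clock;

    double best = std::numeric_limits<double>::infinity();
    for (std::size_t b = 0; b < batches; ++b) {
        const auto start = clock::now();
        for (std::size_t i = 0; i < calls_per_batch; ++i) {
            op();
        }
        const auto stop = clock::now();

        const std::chrono::duration<double, std::micro> elapsed = stop - start;
        // The batch least disturbed by the OS yields the smallest average.
        best = std::min(best, elapsed.count() / static_cast<double>(calls_per_batch));
    }
    return best;
}
```

As the answer says, the minimum is still an upper bound on the true cost, since even the best batch may include some interruptions.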

表情可笑 2024-09-08 09:36:14

Sanjeet,

It looks (to me) like you're doing this exactly right. QueryPerformanceCounter is a perfectly good way to measure short periods of time with a high degree of precision. If you're not seeing the result you expected, it's most likely because the sleep isn't sleeping for the amount of time you expected it to! However, it is likely being measured correctly.

I want to go back to your original question about how to measure time on Windows with microsecond precision. As you already know, the high-performance counter (i.e. QueryPerformanceCounter) "ticks" at the frequency reported by QueryPerformanceFrequency. That means that you can measure time with a precision equal to:

1/frequency seconds

On my machine, QueryPerformanceFrequency reports 2337910 (counts/sec). That means my computer's QPC can measure with a precision of 4.277e-7 seconds, or 0.427732 microseconds, which is the smallest unit of time I can measure. This, of course, gives you the precision that you originally asked for :) Your machine's frequency should be similar, but you can always do the math and check it.
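The arithmetic can be written down as a small sketch. Both function names are hypothetical, and the inputs are plain 64-bit integers rather than LARGE_INTEGER so the code does not depend on windows.h; on Windows they would come from the QuadPart members filled in by QueryPerformanceCounter and QueryPerformanceFrequency.

```cpp
#include <cassert>
#include <cstdint>

// Duration of one counter tick in microseconds, for a counter running at
// `ticks_per_second` (the value QueryPerformanceFrequency would report).
double tick_resolution_microseconds(std::int64_t ticks_per_second) {
    assert(ticks_per_second > 0);
    return 1e6 / static_cast<double>(ticks_per_second);
}

// Elapsed whole microseconds between two counter readings.
std::int64_t elapsed_microseconds(std::int64_t start_ticks,
                                  std::int64_t end_ticks,
                                  std::int64_t ticks_per_second) {
    assert(ticks_per_second > 0);
    // Subtract the raw readings first, so no precision is lost by converting
    // two large counter values to floating point separately.
    const std::int64_t delta = end_ticks - start_ticks;
    // Scale to microseconds before dividing to keep sub-tick precision.
    // Caveat: delta * 1000000 can overflow for very long intervals.
    return delta * 1000000 / ticks_per_second;
}
```

With the frequency quoted above, `tick_resolution_microseconds(2337910)` comes out at roughly 0.4277 microseconds, matching the figure in the answer.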

墨洒年华 2024-09-08 09:36:14

Or you can use gettimeofday() which gives you a timeval struct that is a timestamp (down to µs)
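For reference, a minimal sketch of reading that timestamp, assuming a POSIX environment: gettimeofday() is not part of the Win32 API, so on Windows this only builds with a Unix-like toolchain that supplies `<sys/time.h>`. `wall_clock_microseconds` is a made-up helper name.

```cpp
#include <cstdint>
#include <sys/time.h>   // POSIX header; assumed to be available

// Current wall-clock time in microseconds since the Unix epoch.
std::int64_t wall_clock_microseconds() {
    timeval tv{};
    gettimeofday(&tv, nullptr);  // second argument (timezone) is unused
    return static_cast<std::int64_t>(tv.tv_sec) * 1000000 + tv.tv_usec;
}
```

Taking the difference of two such readings gives an interval in microseconds, but this is a wall-clock time that can jump if the system clock is adjusted, so a monotonic counter is usually the safer choice for timing code.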
