Calculating the speed of routines?
What would be the best and most accurate way to determine how long it took to process a routine, such as a procedure or function?
I ask because I am currently trying to optimize a few functions in my application, and when I test the changes it is hard to tell just by looking whether there was any improvement at all. If I could get an accurate or near-accurate time for how long a routine took to process, I would have a much clearer idea of how much difference, if any, my changes to the code have made.
I considered using GetTickCount, but I am unsure whether this would be anywhere near accurate.
It would be useful to have a reusable function/procedure to calculate the time of a routine, and use it something like this:
// < prepare for calculation of code
...
ExecuteSomeCode; // < code to test
...
// < stop calculating code and return time it took to process
I look forward to hearing some suggestions.
Thanks.
Craig.
Comments (8)
To my knowledge, the most accurate method is to use QueryPerformanceCounter together with QueryPerformanceFrequency:
code:
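The code from this answer did not survive on this page. As a stand-in, here is a minimal sketch of how the two Windows API calls are usually combined: QueryPerformanceFrequency gives the counter's ticks per second, and two QueryPerformanceCounter readings bracket the code under test. ExecuteSomeCode, Sink and TicksToMs are hypothetical names used only for illustration, not part of the original answer.

```pascal
program QpcTiming;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;  // Winapi.Windows and System.SysUtils in XE2 and later

var
  Sink: Int64;  // global so the work below has a visible result

// Hypothetical stand-in for the routine being measured.
procedure ExecuteSomeCode;
var
  I: Integer;
begin
  Sink := 0;
  for I := 1 to 10000000 do
    Inc(Sink, I);
end;

// Converts the difference between two counter readings to milliseconds.
function TicksToMs(StartTicks, StopTicks, Frequency: Int64): Double;
begin
  Result := (StopTicks - StartTicks) * 1000.0 / Frequency;
end;

var
  Freq, StartTicks, StopTicks: Int64;
begin
  // Frequency is the number of counter ticks per second.
  if not QueryPerformanceFrequency(Freq) then
  begin
    Writeln('High-resolution counter not available');
    Halt(1);
  end;

  QueryPerformanceCounter(StartTicks);  // prepare for calculation
  ExecuteSomeCode;                      // code to test
  QueryPerformanceCounter(StopTicks);   // stop and compute the elapsed time

  Writeln(Format('Elapsed: %.3f ms', [TicksToMs(StartTicks, StopTicks, Freq)]));
end.
```

The conversion is kept in its own function so the same bracketing pattern can be reused around any routine.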
Try Eric Grange's Sampling Profiler.
From Delphi 6 upwards you can use the x86 Timestamp counter.
This counts CPU cycles; on a 1 GHz processor, each count takes one nanosecond.
Can't get more accurate than that.
On x64 the following code is more accurate, because it does not suffer from the delay of CPUID.
Use the above code to get the timestamp before and after executing your code.
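The code blocks this answer refers to are missing from this page. The sketch below is one plausible reconstruction, not the poster's original code. It assumes the 32-bit variant serializes with CPUID before RDTSC and the x64 variant uses RDTSCP instead, which matches the remark about avoiding the CPUID delay. It also assumes a compiler whose inline assembler accepts these mnemonics. ExecuteSomeCode and Sink are hypothetical placeholders.

```pascal
program RdtscTiming;
{$APPTYPE CONSOLE}

var
  Sink: Int64;  // global so the work below has a visible result

// Hypothetical stand-in for the routine being measured.
procedure ExecuteSomeCode;
var
  I: Integer;
begin
  Sink := 0;
  for I := 1 to 10000000 do
    Inc(Sink, I);
end;

// Reads the CPU time stamp counter.
function RDTSC: Int64;
{$IFDEF CPUX64}
asm
  RDTSCP            // waits for earlier instructions to finish; also overwrites ECX
  SHL RDX, 32       // move the high half into the upper 32 bits
  OR  RAX, RDX      // combine EDX:EAX into the 64-bit result in RAX
end;
{$ELSE}
asm
  PUSH EBX          // CPUID overwrites EBX, which Delphi expects to be preserved
  XOR  EAX, EAX
  CPUID             // serializing instruction: keeps RDTSC from running early
  POP  EBX
  RDTSC             // counter in EDX:EAX, which is how an Int64 result is returned
end;
{$ENDIF}

var
  StartCycles, StopCycles: Int64;
begin
  StartCycles := RDTSC;   // timestamp before
  ExecuteSomeCode;        // code to test
  StopCycles := RDTSC;    // timestamp after
  Writeln('Cycles: ', StopCycles - StartCycles);
end.
```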
Most accurate method possible and easy as pie.
Note that you need to run a test at least 10 times to get a good result. On the first pass the cache will be cold, and random hard disk reads and interrupts can throw off your timings.
Because this thing is so accurate, it can give you the wrong idea if you only time the first run.
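A rough sketch of the repeated-run advice, using QueryPerformanceCounter only to keep the example free of assembler. It does one untimed warm-up call and then keeps the fastest of the timed runs. Taking the minimum is an assumption on my part, chosen because interrupts and disk activity can only make a run slower. TTestProc, FastestRunTicks, ExecuteSomeCode and Sink are hypothetical names.

```pascal
program RepeatedTiming;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;  // Winapi.Windows and System.SysUtils in XE2 and later

type
  TTestProc = procedure;  // hypothetical: a parameterless routine under test

var
  Sink: Int64;  // global so the work below has a visible result

// Hypothetical stand-in for the routine being measured.
procedure ExecuteSomeCode;
var
  I: Integer;
begin
  Sink := 0;
  for I := 1 to 10000000 do
    Inc(Sink, I);
end;

// Calls Proc once untimed to warm the cache, then times it Runs times
// and returns the fastest run in performance-counter ticks.
function FastestRunTicks(Proc: TTestProc; Runs: Integer): Int64;
var
  I: Integer;
  T0, T1: Int64;
begin
  Proc;  // warm-up pass, deliberately not timed
  Result := High(Int64);
  for I := 1 to Runs do
  begin
    QueryPerformanceCounter(T0);
    Proc;
    QueryPerformanceCounter(T1);
    if T1 - T0 < Result then
      Result := T1 - T0;
  end;
end;

var
  Freq: Int64;
begin
  QueryPerformanceFrequency(Freq);
  Writeln(Format('Fastest of 10 runs: %.3f ms',
    [FastestRunTicks(ExecuteSomeCode, 10) * 1000.0 / Freq]));
end.
```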
Why you should not use QueryPerformanceCounter()
QueryPerformanceCounter() measures wall-clock time, so it reports the same elapsed time however fast the CPU is running; in that sense it compensates for CPU throttling. RDTSC, by contrast, will give you the same number of cycles even if your CPU slows down due to overheating or whatnot. So if your CPU starts running hot and needs to throttle down, QueryPerformanceCounter() will say that your routine is taking more time (which is misleading) and RDTSC will say that it takes the same number of cycles (which is accurate). This is what you want, because you're interested in the number of CPU cycles your code uses, not the wall-clock time. Note that this assumes the time stamp counter advances with the core clock; on processors with an invariant TSC it ticks at a constant rate regardless of throttling.
From the latest Intel docs: http://software.intel.com/en-us/articles/measure-code-sections-using-the-enhanced-timer/?wapkw=%28rdtsc%29
When not to use RDTSC
RDTSC is useful for basic timing. If you're timing multithreaded code on a single-CPU machine, RDTSC will work fine. If you have multiple CPUs, the start count may come from one CPU and the end count from another.
So don't use RDTSC to time multithreaded code on a multi-CPU machine. It works fine on a single-CPU machine, and it is also fine for single-threaded code on a multi-CPU machine.
Also remember that RDTSC counts CPU cycles. If there is something that takes time but doesn't use the CPU, like disk I/O or network access, then RDTSC is not a good tool.
But the documentation says RDTSC is not accurate on modern CPUs
RDTSC is not a tool for keeping track of time; it's a tool for keeping track of CPU cycles.
For that it is the only tool that is accurate. Routines that keep track of time are not accurate on modern CPUs because the CPU clock is not absolute like it used to be.
You didn't specify your Delphi version, but Delphi XE has a TStopwatch declared in unit Diagnostics. This will allow you to measure the runtime with reasonable precision.
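A minimal sketch of how TStopwatch is typically used. It assumes the unit is available as Diagnostics (System.Diagnostics in XE2 and later); ExecuteSomeCode and Sink are hypothetical placeholders for the code being measured.

```pascal
program StopwatchTiming;
{$APPTYPE CONSOLE}

uses
  SysUtils, Diagnostics;  // System.SysUtils and System.Diagnostics in XE2 and later

var
  Sink: Int64;  // global so the work below has a visible result

// Hypothetical stand-in for the routine being measured.
procedure ExecuteSomeCode;
var
  I: Integer;
begin
  Sink := 0;
  for I := 1 to 10000000 do
    Inc(Sink, I);
end;

var
  SW: TStopwatch;
begin
  SW := TStopwatch.StartNew;  // creates the stopwatch and starts it
  ExecuteSomeCode;            // code to test
  SW.Stop;

  Writeln('Elapsed: ', SW.ElapsedMilliseconds, ' ms (', SW.ElapsedTicks, ' ticks)');
  // Reports whether the stopwatch is backed by the high-resolution counter.
  Writeln('High resolution: ', TStopwatch.IsHighResolution);
end.
```

TStopwatch is a record, so there is nothing to free afterwards.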
It is natural to think that measuring is how you find out what to optimize, but there's a better way.
If something takes a large enough fraction of time (F) to be worth optimizing, then if you simply pause it at random, F is the probability you will catch it in the act.
Do that several times, and you will see precisely why it's doing it, down to the exact lines of code.
More on that.
Here's an example.
Fix it, and then do an overall measurement to see how much you saved, which should be about F.
Rinse and repeat.
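To make the probability argument above concrete, here is a small worked example. It assumes each random pause is an independent sample, and the values F = 0.3 and n = 10 are made up for illustration.

```latex
% n independent pauses, each landing in the slow code with probability F
\[
  \mathbb{E}[\text{hits}] = nF,
  \qquad
  P(\text{at least one hit}) = 1 - (1 - F)^n
\]
% Example: F = 0.3, n = 10
\[
  \mathbb{E}[\text{hits}] = 10 \times 0.3 = 3,
  \qquad
  P(\text{at least one hit}) = 1 - 0.7^{10} \approx 0.972
\]
```

So a routine that accounts for 30% of the run time is very unlikely to stay hidden across ten pauses.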
Here are some procedures I made to handle checking the duration of a function. I stuck them in a unit I called uTesting and then just throw it into the uses clause during my testing.
Declaration
variables declared
Implementation
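The poster's code under these three headings is missing from this page. The unit below is not that code. It is only a sketch of what such a helper unit could look like, following the same declaration / variables / implementation layout. StartTiming, StopTiming, TimingFreq and TimingStart are hypothetical names, and QueryPerformanceCounter is an assumed choice of timer.

```pascal
unit uTesting;

interface

// Declaration
procedure StartTiming;        // records the starting counter value
function StopTiming: Double;  // milliseconds elapsed since StartTiming

implementation

uses
  Windows;  // Winapi.Windows in XE2 and later

// Variables declared
var
  TimingFreq: Int64;   // counter ticks per second
  TimingStart: Int64;  // counter value captured by StartTiming

// Implementation
procedure StartTiming;
begin
  QueryPerformanceFrequency(TimingFreq);
  QueryPerformanceCounter(TimingStart);
end;

function StopTiming: Double;
var
  StopTicks: Int64;
begin
  QueryPerformanceCounter(StopTicks);
  if TimingFreq > 0 then
    Result := (StopTicks - TimingStart) * 1000.0 / TimingFreq
  else
    Result := 0;  // StartTiming was never called or no counter is available
end;

end.
```

With the unit in the uses clause, the pattern asked for in the question becomes: StartTiming; ExecuteSomeCode; Writeln(StopTiming:0:3, ' ms');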
Use this http://delphi.about.com/od/windowsshellapi/a/delphi-high-performance-timer-tstopwatch.htm
clock_gettime() is the high-resolution solution, which is precise to nanoseconds; you can also use rdtsc, which is precise to the CPU cycle; and lastly you can simply use gettimeofday().