Benchmarking a piece of code independently of CPU performance?
My objective: I want to test the performance of a piece of code (or a function), just as I test that function's correctness in a unit test. Let's say the output of this benchmarking process is a "function performance index" (FPI) that is "portable".
My problem: we usually benchmark code by using a timer to measure the time elapsed while it executes, and that method depends on the hardware, the OS, and other factors.
My question: is there a method to get a "function performance index" that is independent of the host's performance (CPU, OS, etc.), or, if not "independent", at least "relative" to some fixed value, so that the index remains valid on any platform or hardware?
For example, the FPI value could be measured as:
- the number of arithmetic instructions needed to execute a single call
- a float value relative to a benchmark function; for example, function B has a rating index of 1.345 (meaning it takes 1.345 times as long as the benchmark function)
- or some other value.
Note that the FPI value doesn't need to be scientifically correct, exact, or accurate. I just need a value that gives a rough overview of the function's performance compared to other functions tested by the same method.
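The "relative to a benchmark function" idea from the list above could be sketched roughly as follows. This is only a minimal illustration in Python: `baseline`, `candidate`, `best_time`, and `relative_index` are hypothetical names made up for this example, and the resulting ratio is still measured with a timer, so it can differ from one machine to another.

```python
import timeit


def baseline(n=1000):
    # Hypothetical reference workload: a plain summation loop.
    total = 0
    for i in range(n):
        total += i
    return total


def candidate(n=1000):
    # Hypothetical function under test: a bit more arithmetic per iteration.
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


def best_time(func, repeat=5, number=200):
    # Use the minimum of several repeats to reduce noise from the OS.
    return min(timeit.repeat(func, repeat=repeat, number=number))


def relative_index(func, reference, repeat=5, number=200):
    # A ratio above 1 means func was slower than the reference on this host.
    return best_time(func, repeat, number) / best_time(reference, repeat, number)


if __name__ == "__main__":
    print(f"index: {relative_index(candidate, baseline):.3f}")
```

Running both functions on the same machine in the same process cancels out some of the host's raw speed, but not effects such as cache size or memory latency, so the index is only a rough comparison.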
Comments (5)
I think you are in search of the impossible here, because the performance of a modern computer is a complex blend of CPU, cache, memory controller, memory, etc.
So one (hypothetical) computer system might reward the use of enormous look-up tables to simplify an algorithm, so that very few CPU instructions are processed, whereas another system might have memory that is much slower relative to the CPU core, so an algorithm that does a lot of processing but touches very little memory would be favoured.
So a single "figure of merit" for these two algorithms could not even convey which one is better on all systems, let alone by how much.
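To make the look-up-table argument concrete, here is a toy cost model in Python. Every number in it is invented for illustration (these are not measurements), and the names `ALGORITHMS`, `MACHINES`, `estimated_cost`, and `winner` are hypothetical. It only shows how the ranking of two algorithms can flip when the relative cost of a memory access changes.

```python
# All numbers below are made-up illustrative costs, not measurements.
ALGORITHMS = {
    # Few CPU operations, many memory accesses (large look-up table).
    "table_lookup": {"cpu_ops": 10, "memory_accesses": 100},
    # Many CPU operations, very few memory accesses.
    "recompute": {"cpu_ops": 400, "memory_accesses": 5},
}

MACHINES = {
    # Hypothetical host where memory is cheap relative to the CPU core.
    "fast_memory": {"cpu_op_cost": 1.0, "memory_access_cost": 2.0},
    # Hypothetical host where memory is much slower than the CPU core.
    "slow_memory": {"cpu_op_cost": 1.0, "memory_access_cost": 50.0},
}


def estimated_cost(algorithm, machine):
    # Simple linear model: total = CPU work + memory work.
    return (algorithm["cpu_ops"] * machine["cpu_op_cost"]
            + algorithm["memory_accesses"] * machine["memory_access_cost"])


def winner(machine_name):
    # Name of the algorithm with the lowest estimated cost on that machine.
    machine = MACHINES[machine_name]
    return min(ALGORITHMS, key=lambda name: estimated_cost(ALGORITHMS[name], machine))


if __name__ == "__main__":
    for machine_name in MACHINES:
        print(machine_name, "->", winner(machine_name))
```

With these assumed costs, the table-based version wins on the fast-memory host (210 vs 410) and loses on the slow-memory host (5010 vs 650), so no single index could rank the two consistently.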
What you probably need is a tcov-like tool.
man tcov says:
Each basic block of code (or each line if the -a option to tcov is specified) is prefixed with the number of times it has been executed; lines that have not been executed are prefixed with "#####". A basic block is a contiguous section of code that has no branches: each statement in a basic block is executed the same number of times.
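For readers without tcov, the same idea (counting how often each line runs instead of timing it) can be approximated in Python with the standard `sys.settrace` hook. This is a rough sketch, not a replacement for tcov: `count_line_executions` and `sum_of_evens` are hypothetical names, and it counts source lines rather than true basic blocks.

```python
import sys
from collections import Counter


def count_line_executions(func, *args, **kwargs):
    # Returns (result, counts), where counts maps a line offset inside
    # func (0 = the "def" line) to the number of times that line ran.
    counts = Counter()
    target = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not target:
            return None  # do not trace frames of other functions
        if event == "line":
            counts[frame.f_lineno - target.co_firstlineno] += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the previous trace hook
    return result, counts


def sum_of_evens(values):
    total = 0
    for v in values:
        if v % 2 == 0:
            total += v
    return total


if __name__ == "__main__":
    result, counts = count_line_executions(sum_of_evens, [1, 2, 3, 4])
    print("result:", result)
    for offset in sorted(counts):
        print(f"line +{offset}: executed {counts[offset]} time(s)")
```

Counts like these do not depend on how fast the host is, but they also say nothing about how expensive each line is, so they give only a rough indication of the work done.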
No, there is no such thing. Different hardware performs differently. You can have two different pieces of code X and Y such that hardware A runs X faster than Y but hardware B runs Y faster than X. There is no absolute scale of performance; it depends entirely on the hardware (not to mention other things like the operating system and other environmental considerations).
It sounds like what you want is a program that calculates the Big-O notation of a piece of code. I don't know whether that can be done in an automated fashion (the halting problem, etc.).
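Deriving Big-O automatically is not attempted here, but an empirical stand-in can be sketched: instrument the function to count its own basic operations, run it at input sizes n and 2n, and look at how the count grows. Everything below is a hypothetical illustration in Python (`count_pairs_quadratic`, `linear_scan`, and `growth_exponent` are made-up names), and the estimate only describes the sizes actually tried, not a proven bound.

```python
import math


def count_pairs_quadratic(values):
    # Hypothetical function under test: counts equal pairs.
    # Returns (result, operation count).
    ops = 0
    pairs = 0
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            ops += 1  # one comparison
            if values[i] == values[j]:
                pairs += 1
    return pairs, ops


def linear_scan(values):
    # Hypothetical function under test: finds the maximum.
    # Returns (result, operation count).
    ops = 0
    best = None
    for v in values:
        ops += 1  # one comparison
        if best is None or v > best:
            best = v
    return best, ops


def growth_exponent(instrumented, n=200):
    # Estimate k in "ops grows like n**k" by doubling the input size.
    _, ops_n = instrumented(list(range(n)))
    _, ops_2n = instrumented(list(range(2 * n)))
    return math.log2(ops_2n / ops_n)


if __name__ == "__main__":
    print(f"linear_scan exponent: {growth_exponent(linear_scan):.2f}")
    print(f"count_pairs_quadratic exponent: {growth_exponent(count_pairs_quadratic):.2f}")
```

Because it counts operations rather than seconds, the estimated exponent is the same on any host, though it requires instrumenting the code by hand and ignores constant factors.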
As others have mentioned, this is not a trivial task, and it may be impossible to get any sort of accurate results from it. Consider a few methods: