Benchmarking a piece of code independently of CPU performance?

Posted on 2024-10-02 07:44:10

My objective: I want to test the performance of a piece of code (or a function) the same way I test its correctness in a unit test. Let's say the output of this benchmarking process is a "function performance index" (FPI) that is "portable".

My problem: we usually benchmark code by using a timer to measure the time elapsed while it executes, and that method depends on the hardware, the OS, and other things.

My question: is there a method to get a "function performance index" that is independent of the host's performance (CPU, OS, etc.), or, if not independent, at least relative to some fixed value, so that the index remains valid on any platform or hardware?

For example, the FPI value could be measured as:

  • the number of arithmetic instructions needed to execute a single call
  • a float value relative to a benchmark function; for example, function B has a rating index of 1.345 (meaning it is 1.345 times slower than the benchmark function)
  • or some other value.

Note that the FPI value doesn't need to be scientifically correct, exact, or accurate. I just need a value that gives a rough overview of a function's performance compared to other functions tested by the same method.

Comments (5)

倾城°AllureLove 2024-10-09 07:44:10

I think you are in search of the impossible here, because the performance of a modern computer is a complex blend of CPU, cache, memory controller, memory, etc.

So one (hypothetical) computer system might reward the use of enormous look-up tables to simplify an algorithm, so that very few CPU instructions are processed. Another system might have memory that is much slower relative to the CPU core, so an algorithm that does a lot of processing but touches very little memory would be favoured.

So a single 'figure of merit' for these two algorithms could not even convey which was the better one on all systems, let alone by how much it was better.
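To make the look-up table versus computation trade-off concrete, here is a minimal sketch in Python. The two bit-counting functions and the `time_ratio` helper are hypothetical illustrations, not something from the question. Both functions return the same answers, but one leans on memory and the other on arithmetic, so the ratio printed at the end is expected to differ from machine to machine.

```python
import timeit

# Hypothetical example: two ways to count the set bits of a 32-bit integer.
TABLE = [bin(i).count("1") for i in range(1 << 16)]  # 65536-entry look-up table


def popcount_table(x):
    """Memory-heavy: two table look-ups, almost no arithmetic."""
    return TABLE[x & 0xFFFF] + TABLE[(x >> 16) & 0xFFFF]


def popcount_loop(x):
    """Compute-heavy: clear the lowest set bit until none remain."""
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count


def time_ratio(values, number=10, repeat=5):
    """Return time(table) / time(loop) measured on the current machine."""
    t_table = min(timeit.repeat(
        lambda: [popcount_table(v) for v in values],
        number=number, repeat=repeat))
    t_loop = min(timeit.repeat(
        lambda: [popcount_loop(v) for v in values],
        number=number, repeat=repeat))
    return t_table / t_loop


if __name__ == "__main__":
    values = [(i * 2654435761) & 0xFFFFFFFF for i in range(10_000)]
    print(f"table/loop time ratio on this machine: {time_ratio(values):.3f}")
```

Running this on two different machines is a quick way to see that the ratio is a property of the code plus the hardware, not of the code alone.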

2024-10-09 07:44:10

Probably what you really need is a tcov-like tool.

man tcov says:

Each basic block of code (or each line if the -a option to tcov is specified) is prefixed with the number of times it has been executed; lines that have not been executed are prefixed with "#####". A basic block is a contiguous section of code that has no branches: each statement in a basic block is executed the same number of times.
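The same idea, execution counts instead of elapsed time, can be sketched without tcov. The following is a rough Python illustration using the standard `sys.settrace` hook; `count_line_executions` and `sum_of_evens` are hypothetical names made up for this example. It counts per line rather than per basic block, and the counts depend only on the input, not on how fast the machine is.

```python
import sys
from collections import Counter


def count_line_executions(func, *args, **kwargs):
    """Run func and return (result, counts).

    counts maps a line offset (relative to the `def` line of func)
    to the number of times that line was executed.
    """
    counts = Counter()
    code = func.__code__

    def tracer(frame, event, arg):
        # Only trace frames that run func's own code object.
        if frame.f_code is not code:
            return None
        if event == "line":
            counts[frame.f_lineno - code.co_firstlineno] += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the previous tracer
    return result, counts


def sum_of_evens(n):
    total = 0
    for i in range(n):
        if i % 2 == 0:
            total += i
    return total


if __name__ == "__main__":
    result, counts = count_line_executions(sum_of_evens, 10)
    for offset in sorted(counts):
        print(f"line +{offset}: executed {counts[offset]} times")
```

Such counts are repeatable across hosts, but they say nothing about how expensive each line is on a given machine.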

阳光下的泡沫是彩色的 2024-10-09 07:44:10

No, there is no such thing. Different hardware performs differently. You can have two different pieces of code X and Y such that hardware A runs X faster than Y but hardware B runs Y faster than X. There is no absolute scale of performance; it depends entirely on the hardware (not to mention other things like the operating system and other environmental considerations).

缘字诀 2024-10-09 07:44:10

It sounds like what you want is a program that calculates the Big-O notation of a piece of code. I don't know if it's possible to do that in an automated fashion (the halting problem, etc.).
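Deriving Big-O automatically for arbitrary code is a hard problem, but a growth rate can at least be estimated empirically. The sketch below is an assumption-laden illustration rather than a general tool: it instruments a bubble sort (chosen only as an example) to count comparisons, then fits the exponent k in ops ≈ n^k from the counts at two input sizes. The function names are hypothetical.

```python
import math


def bubble_sort_comparisons(data):
    """Bubble-sort a copy of data and return the number of comparisons made."""
    items = list(data)
    comparisons = 0
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return comparisons


def estimate_exponent(count_ops, n_small, n_large):
    """Estimate k in ops ~ n**k from operation counts at two input sizes.

    Inputs are reverse-sorted lists, an arbitrary choice for this sketch.
    """
    ops_small = count_ops(list(range(n_small, 0, -1)))
    ops_large = count_ops(list(range(n_large, 0, -1)))
    return math.log(ops_large / ops_small) / math.log(n_large / n_small)


if __name__ == "__main__":
    k = estimate_exponent(bubble_sort_comparisons, 100, 200)
    print(f"estimated growth exponent: {k:.2f}")
```

Because it counts operations instead of seconds, the estimate is the same on every machine, though it only reflects the inputs that were tried.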

素手挽清风 2024-10-09 07:44:10

As others have mentioned, this is not a trivial task, and it may be impossible to get any sort of accurate result from it. Consider a few methods:

  1. Benchmark Functions -- While this seems promising, I think you'll find that it won't work well when you try to compare different types of functions. For example, if your benchmark function is 100% CPU bound (as in some complex math computation), then it will compare/scale well with other CPU-bound functions but fail when compared with, say, I/O-bound or memory-bound functions. Carefully matching a benchmark function to a small set of similar functions may work but is tedious and time-consuming.
  2. Number of Instructions -- For a very simple processor it may be possible to count the cycles of each instruction and get a reasonable value for the total number of cycles a block of code will take, but today's modern processors are anything but "simple". With branch prediction and parallel pipelines, you can't just add up instruction cycles and expect to get an accurate result.
  3. Manual Counting -- This might be your best bet. While it is not automatic, it may give better results faster than the other methods. Just look at things like the O() order of the code, how much memory the function reads/writes, how many file bytes are input/output, etc. By having a few stats like this for each function/module, you should be able to get a rough comparison of their complexity.
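For reference, the benchmark-function idea from item 1 can be sketched in a few lines of Python with the standard `timeit` module. Everything here is hypothetical: `baseline` stands in for whatever fixed reference workload is chosen, `candidate` for the function under test, and `performance_index` for the ratio the question calls an FPI. The caveat from item 1 still applies: the ratio is only meaningful when both functions stress the machine in a similar way.

```python
import timeit


def baseline():
    """Hypothetical reference workload; any fixed function could play this role."""
    return sum(i * i for i in range(1000))


def candidate():
    """Hypothetical function under test."""
    return sorted(range(1000), key=lambda i: -i)


def best_time(func, number=200, repeat=5):
    """Best total time (seconds) for `number` calls, over `repeat` trials."""
    return min(timeit.repeat(func, number=number, repeat=repeat))


def performance_index(func, reference=baseline, number=200, repeat=5):
    """Return time(func) / time(reference), both measured on this machine."""
    return best_time(func, number, repeat) / best_time(reference, number, repeat)


if __name__ == "__main__":
    index = performance_index(candidate)
    print(f"candidate runs at {index:.3f}x the baseline's time on this machine")
```

Taking the minimum over several trials reduces noise from other processes, but it does not remove the dependence on the host's CPU, cache, and memory balance.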