Create quick/reliable benchmark with Java?

Posted 2024-11-16 03:51:19

I'm trying to create a benchmark test with java. Currently I have the following simple method:

public static long runTest(int times){
    long start = System.nanoTime();     
    String str = "str";
    for(int i=0; i<times; i++){
        str = "str"+i;
    }       
    return System.nanoTime()-start;     
}

I'm currently calling this method several times from an outer loop and recording the min/max/avg time it takes to run. Then I start some activity on another thread and test again. Basically I just want to get consistent results... It seems pretty consistent if I have the runTest loop run 10 million times:

Number of times ran: 5
The max time was: 1231419504 (102.85% of the average)
The min time was: 1177508466 (98.35% of the average)
The average time was: 1197291937
The difference between the max and min is: 4.58%

Activated thread activity.

Number of times ran: 5
The max time was: 3872724739 (100.82% of the average)
The min time was: 3804827995 (99.05% of the average)
The average time was: 3841216849
The difference between the max and min is: 1.78%

Running with thread activity took 320.83% as much time as running without.

But this seems a bit much, and takes some time... if I try a lower number (100000) in the runTest loop... it starts to become very inconsistent:

    Number of times ran: 5
    The max time was: 34726168 (143.01% of the average)
    The min time was: 20889055 (86.02% of the average)
    The average time was: 24283026
    The difference between the max and min is: 66.24%

    Activated thread activity.

    Number of times ran: 5
    The max time was: 143950627 (148.83% of the average)
    The min time was: 64780554 (66.98% of the average)
    The average time was: 96719589
    The difference between the max and min is: 122.21%

    Running with thread activity took 398.3% as much time as running without.

Is there a way that I can do a benchmark like this that is both consistent and efficient/fast?

By the way, I'm not testing the code that is between the start and end times. I'm testing the CPU load in a way (see how I'm starting some thread activity and retesting). So I think that what I'm looking for is something to substitute for the code I have in "runTest" that will yield quicker and more consistent results.

Thanks

↘人皮目录ツ 2024-11-23 03:51:19

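In short:

(Micro-)benchmarking is very complex, so use a tool like the Benchmarking framework at http://www.ellipticgroup.com/misc/projectLibrary.zip, and still be skeptical about the results ("Put micro-trust in a micro-benchmark", Dr. Cliff Click).

In detail:

There are a lot of factors that can strongly influence the results: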

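The accuracy and precision of System.nanoTime: in the worst case it is as bad as System.currentTimeMillis.
Code warm-up and class loading
Mixed mode: JVMs JIT-compile a code block only after it has been called sufficiently often (1500 or 1000 times) (see Edwin Buck's answer)
Dynamic optimizations: deoptimization, on-stack replacement, dead code elimination (you should use the result you computed in your loop, e.g. print it)
Resource reclamation: garbage collection (see Michael Borgwardt's answer) and object finalization
Caching: I/O and CPU
Your operating system as a whole: screen saver, power management, other processes (indexer, virus scan, etc.)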
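
Two of the factors above, warm-up and dead code elimination, can be addressed directly in a hand-written harness. The following is only a minimal sketch, not the framework mentioned earlier: the class name WarmedUpBenchmark, the methods workload and measure, and the sink field are hypothetical names chosen for illustration. It assumes the question's loop is changed to return a value so the result stays in use.

```java
import java.util.Arrays;

// Hypothetical harness; all names here are illustrative, not from the original post.
final class WarmedUpBenchmark {

    // The workload result is stored here so the JIT cannot treat the loop as dead code.
    static volatile int sink;

    // The loop from the question, changed to return a value derived from every iteration.
    static int workload(int times) {
        int total = 0;
        for (int i = 0; i < times; i++) {
            String str = "str" + i;
            total += str.length();
        }
        return total;
    }

    // Runs the workload warmupRuns times untimed, then returns measuredRuns timings in nanoseconds.
    static long[] measure(int times, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            sink = workload(times); // gives class loading and JIT compilation a chance to happen first
        }
        long[] nanos = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            sink = workload(times);
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = measure(100_000, 20, 5);
        long min = Arrays.stream(nanos).min().getAsLong();
        long max = Arrays.stream(nanos).max().getAsLong();
        double avg = Arrays.stream(nanos).average().getAsDouble();
        System.out.printf("min=%d ns, max=%d ns, avg=%.0f ns%n", min, max, avg);
    }
}
```

The warm-up count of 20 is an arbitrary choice for this sketch. Whether it is enough depends on the JVM and its settings, and it does nothing about garbage collection or other processes on the machine.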

Brent Boyer's article "Robust Java benchmarking, Part 1: Issues" (http://www.ibm.com/developerworks/java/library/j-benchmark1/index.html) is a good description of all those issues and of whether and how you can counter them (e.g. use JVM options or call ProcessIdleTask beforehand).
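
One way to check whether garbage collection or JIT compilation happened during a measured run is to read the JVM's own counters before and after it. This is a sketch using the standard java.lang.management API, not something taken from the article. The class name InterferenceCheck and its methods are hypothetical, and some JVMs may not report these values at all.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; the names are illustrative only.
final class InterferenceCheck {

    // Sum of collection counts over all collectors; collectors reporting -1 (undefined) are skipped.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Accumulated JIT compilation time in milliseconds, or -1 if the JVM does not report it.
    static long totalJitMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    // Returns { collections during task, JIT milliseconds during task }.
    static long[] deltasAround(Runnable task) {
        long gcBefore = totalGcCount();
        long jitBefore = totalJitMillis();
        task.run();
        return new long[] { totalGcCount() - gcBefore, totalJitMillis() - jitBefore };
    }

    public static void main(String[] args) {
        long[] deltas = deltasAround(() -> {
            String str = "str";
            for (int i = 0; i < 100_000; i++) {
                str = "str" + i;
            }
            // Use the result so the loop is not trivially dead.
            if (str.isEmpty()) {
                throw new IllegalStateException();
            }
        });
        System.out.println("GC runs during task: " + deltas[0]);
        System.out.println("JIT ms during task: " + deltas[1]);
    }
}
```

A run with non-zero deltas is a candidate for being discarded or repeated, which is cheaper than always raising the iteration count.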

You won't be able to eliminate all these factors, so doing statistics is a good idea. But: