BitOperations.IsPow2 benchmark gives unbalanced results
I am benchmarking the method BitOperations.IsPow2 with BenchmarkDotNet, but the results I got were far from my expectations. Here is the minimal code you need to reproduce the benchmark:
PowerOf2Benchmark.cs
using System.Numerics;
using BenchmarkDotNet.Attributes;

public class PowerOf2Benchmark
{
    [Params(2048, 10003457, 20000123, 16777216)]
    public int n;

    [Benchmark]
    public bool CheckWithBitOperationsBuiltIn()
    {
        return BitOperations.IsPow2(n);
    }
}
Program.cs
using BenchmarkDotNet.Running;
BenchmarkRunner.Run<PowerOf2Benchmark>();
And here is the summary of the benchmark:
BenchmarkDotNet=v0.13.1, OS=ubuntu 20.04
Intel Core i5-6200U CPU 2.30GHz (Skylake), 1 CPU, 4 logical and 2 physical cores
.NET SDK=6.0.202
[Host] : .NET 6.0.4 (6.0.422.16404), X64 RyuJIT
DefaultJob : .NET 6.0.4 (6.0.422.16404), X64 RyuJIT
| Method | n | Mean | Error | StdDev | Code Size |
|------------------------------ |--------- |----------:|----------:|----------:|----------:|
| CheckWithBitOperationsBuiltIn | 2048 | 0.0955 ns | 0.0092 ns | 0.0081 ns | 28 B |
| CheckWithBitOperationsBuiltIn | 10003457 | 1.1815 ns | 0.0046 ns | 0.0040 ns | 28 B |
| CheckWithBitOperationsBuiltIn | 16777216 | 0.1000 ns | 0.0054 ns | 0.0051 ns | 28 B |
| CheckWithBitOperationsBuiltIn | 20000123 | 1.1750 ns | 0.0126 ns | 0.0112 ns | 28 B |
// * Hints *
Outliers
PowerOf2Benchmark.CheckWithBitOperationsBuiltIn: Default -> 1 outlier was removed (2.33 ns)
PowerOf2Benchmark.CheckWithBitOperationsBuiltIn: Default -> 1 outlier was removed (3.38 ns)
PowerOf2Benchmark.CheckWithBitOperationsBuiltIn: Default -> 1 outlier was removed (3.42 ns)
I hope I interpreted the results correctly, but it seems that BitOperations.IsPow2 is more than 10x faster when n is a power of 2 (2048, 16777216) than when it is not (10003457, 20000123). Why is that?
The source code of BitOperations.IsPow2 should be this:
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool IsPow2(int value) => (value & (value - 1)) == 0 && value > 0;
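To make the expression above concrete, here is a minimal sketch that applies it to the four benchmark inputs plus two edge cases. IsPow2Manual is a hypothetical local helper that copies the quoted expression; it is not the library method itself. The edge cases 0 and int.MinValue are my own additions, chosen because they pass the first clause and are only rejected by value > 0.

```csharp
using System;

// Hypothetical helper: same expression as the IsPow2 source quoted above.
static bool IsPow2Manual(int value) => (value & (value - 1)) == 0 && value > 0;

// The four benchmark inputs, plus 0 and int.MinValue (assumed edge cases).
foreach (int n in new[] { 2048, 10003457, 20000123, 16777216, 0, int.MinValue })
{
    // n & (n - 1) clears the lowest set bit of n, so it is 0 only when
    // n has at most one bit set (arithmetic is unchecked by default).
    int masked = n & (n - 1);
    string bits = Convert.ToString(masked, 2);
    Console.WriteLine($"n = {n}, n & (n - 1) = 0b{bits}, IsPow2 = {IsPow2Manual(n)}");
}
```

For 2048 and 16777216 the masked value is 0 and the second clause decides the result. For 10003457 and 20000123 the masked value is nonzero, so the && short-circuits and value > 0 is never evaluated. This corresponds to the two paths after the jne in the disassembly.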
I am also assuming that the AggressiveInlining optimization is not the reason for the unbalanced results. I am not an expert on hardware optimizations either, but the generated assembly is quite simple (the call is inlined because of MethodImplOptions.AggressiveInlining):
; CheckIfNumberIsPowerOf2.PowerOf2Benchmark.CheckWithBitOperationsBuiltIn()
push rbp
mov rbp,rsp
mov eax,[rdi+8]
lea edi,[rax-1]
test edi,eax
jne short M00_L01
test eax,eax
setg al
movzx eax,al
M00_L00:
pop rbp
ret
M00_L01:
xor eax,eax
jmp short M00_L00
; Total bytes of code 28
EDIT
Out of curiosity, I created a class containing a method with the same implementation as BitOperations.IsPow2:
using System.Runtime.CompilerServices; // needed for MethodImpl

public static class PowerOf2Verifier
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool CheckWithBitMaskV3(int n) => (n & (n - 1)) == 0 && n > 0;
}
Then I added this method to the PowerOf2Benchmark class:
    [Benchmark]
    public bool CheckWithBitMaskV3()
    {
        return PowerOf2Verifier.CheckWithBitMaskV3(n);
    }
This is the updated summary:
| Method | n | Mean | Error | StdDev | Code Size |
|------------------------------ |--------- |----------:|----------:|----------:|----------:|
| CheckWithBitMaskV3 | 2048 | 0.5141 ns | 0.0098 ns | 0.0087 ns | 28 B |
| CheckWithBitOperationsBuiltIn | 2048 | 0.1040 ns | 0.0085 ns | 0.0079 ns | 28 B |
| CheckWithBitMaskV3 | 10003457 | 0.3589 ns | 0.0091 ns | 0.0081 ns | 28 B |
| CheckWithBitOperationsBuiltIn | 10003457 | 1.1824 ns | 0.0091 ns | 0.0081 ns | 28 B |
| CheckWithBitMaskV3 | 16777216 | 0.5143 ns | 0.0063 ns | 0.0059 ns | 28 B |
| CheckWithBitOperationsBuiltIn | 16777216 | 0.0991 ns | 0.0076 ns | 0.0071 ns | 28 B |
| CheckWithBitMaskV3 | 20000123 | 0.4513 ns | 0.0190 ns | 0.0177 ns | 28 B |
| CheckWithBitOperationsBuiltIn | 20000123 | 1.1257 ns | 0.0108 ns | 0.0090 ns | 28 B |
In this case, the CheckWithBitMaskV3 results are consistent, which surprises me even more because the two methods I am benchmarking now have the same implementation. What could be the explanation?
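As a sanity check that the timing gap is not caused by a behavioural difference, the two methods can be compared on a handful of inputs. This is only a sketch: the local CheckWithBitMaskV3 below copies the expression from PowerOf2Verifier, and the sample values beyond the four benchmark parameters are my own assumption.

```csharp
using System;
using System.Numerics;

// Same expression as PowerOf2Verifier.CheckWithBitMaskV3 in the question.
static bool CheckWithBitMaskV3(int n) => (n & (n - 1)) == 0 && n > 0;

int mismatches = 0;

// Benchmark parameters plus a few boundary values (assumed for illustration).
foreach (int n in new[] { int.MinValue, -1, 0, 1, 2048, 10003457, 16777216, 20000123, int.MaxValue })
{
    if (CheckWithBitMaskV3(n) != BitOperations.IsPow2(n))
    {
        mismatches++;
        Console.WriteLine($"Mismatch for n = {n}");
    }
}

Console.WriteLine($"Mismatches: {mismatches}");
```

If this reports no mismatches, the two benchmarks are doing the same logical work, and any difference has to come from the generated code or from the measurement itself.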
EDIT 2
For some reason, the assembly of CheckWithBitMaskV3 is slightly different from that of IsPow2:
; CheckIfNumberIsPowerOf2.PowerOf2Benchmark.CheckWithBitMaskV3()
push rbp
mov rbp,rsp
mov eax,[rdi+8]
lea edi,[rax-1]
test edi,eax
jne short M00_L00
test eax,eax
setg al
movzx eax,al
jmp short M00_L01
M00_L00:
xor eax,eax
M00_L01:
pop rbp
ret
; Total bytes of code 28