Interlocked.Increment performance
Is Interlocked.Increment(ref x) faster or slower than x++ for ints and longs on various platforms?
Comments (6)
It is slower, since it forces the action to occur atomically and it acts as a memory barrier, eliminating the processor's ability to reorder memory accesses around the instruction.
You should be using Interlocked.Increment when you want the action to be atomic on state that can be shared between threads. It's not intended to be a full replacement for x++.
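
To illustrate the point about shared state, here is a minimal sketch (not from the original answer) that increments two counters from several threads. The names CounterDemo and Run, the worker count, and the iteration count are all made up for the illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CounterDemo
{
    // Hypothetical helper: runs `workers` parallel loops that each bump
    // two shared counters `perWorker` times.
    public static (int plain, int interlocked) Run(int workers, int perWorker)
    {
        int plain = 0;        // incremented with ++ (a non-atomic read-modify-write)
        int interlocked = 0;  // incremented with Interlocked.Increment (atomic)

        Parallel.For(0, workers, _ =>
        {
            for (int i = 0; i < perWorker; i++)
            {
                plain++;                                // concurrent updates can be lost
                Interlocked.Increment(ref interlocked); // no update is ever lost
            }
        });

        return (plain, interlocked);
    }

    public static void Main()
    {
        var (plain, interlocked) = Run(workers: 4, perWorker: 1_000_000);
        Console.WriteLine($"expected:    {4 * 1_000_000}");
        Console.WriteLine($"plain ++:    {plain}");       // may come out lower than expected
        Console.WriteLine($"interlocked: {interlocked}"); // matches expected
    }
}
```

On a multi-core machine the plain counter will usually fall short of the expected total, while the interlocked counter should always match it. How far the plain counter falls short varies from run to run.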
In our experience, InterlockedIncrement() and its relatives on Windows have a significant performance impact. In one sample case we were able to eliminate the interlock and use ++/-- instead. This alone reduced run time from 140 seconds to 110 seconds. My analysis is that the interlock forces a memory round trip (otherwise how could other cores see it?). An L1 cache read/write is around 10 clock cycles, but a memory read/write is more like 100.
In this sample case, I estimated the number of increment/decrement operations at about 1 billion. So on a 2 GHz CPU this is something like 5 seconds for the ++/--, and 50 seconds for the interlock. Spread the difference across several threads, and it's close to 30 seconds.
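
The 5-second and 50-second figures follow from the answer's own assumed inputs (about 10^9 operations, roughly 10 cycles per L1 access, roughly 100 cycles per memory round trip, a 2 GHz clock). These are the poster's estimates, not measurements. Written out:

```latex
t_{\text{++}} \approx \frac{10^{9}\ \text{ops} \times 10\ \text{cycles/op}}{2 \times 10^{9}\ \text{cycles/s}} = 5\ \text{s},
\qquad
t_{\text{interlocked}} \approx \frac{10^{9}\ \text{ops} \times 100\ \text{cycles/op}}{2 \times 10^{9}\ \text{cycles/s}} = 50\ \text{s}
```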
Think about it for a moment, and you'll realize an Increment call cannot be any faster than a simple application of the increment operator. If it were, then the compiler's implementation of the increment operator would call Increment internally, and they'd perform the same. But, as you can see by testing it for yourself, they don't perform the same.
The two options have different purposes. Use the increment operator generally. Use Increment when you need the operation to be atomic and you're sure all other users of that variable are also using interlocked operations. (If they're not all cooperating, then it doesn't really help.)
It's slower. However, it's the most performant general way I know of for achieving thread safety on scalar variables.
It will always be slower because it has to perform a CPU bus lock rather than just updating a register. However, modern CPUs achieve near-register performance, so it's negligible even in real-time processing.
My performance test:
volatile: 65,174,400
lock: 62,428,600
interlocked: 113,248,900
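
The answer above does not say how the test was run or what units the numbers are in, so the following is not a reconstruction of it. It is only a rough sketch of how one might time the variants on a single thread with Stopwatch. The names IncrementBenchmark, Time, and Gate, and the iteration count, are assumptions made for the illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class IncrementBenchmark
{
    private static int _plain;
    private static volatile int _volatile;
    private static int _locked;
    private static int _interlocked;
    private static readonly object Gate = new object();

    // Hypothetical helper: calls `action` once as a warm-up, then times
    // `iterations` further calls and returns the elapsed Stopwatch ticks.
    public static long Time(int iterations, Action action)
    {
        action(); // warm-up so JIT compilation is not part of the measurement
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static void Main()
    {
        const int N = 10_000_000;

        long plain = Time(N, () => _plain++);
        long vol = Time(N, () => _volatile++);
        long locked = Time(N, () => { lock (Gate) { _locked++; } });
        long inter = Time(N, () => Interlocked.Increment(ref _interlocked));

        Console.WriteLine($"plain ++:    {plain} ticks");
        Console.WriteLine($"volatile:    {vol} ticks");
        Console.WriteLine($"lock:        {locked} ticks");
        Console.WriteLine($"interlocked: {inter} ticks");
    }
}
```

Each measurement includes delegate-call overhead, and everything runs uncontended on one thread, so the results only give a rough relative ordering. Under contention from several threads the gap can look quite different, and a dedicated benchmarking harness would give more trustworthy numbers.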