如何在没有竞争条件的情况下将两个 32 位计数器读取为 64 位整数
内存0x100和0x104是两个32位计数器。它们代表一个 64 位定时器并且不断递增。
如何正确读取两个内存地址并将时间存储为 64 位整数?
一种不正确的解决方案:(
x = High
y = Low
result = x << 32 + y
该程序可以被换出,同时低溢出...)
附加要求:
仅使用 C,无汇编
总线是 32 位的,因此无法在一条指令中读取它们。
您的程序可能随时切换上下文。
没有可用的互斥体或锁。
一些高级的解释是可以的。代码不需要。谢谢!
At memory 0x100 and 0x104 are two 32-bit counters. They represent a 64-bit timer and are constantly incrementing.
How do I correctly read from two memory addresses and store the time as a 64-bit integer?
One incorrect solution:
x = High
y = Low
result = x << 32 + y
(The program could be swapped out and in the meantime Low overflows...)
Additional requirements:
Use C only, no assembly
The bus is 32-bit, so no way to read them in one instruction.
Your program may get context switched at any time.
No mutex or locks available.
Some high-level explanation is okay. Code not necessary. Thanks!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
我从 David L. Mills 那里学到了这一点,他将其归功于 Leslie Lamport:
假设计时器本身自动更新,那么这保证可以工作 - 如果 L 在步骤 1 和步骤 1 之间的某个地方溢出2,那么 H 将在步骤 1 和 3 之间递增,并且步骤 4 中的测试将失败。
I learned this from David L. Mills, who attributes it to Leslie Lamport:
Assuming that the timer itself updates atomically then this is guaranteed to work -- if L overflowed somewhere between steps 1 and 2, then H will have incremented between steps 1 and 3, and the test in step 4 will fail.
除了已经说过的内容之外,您不会获得比中断/上下文切换抖动允许的更准确的时序读取。如果您担心计时器轮询期间发生中断/上下文切换,解决方案不是采用一些奇怪的读取-读取-读取-比较算法,也不是使用内存屏障或信号量。
解决方案是对定时器使用硬件中断,并带有执行时不能被中断的中断服务程序。如果您确实需要的话,这将提供尽可能高的准确性。
In addition to what has already been said, you won't get more accurate timing reads than your interrupt / context switch jitter allows. If you fear an interrupt / context switch in the middle of a timer polling, the solution is not to adapt some strange read-read-read-compare algorithm, nor is it to use memory barriers or semaphores.
The solution is to use a hardware interrupt for the timer, with an interrupt service routine that cannot be interrupted when executed. This will give the highest possible accuracy, if you actually have need of such.
如果可以保证上下文切换的最大时间明显小于低字翻转周期的一半,则可以使用该事实来确定低值是在翻转之前还是之后读取,并相应地选择正确的高字。
这避免了其他解决方案的“重复”阶段。
If you can guarantee that the maximum time of context switch is significantly less than half the low word rollover period, you can use that fact to decide whether the Low value was read before or after its rollover, and choose the correct high word accordingly.
This avoids the 'repeat' phase of other solutions.
Hobbs 和 jkerian 已经给出了显而易见且可能有意为之的答案:
在某些多 CPU/核心硬件上,这不会实际上并不能正常工作。除非您有内存屏障来确保您不会从自己的核心缓存中读取高值和低值,然后从另一个核心进行更新 - 即使是 64 位原子并刷新到某些共享内存 - 不能保证在您的核心中及时可见。虽然
High
和Low
必须是易失性
限定的,但这还不够。更新频率越高,由此问题导致的错误的可能性就越大,也就越严重。
如果没有一些用于操作系统/CPU 特定内存屏障、互斥体、原子操作等的 C 包装器,就没有可移植的方法来做到这一点。
下面布鲁克斯的评论提到这确实适用于某些 CPU,例如现代 AMD。< /强>
The obvious and presumably intended answer is already given by Hobbs and jkerian:
On some multi-CPU/core hardware, this doesn't actually work properly. Unless you have a memory barrier to ensure that you're not reading High and Low from your own core's cache, then updates from another core - even if 64-bit atomic and flushed to some shared memory - aren't guaranteed to be visible in your core a timely fashion. While
High
andLow
must bevolatile
-qualified, this is not sufficient.The higher the frequency of updates, the more probable and significant the errors due to this issue.
There is no portable way to do this without some C wrappers for OS/CPU-specific memory barriers, mutexes, atomic operations etc..
Brooks' comment below mentions that this does work for certain CPUs, such as modern AMDs.
鉴于内存(计时器)的性质,您应该能够读取 A、读取 B、读取 A' 并将 A 与 A' 进行比较,如果它们匹配,您就会得到答案。否则重复。
这在某种程度上取决于该内存还有哪些其他限制。如果它类似于系统时钟,上面的代码将处理
0x0000FFFF
转到0x00010000
的情况,并且根据您读取它的顺序,否则您会错误地最终为0x00000000
或0x0001FFFF
。Given the nature of the memory (a timer), you should be able to read A, read B, read A' and compare A to A', if they match you have your answer. Otherwise repeat.
It sortof depends on what other constraints there are on this memory. If it's something like a system-clock, the above will handle the situation where
0x0000FFFF
goes to0x00010000
, and, depending on the order you read it in, you would otherwise erroneously end up with0x00000000
or0x0001FFFF
.问题陈述不包括计数器是否可以在读取之间多次滚动所有 64 位。因此,我可能会尝试交替读取这两个 32 位单词几千次,如果需要的话,可以读取更多次,将它们存储在 2 个向量数组中,对两个向量运行模 2^32 的线性回归拟合,并将该比率的斜率匹配约束应用于可能的结果,然后使用估计的回归拟合将计数值预测回所需的参考时间。
The problem statement didn't include whether the counters could roll over all 64-bits several times between reads. So I might try alternating reading both 32-bit words a few thousand times, more if needed, store them in 2 vector arrays, run a linear regression fit modulo 2^32 against both vectors, and apply slope matching contraints of that ratio to the possible results, then use the estimated regression fit to predict the count value back to the desired reference time.