在C++中使用乐观的锁和内存顺序
我想知道如何“乐观的锁”(如下所述:)可以用于非原子数据以及应使用哪些内存顺序的C ++。
我从上面的论文中理解的是,以下代码将同步 t1
和 t2
和 t1
将打印两个变量更新或无。
static constexpr uint64_t locked_bit = (1ULL << 63);
std::atomic<int> data1 = 0;
std::atomic<int> data2 = 0;
std::atomic<uint64_t> versioned_lock = 0;
int main() {
std::thread t1([&]{
restart:
auto s = versioned_lock.load();
if (s & locked_bit)
goto restart;
auto local_data1 = data1.load();
auto local_data2 = data2.load();
if (s == versioned_lock.load()) {
// data has not been overwritten yet, can safely use local_data
std::cout << local_data1 << " " << local_data2 << std::endl;
} else {
// possible race, local_data might be garbage?
}
});
std::thread t2([&]{
auto current_lock_version = versioned_lock.load();
current_lock_version++;
versioned_lock.store(current_lock_version | locked_bit);
data1.store(1234);
data2.store(4321);
versioned_lock.store(current_lock_version);
});
t1.join();
t2.join();
}
我的问题是:
- 我这是没有UB的有效C ++代码?
- 我对同步的假设是否正确?
T1
实际上会打印“ 0 0”或“ 1234 4321”? - 应该将哪些内存订单用于阅读/写作
versioned_lock
?它应该是memory_order_seq_cst
还是不那么限制的东西?如果data1
和data2
是非原子的(只有int
,甚至一些更复杂的数据类型),它是否会影响所需的必需品,它也可以正常工作。内存顺序(或者可能需要Atomic_thread_fence)?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我认为是这样。在这样的程序中获得UB的通常方法是数据竞赛,
但这只有在编写非原子变量时才会发生
同时,此程序中的每个共享变量都是原子。
是的。此版本中的所有负载和存储都是
seq_cst
,因此以某种总顺序发生,我将使用&lt;表示。令L1,L2表示两个负载
versioned_lock
在t1中,s1,s2 t2中的两个商店。如果S1,S2均出现在L1之前,则将看到两个新值
我们打印
1234 4321
,这很好。如果S1,S2都在L2之后发生,那么我们打印
0 0
,也很好。如果S1或S2发生在之间L1和L2,然后L1和L2返回不同的值,我们不打印
任何东西,这再次很好。
剩下的唯一订购是S1&LT; L1&lt; L2&lt; S2。在这种情况下L1
用锁定位集返回值,因此我们重新启动循环
而不是打印。
i think 以下是足够的:
也就是说,访问
versioned_lock
清除并测试锁定的位必须释放和获取,数据的存储必须是
在释放围栏之前,数据的负载随后是
获取栅栏。直观地,这应该防止数据访问从保护它们的锁访问之间泄漏。 (您也可以制作数据的所有存储
释放然后放下释放栅栏,但这是不必要的
强:我们不在乎两个数据存储是否会重新排序
其他。如果我们有更多的数据元素,那将会更大。 。
为了证明这有效,请注意,我们必须避免的两个结果是
1234 0
和0 4321
因此,假设data1
的负载返回1234
(data2
返回4321
的情况是等效的)。由于这是由T2存储的值,因此释放围栏(RF)与获取围栏(AF)同步。现在,在RF之前对S1进行了测序,并且在L2之前对AF进行了测序,因此我们得出结论,S1发生在L2之前。因此L2无法返回0;它要么返回locked_bit
或1
。如果L1返回与L2不同的值,那么我们将不打印。如果L1返回
locked_bit
我们也不会打印。其余的情况是L1和L2都返回1。在这种情况下,L1读取了S2编写的值,L1/S2被获取/释放,因此S2与L1同步。数据存储在S2之前进行了测序,L1在数据负载之前进行了测序,因此两个数据存储都发生在两个数据负载之前。因此,我们看到了新值,并打印1234 4321
。不,那根本行不通。该算法中没有任何内容阻止
data1,data2
的存储和负载来自冲突;就是这样当他们发生冲突时,我们对版本的锁的测试告诉我们不要
使用数据。但是,如果它们不是原子,那么矛盾的商店和
负载将是一场数据竞赛,会导致不确定的行为
是否数据。
I think so. The usual way to get UB in such a program is a data race,
but that can only happen when a non-atomic variable is written
concurrently, and every shared variable in this program is atomic.
Yes. All the loads and stores in this version are
seq_cst
, so theyoccur in some total order, which I will use < to denote. Let L1, L2 denote the two loads of
versioned_lock
in t1, and S1, S2 the two stores in t2.If S1, S2 both occur before L1, then both the new values will be seen
and we print
1234 4321
, which is fine. If S1, S2 both occur after L2 then weprint
0 0
, which is also fine. If either S1 or S2 occurs in betweenL1 and L2, then L1 and L2 return different values and we do not print
anything, which again is fine.
The only ordering that remains is S1 < L1 < L2 < S2. In this case L1
returns the value with the locked bit set, and so we restart the loop
instead of printing.
I think that the following suffices:
That is, the accesses to
versioned_lock
that clear and test the lockedbit must be release and acquire, the stores of the data must be
preceded by a release fence, and the loads of the data followed by an
acquire fence. Intuitively, this should keep the data accesses from leaking out from between the lock accesses that protect them. (You could also make all the stores of the data
be release and then drop the release fence, but that would be unnecessarily
strong: we do not care if the two data stores get reordered with each
other. It would make a bigger difference if we had many more data elements. The same applies to the option of making the data loads acquire.)
To prove that this works, notice that the two results we must avoid are
1234 0
and0 4321
. So suppose that the load ofdata1
returns1234
(the case whendata2
returns4321
is equivalent). Since this is the value that was stored by t2, the release fence (RF) synchronizes with the acquire fence (AF). Now S1 is sequenced before RF, and AF is sequenced before L2, so we conclude that S1 happens before L2. Thus L2 cannot return 0; it either returnslocked_bit
or1
.If L1 returns a different value from L2 then we will not print. If L1 returns
locked_bit
we also will not print. The one remaining case is that L1 and L2 both return 1. In this case, L1 has read the value written by S2, and L1/S2 are acquire/release, so S2 synchronizes with L1. The data stores are sequenced before S2, and L1 is sequenced before the data loads, so both data stores happen before both data loads. Therefore we see both the new values, and print1234 4321
.No, that will not work at all. Nothing in this algorithm prevents the
stores and loads of
data1, data2
from conflicting; it's just thatwhen they do conflict, our tests of the versioned lock tell us not to
use the data. But if they were not atomic, the conflicting stores and
loads would be a data race, causing undefined behavior whether we use
the data or not.