Atomic swap function using gcc atomic builtins

Posted on 2024-12-28 16:11:38

Is this a correct implementation for a generic atomic swap function? I'm looking for a C++03-compatible solution on GCC.

template<typename T>
void atomic_swap(T & a, T & b) {
    static_assert(sizeof(T) <= sizeof(void*), "Maximum size type exceeded.");
    T * ptr = &a;
    b =__sync_lock_test_and_set(ptr, b);
    __sync_lock_release(&ptr);
}

If not, what should I do to fix it?

Also: is the __sync_lock_release always necessary? When searching through other codebases I found that this is often not called. Without the release call my code looks like this:

template<typename T>
void atomic_swap(T & a, T & b) {
    static_assert(sizeof(T) <= sizeof(void*), "Maximum size type exceeded.");
    b = __sync_lock_test_and_set(&a, b);
}

PS: Atomic swap in GNU C++ is a similar question but it doesn't answer my question because the provided answer requires C++11's std::atomic and it has signature Data *swap_data(Data *new_data) which doesn't seem to make sense at all for a swap function. (It actually swaps the provided argument with a global variable that was defined before the function.)

玻璃人 2025-01-04 16:11:38

Keep in mind that this version of swap is not a fully atomic operation. While the value of b is copied into a atomically, the write of a's old value back into b can overwrite a concurrent modification of b made by another thread. In other words, the assignment to b is not atomic with respect to other threads. Suppose a == 1 and b == 2. After the gcc built-in, a == 2 and the value 1 has been returned; but if another thread has meanwhile changed b to 3, you then overwrite that 3 with 1. So while you have "technically" swapped the values, you did not do it atomically: another thread touched b between the return from the gcc atomic built-in and the assignment of that return value to b. From the assembly standpoint, you have something like the following:

lea RAX, qword ptr [RDI]  // T * ptr = &a;
mov RCX, qword ptr [RSI]  // copy out the value referenced by b into a register
xchg [RAX], RCX           // __sync_lock_test_and_set(&a, b)
mov qword ptr [RSI], RCX  // place the exchange value back into b (not atomic!!)
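
The interleaving described above can be replayed deterministically in a single thread. The sketch below is only an illustration: `racy_swap_with_interference`, `lost_update_demo` and the `interfering_value` parameter are hypothetical names that do not appear in the question, and the write standing in for "another thread" is placed by hand at the point where the race window opens.

```cpp
// Single-threaded replay of the lost-update interleaving.
// All names here are illustrative; they are not from the original post.

// Performs the two steps of the non-atomic swap and injects a simulated
// write to b by "another thread" between them.
void racy_swap_with_interference(long &a, long &b, long interfering_value) {
    // Step 1: atomic exchange on a only; returns the previous value of a.
    long old_a = __sync_lock_test_and_set(&a, b);

    // Simulated write from another thread, landing inside the race window.
    b = interfering_value;

    // Step 2: plain store to b; it silently overwrites the write above.
    b = old_a;
}

// Returns true if the interfering write was lost, as the answer predicts.
bool lost_update_demo() {
    long a = 1;
    long b = 2;
    racy_swap_with_interference(a, b, 3);
    return a == 2 && b == 1;  // the value 3 does not survive in b
}
```

In a real program the interfering store would come from a second thread and would only occasionally land inside the window, which is what makes this kind of bug hard to reproduce.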

To be honest, you cannot do a lock-free atomic swap of two separate memory locations without a hardware operation such as DCAS or a weak load-linked/store-conditional, or possibly some other method such as transactional memory (which itself tends to use fine-grained locking).
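
If giving up lock-freedom is acceptable, one possible workaround is to guard the whole swap with a lock. The following is a minimal sketch, assuming a single global spinlock built from the same gcc builtins; `g_swap_lock` and `locked_swap` are hypothetical names. The swap is atomic only with respect to threads that take this same lock for every access to a and b.

```cpp
// Hypothetical lock-based swap; names are illustrative only.

// One global spinlock guarding every swap (0 = free, 1 = held).
static volatile int g_swap_lock = 0;

template <typename T>
void locked_swap(T &a, T &b) {
    // Acquire: spin until the previous lock value was 0.
    // __sync_lock_test_and_set acts as an acquire barrier.
    while (__sync_lock_test_and_set(&g_swap_lock, 1)) {
        // busy-wait while another thread holds the lock
    }

    // Critical section: an ordinary three-step swap.
    T tmp = a;
    a = b;
    b = tmp;

    // Release: stores 0 to the lock with release semantics.
    __sync_lock_release(&g_swap_lock);
}
```

This pairing of `__sync_lock_test_and_set` with `__sync_lock_release` on a dedicated lock word is the use case the two builtins were designed for, which may explain why the release call is absent from code that uses the first builtin purely as an exchange.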

Secondly, as your function is currently written, if you want the atomic operation to have both acquire and release semantics, then yes, you will have to either add the __sync_lock_release call or add a full memory barrier through __sync_synchronize. Otherwise it will only have the acquire semantics of __sync_lock_test_and_set. Even so, it still does not atomically swap two separate memory locations with each other ...
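
A sketch of the full-barrier variant, restricted to the part that really is atomic: exchanging a single location and returning its old value. `atomic_exchange_full` and `fits_in_pointer` are hypothetical names, and placing the barrier before the exchange is an assumption about where the stronger ordering is wanted. `__sync_lock_release` is not used on the target here because the GCC documentation describes it as normally writing the constant 0 to `*ptr`, which would clobber the stored value. The size check uses a negative-array-size trick since `static_assert` is not available in C++03.

```cpp
// Hypothetical helper; the names below are illustrative only.

// C++03-style compile-time size check: the array size becomes -1, and
// compilation fails, when T is larger than a pointer.
template <typename T>
struct fits_in_pointer {
    typedef char check[sizeof(T) <= sizeof(void *) ? 1 : -1];
};

// Atomically stores new_value in *target and returns the previous value.
template <typename T>
T atomic_exchange_full(T *target, T new_value) {
    typedef typename fits_in_pointer<T>::check size_check;
    (void)sizeof(size_check);  // forces the check, avoids an unused warning

    // Full barrier: earlier loads and stores complete before the exchange.
    __sync_synchronize();

    // Atomic exchange; by itself this is only an acquire barrier.
    return __sync_lock_test_and_set(target, new_value);
}
```

A caller would write something like `long old = atomic_exchange_full(&shared, 2L);` and keep the returned value in a thread-local variable, rather than storing it back into a second shared location.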
