Fetch-and-add using OpenMP atomic operations

Posted on 2024-09-29 13:45:06

I’m using OpenMP and need to use the fetch-and-add operation. However, OpenMP doesn’t provide an appropriate directive/call. I’d like to preserve maximum portability, hence I don’t want to rely on compiler intrinsics.

Rather, I’m searching for a way to harness OpenMP’s atomic operations to implement this but I’ve hit a dead end. Can this even be done? N.B., the following code almost does what I want:

#pragma omp atomic
x += a;

Almost – but not quite, since I really need the old value of x. fetch_and_add should be defined to produce the same result as the following (only non-locking):

template <typename T>
T fetch_and_add(volatile T& value, T increment) {
    T old;
    #pragma omp critical
    {
        old = value;
        value += increment;
    }
    return old;
}

(An equivalent question could be asked for compare-and-swap but one can be implemented in terms of the other, if I’m not mistaken.)
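One direction of that relationship can be sketched as follows: fetch-and-add built from a compare-and-swap retry loop. This is only an illustration and rests on assumptions that are not in the question: it uses C++11 `std::atomic` rather than any OpenMP construct, mixing it with OpenMP threads is assumed to work on the target compiler, and the name `fetch_and_add_cas` is hypothetical.

```cpp
#include <atomic>

// Hypothetical helper: fetch-and-add expressed as a compare-and-swap loop.
// Assumes C++11 std::atomic is available; this is not an OpenMP facility.
template <typename T>
T fetch_and_add_cas(std::atomic<T>& value, T increment) {
    T old = value.load();
    // compare_exchange_weak stores old + increment only if value still equals old.
    // On failure it writes the current value into old, and the loop retries.
    while (!value.compare_exchange_weak(old, old + increment)) {
    }
    return old;  // the value observed immediately before the successful update
}
```

For integral types `std::atomic` already offers `fetch_add`, so the loop is only needed when nothing but compare-and-swap is available.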

甜味拾荒者 2024-10-06 13:45:06

As of OpenMP 3.1 there is support for capturing atomic updates: you can capture either the old value or the new value. Since we have to bring the value in from memory to increment it anyway, it only makes sense that we should be able to access it from, say, a CPU register and put it into a thread-private variable.
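The two capture variants can be sketched like this, assuming a compiler that implements OpenMP 3.1. The function names `add_capture_old` and `add_capture_new` are hypothetical; only the order of the two statements inside the block decides which value is captured.

```cpp
// Hypothetical helper: returns the value *x held before the update.
inline int add_capture_old(int* x, int a) {
    int v;
    #pragma omp atomic capture
    { v = *x; *x += a; }   // read first, then update
    return v;
}

// Hypothetical helper: returns the value *x holds after the update.
inline int add_capture_new(int* x, int a) {
    int v;
    #pragma omp atomic capture
    { *x += a; v = *x; }   // update first, then read
    return v;
}
```

Compiled without OpenMP support the pragmas are ignored, and both functions still return the same values in single-threaded code.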

There's a nice workaround if you're using gcc (or g++); look up the atomic builtins:
http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html

I think Intel's C/C++ compiler also supports these, but I haven't tried it.

For now (until OpenMP 3.1 is implemented), I've used inline wrapper functions in C++ that let you choose which version to use at compile time:

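template <class T>
inline T my_fetch_add(T *ptr, T val) {
#if defined(GCC_EXTENSION)
  // GCC atomic builtin: adds val to *ptr and returns the previous value.
  return __sync_fetch_and_add(ptr, val);
#elif defined(OPENMP_3_1)
  // OpenMP 3.1 atomic capture: read the old value and update in one atomic step.
  T t;
  #pragma omp atomic capture
  { t = *ptr; *ptr += val; }
  return t;
#else
  #error "Define GCC_EXTENSION or OPENMP_3_1 to select an implementation"
#endif
}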
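A typical use of such a fetch-and-add is handing out unique indices from a shared counter. The sketch below assumes an OpenMP 3.1 compiler; `fetch_add_capture`, `claim_slots`, and `next_slot` are hypothetical names introduced for illustration.

```cpp
#include <vector>

// Hypothetical helper: atomically adds val to *ptr and returns the previous value.
inline int fetch_add_capture(int* ptr, int val) {
    int old;
    #pragma omp atomic capture
    { old = *ptr; *ptr += val; }
    return old;
}

// Every loop iteration claims one slot index from a shared counter.
// Returns, for each slot, how many times it was claimed.
inline std::vector<int> claim_slots(int n) {
    std::vector<int> claimed(n, 0);
    int next_slot = 0;  // shared between all threads
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        int slot = fetch_add_capture(&next_slot, 1);
        claimed[slot] += 1;  // each index is handed out once, so writes do not collide
    }
    return claimed;
}
```

If the capture is atomic, every entry of the returned vector is 1 regardless of the number of threads.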

Update: I just tried Intel's C++ compiler; it currently supports OpenMP 3.1 (atomic capture is implemented). Intel offers free use of its compilers on Linux for non-commercial purposes:

http://software.intel.com/en-us/articles/non-commercial-software-download/

GCC 4.7 will support OpenMP 3.1 when it is eventually released... hopefully soon :)


燕归巢 2024-10-06 13:45:06

If you want the old value of x and a has not changed, use (x - a) as the old value:

int fetch_and_add(int *x, int a) {
  #pragma omp atomic
  *x += a;

  /* This read is separate from the atomic update above. */
  return (*x - a);
}

UPDATE: this was not really an answer, because x can be modified by another thread after the atomic update.
So it seems to be impossible to build a universal fetch-and-add using OpenMP pragmas. By universal I mean an operation that can easily be used from anywhere in OpenMP code.
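The failure can be made concrete by replaying one bad interleaving by hand. The sketch below is single-threaded and deterministic: it simulates two threads, A and B, each running `*x += a; return *x - a;` with their own increment. The name `replay_bad_interleaving` and the chosen values are hypothetical.

```cpp
// Replays one interleaving of two threads that both execute
//   *x += inc;  return *x - inc;
// and returns the "old value" that thread A computes.
inline int replay_bad_interleaving() {
    int x = 0;          // shared variable, initially 0
    const int a = 1;    // thread A's increment
    const int b = 10;   // thread B's increment

    x += a;             // thread A: atomic update, x is now 1
    x += b;             // thread B: atomic update, x is now 11
    return x - a;       // thread A: separate read, computes 11 - 1
}
```

Thread A updated x when it held 0, yet it reports 10 as the old value, because B's update landed between A's atomic add and A's read.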

You can use the omp_*_lock functions to simulate atomics:

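#include <omp.h>

/* The lock must be initialized with omp_init_lock() before first use. */
typedef struct { omp_lock_t lock; int value; } atomic_simulated_t;

int fetch_and_add(atomic_simulated_t *x, int a)
{
  int ret;
  omp_set_lock(&x->lock);
  ret = x->value;   /* save the old value before updating */
  x->value += a;
  omp_unset_lock(&x->lock);
  return ret;
}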

This is ugly and slow (it performs two atomic operations instead of one). But if you want your code to be very portable, it will not be the fastest in all cases.

You say "as the following (only non-locking)". But what is the difference between "non-locking" operations (which use the CPU's LOCK prefix, LL/SC, etc.) and locking operations (which are themselves implemented with several atomic instructions, a busy loop for short waits on the lock, and OS sleeping for long waits)?
