普通的联合/查找算法在不需要任何额外工作的情况下线程安全吗？

发布于 2024-07-17 05:52:37 字数 1158 浏览 6 评论 0原文

标准的 union/find 或 Disjoint-set 数据结构具有非常好的运行时间（有效地对于单线程情况，O(1))。然而，它在多线程情况下的有效性/性能如何？我认为即使没有锁定或除了原子指针大小的写入之外的任何原子操作，它也是完全有效的。

有谁发现以下逻辑有问题吗？

首先，我假设指针大小的写入是原子的。由此看来，不难看出您可以在多个线程中安全地运行 find 函数，因为唯一会发生的更新都是设置为相同的值。如果您允许 find 函数返回调用时（而不是返回时）正确的答案，那么不难认为许多 find 和 a单个union可以同时运行； find 的参数不会改变，union 仅更新根，而 find 永远不会更新根。
至于剩下的情况（几个union），我认为这也有效，但我不太确定。

顺便说一句：我不要求解决方案与单线程版本一样高效。（为了避免锁/原子，我也愿意放弃全局一致的状态。）

编辑：再看一下，多联合的情况不起作用，因为如果一侧不是新的根与其他东西联合（也不是作为根），然后你可以将它从第二个联合的另一侧切掉。

A = find(a)  // union 1
B = find(b)  // union 1
----
X = find(x)  //   union 2 (x == a) -> (X == A) -> (X.par aliases A.par)
Y = find(y)  //   union 2
X.par = Y    //   union 2
----
A.par = B    // union 1

这可以通过 CAS 来回避：

while(!CAS(A == A.par, B)) A = find(A);

原文

The standard union/find or Disjoint-set data structure has very good running times (effectively O(1)) for single threaded cases. However what is it's validity/performance under multi-threaded cases? I think that it is totally valid even without locking or any atomic operations aside from atomic pointer sized writes.

Does anyone see any problems with the following logic?

First I will assume that pointer sized writes are atomic. From that, it's not hard to argue that you can safely run the find function in several threads as the only updates that will occur are all sets to the same value. If you allow the find function to return the answer that was true when it was called (as opposed to when it returned) it's not hard to argue that many finds and a single union can be run at the same time; the argument for the finds doesn't change and the union only updates roots and the finds never update roots.
As for the remaining case (several unions), I think that works as well but I'm not so sure of it.

BTW: I'm not requiring that the solution be just as efficient as the single threaded version. (In order to avoid locks/atomic, I'm also willing to discard globally coherent state.)

edit: taking another look, the many-union cases doesn't work because if the side that is not the new root is unioned with something else (also not as a root) then you could cut it off of other side of the second union.

A = find(a)  // union 1
B = find(b)  // union 1
----
X = find(x)  //   union 2 (x == a) -> (X == A) -> (X.par aliases A.par)
Y = find(y)  //   union 2
X.par = Y    //   union 2
----
A.par = B    // union 1

this can be sidestepped with a CAS:

while(!CAS(A == A.par, B)) A = find(A);

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

洋洋洒洒 2024-07-24 05:52:38

有点晚了，但供以后参考。通过原子操作，您可以构建不相交的集合数据结构。不变的是每个集合都由其最小成员表示，这可以避免由于竞争条件而导致的循环。

// atomic_compare_and_exchange(unsigned int *data, unsigned int new_value, unsigned int comparator)


// "The result used to be the root of v once"
static unsigned int GetSet(volatile unsigned int *H_RESTRICT set, const unsigned int v)
{
  unsigned int next;
  unsigned int root = v;
  unsigned int prev = v;

  while (root != (next = set[root]))
  {
    /* Update set[prev] to point to next instead of root.
      * If it was updated by some other thread in the meantime, or if this
      * is the first step (where set[prev]==next, not root), ignore. */
    atomic_compare_and_exchange(set + prev, next, root);

    prev = root;
    root = next;
  }

  /* Update the path to the root */

  return root;
}

// FALSE == "They used not to be in the same set"
// TRUE == "They are in the same set, and will be forever"
static HBOOL IsInSameSet(volatile unsigned int *H_RESTRICT set, unsigned int v1, unsigned int v2)
{
  do
  {
    v1 = GetSet(set, v1);
    v2 = GetSet(set, v2);
  } while (set[v1] != v1 || set[v2] != v2);

  return v1 == v2;
}

static void Union(volatile unsigned int *H_RESTRICT set, unsigned int v1, unsigned int v2)
{
  unsigned int result;

  do
  {
    v1 = GetSet(set, v1);
    v2 = GetSet(set, v2);
    if (v1 == v2)
    {
      /* Already same set. This cannot be changed by a parallel operation. */
      break;
    }
    if (v1 < v2)
    {
      /* Make sure we connect the larger to the smaller set. This also avoids
       * cyclic reconnections in case of race conditions. */
      unsigned int tmp = v1;
      v1 = v2;
      v2 = tmp;
    }

    /* Make v1 point to v2 */
    result = atomic_compare_and_exchange(set + v1, v2, v1);
  } while (result != v1);
}

A little late but for future reference. With atomic operations, you can build a disjoint set data structure. The invariant is that each set is represented by its smallest member, which allows avoiding loops due to race conditions.

// atomic_compare_and_exchange(unsigned int *data, unsigned int new_value, unsigned int comparator)


// "The result used to be the root of v once"
static unsigned int GetSet(volatile unsigned int *H_RESTRICT set, const unsigned int v)
{
  unsigned int next;
  unsigned int root = v;
  unsigned int prev = v;

  while (root != (next = set[root]))
  {
    /* Update set[prev] to point to next instead of root.
      * If it was updated by some other thread in the meantime, or if this
      * is the first step (where set[prev]==next, not root), ignore. */
    atomic_compare_and_exchange(set + prev, next, root);

    prev = root;
    root = next;
  }

  /* Update the path to the root */

  return root;
}

// FALSE == "They used not to be in the same set"
// TRUE == "They are in the same set, and will be forever"
static HBOOL IsInSameSet(volatile unsigned int *H_RESTRICT set, unsigned int v1, unsigned int v2)
{
  do
  {
    v1 = GetSet(set, v1);
    v2 = GetSet(set, v2);
  } while (set[v1] != v1 || set[v2] != v2);

  return v1 == v2;
}

static void Union(volatile unsigned int *H_RESTRICT set, unsigned int v1, unsigned int v2)
{
  unsigned int result;

  do
  {
    v1 = GetSet(set, v1);
    v2 = GetSet(set, v2);
    if (v1 == v2)
    {
      /* Already same set. This cannot be changed by a parallel operation. */
      break;
    }
    if (v1 < v2)
    {
      /* Make sure we connect the larger to the smaller set. This also avoids
       * cyclic reconnections in case of race conditions. */
      unsigned int tmp = v1;
      v1 = v2;
      v2 = tmp;
    }

    /* Make v1 point to v2 */
    result = atomic_compare_and_exchange(set + v1, v2, v1);
  } while (result != v1);
}

回复收藏 0 原文