使用 Parallel.ForEach 在最小值中选择最小值

发布于 2024-09-10 20:51:22 字数 1272 浏览 7 评论 0原文

一般来说，我对 C#、Parallel.ForEach 和 .NET 很陌生。我想要并行化涉及数千个位置的搜索。对于每个位置，我计算大圆距离。这是我想扩展到不同核心的计算。我的问题是，如果我只有一个线程局部变量，如 MSDN TPL 示例？对于结果，我查看了 Interlocked，并看到了它的选项 Add、CompareExchange、Decrement、交换、递增和读取，但我不仅仅是添加、递增、递减或测试是否相等。我想通过并行运行的多个线程返回具有最短总体距离的对象。我的直觉告诉我这应该很容易，我应该能够创建一些包含 Location 和距离的小对象，但是我如何从每个线程中捕获最佳答案，然后 /em> 选择其中最短的距离？这是非并行版本：

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
  double closest = double.MaxValue;
  Location closestLoc = null;
  foreach (Location aLoc in allLocations)
  {
    if (aLoc != myLocation)
    {
      double d = greatCircle(myLocation, aLoc);
      if (d < closest)
      {
        closest = d;
        closestLoc = aLoc;
      }
    }
  }
  return closestLoc;
}

我确实看到了 DDJ 博客文章这似乎提供了很好的建议，但我想知道这是否是最好的建议。我看到作者循环遍历数组，想知道是否没有更实用的方法来执行此操作。在函数世界中，我会使用 map、lambda 和 min。

原文

I am new to C#, Parallel.ForEach, and .NET in general. I want to parallelize a search that involves thousands of locations. For each location, I compute great circle distance. That is a calculation I want to spread to different cores. My question is how do I do this if I only have one thread-local variable, as in this MSDN TPL example? For the result, I looked at Interlocked, and saw its options Add, CompareExchange, Decrement, Exchange, Increment and Read, but I'm not just adding, incrementing, decrementing, or testing for equality. I want to return the object, over several threads running in parallel, that has the shortest overall distance. My gut says this should be easy, that I should be able to create some little object that wraps a Location and a distance, but how do I capture the best answer from each thread and then choose the shortest distance among them? Here is the non-parallel version:

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
  double closest = double.MaxValue;
  Location closestLoc = null;
  foreach (Location aLoc in allLocations)
  {
    if (aLoc != myLocation)
    {
      double d = greatCircle(myLocation, aLoc);
      if (d < closest)
      {
        closest = d;
        closestLoc = aLoc;
      }
    }
  }
  return closestLoc;
}

I did see a DDJ Blog Post that seemed to offer good advice, but I wondered if it was the best advice. I see the author looping over arrays, and wonder if there isn't a more functional way of doing this. In the functional world I would use map, lambda and min.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

十二 2024-09-17 20:51:22

这里最简单的选择是切换到 PLINQ：

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
     return allLocations
               .AsParallel()
               .Min(location => greatCircle(myLocation, location));
}

话虽这么说，这基本上只是具有并行结构的聚合。如果您想坚持使用 Parallel 类，您有几个选择。一种选择是使用锁定在块内自行同步。我不推荐这样做，因为它会损害你的整体表现。

更好的选择是使用 Parallel.ForEach 方法，它提供了当地州。他们将允许您将其重写为：

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
  double closest = double.MaxValue;
  Location closestLoc = null;
  object sync = new object();

  Parallel.ForEach<Location, Tuple<double,Location>(
      allLocations,
      () => new Tuple(double.MaxValue, null),
      (location, loopState, localState) =>
      {
          double d = greatCircle(myLocation, aLoc);
          if (d < localState.Item1)
              return new Tuple(d, aLoc);
          else
              return localState;
      },
      localState =>
      {
          lock(sync)
          {
              if (localState.Item1 < closest)
              {
                  closest = localState.Item1;
                  closestLoc = localState.Item2;
              }
          }
      }
  );
  return closestLoc;
}

我使用我的博客上详细介绍了聚合的本地状态。这基本上将操作更改为每个线程一个锁定操作，而不是每个处理元素一个锁定，因此您获得的吞吐量比简单的锁定解决方案高得多。

The easiest option here would be to switch to PLINQ:

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
     return allLocations
               .AsParallel()
               .Min(location => greatCircle(myLocation, location));
}

That being said, this is basically just aggregation with parallel constructs. You have a couple of options if you want to stick to the Parallel class. One option would be to synchronize this yourself within the block, using locking. I wouldn't recommend this, as it will hurt your overall performance.

The better option is to use the Parallel.ForEach methods which provide for local state. They would allow you to rewrite this as:

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
  double closest = double.MaxValue;
  Location closestLoc = null;
  object sync = new object();

  Parallel.ForEach<Location, Tuple<double,Location>(
      allLocations,
      () => new Tuple(double.MaxValue, null),
      (location, loopState, localState) =>
      {
          double d = greatCircle(myLocation, aLoc);
          if (d < localState.Item1)
              return new Tuple(d, aLoc);
          else
              return localState;
      },
      localState =>
      {
          lock(sync)
          {
              if (localState.Item1 < closest)
              {
                  closest = localState.Item1;
                  closestLoc = localState.Item2;
              }
          }
      }
  );
  return closestLoc;
}

I cover using local state for aggregations in detail on my blog. This basically changes the operation to one lock operation per thread instead of one lock per processing element, so you get much higher throughput than a naive locking solution.

回复收藏 0 原文

~没有更多了~