Ordered PLINQ ForAll

Posted on 2024-10-24 00:52:34

The MSDN documentation about order preservation in PLINQ states the following about ForAll():

  • Result when the source sequence is ordered: Executes nondeterministically in parallel
  • Result when the source sequence is unordered: Executes nondeterministically in parallel

Does this mean that ordered execution of the ForAll method is never guaranteed?

I haven't used PLINQ before, but the following Code Review question seemed like an appropriate use for it. At the bottom of my answer I wrote:

Events.AsParallel().AsOrdered().ForAll( eventItem =>
{
    ...
} );    

After reading the documentation, I believe the AsOrdered() call wouldn't change anything?
I also suspect that the query above can't replace a simple for loop where order is important?
Parallel calls to the StringBuilder would probably also occur, resulting in wrong output?

Comments (6)

夜灵血窟げ 2024-10-31 00:52:34

Order preservation is usually only applied to results - i.e. the input can be processed in any order, but is returned in the original order.

As ForAll doesn't return anything, it doesn't really have any effect that I'm aware of.

The only way of making ordering apply to the processing would be to finish item 0 before processing item 1, before processing item 2 etc... at which point you've got no parallelism.
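
A minimal sketch of that distinction, assuming a hypothetical OrderDemo class that is not part of the question. With AsOrdered(), the results of Select come back in source order, whereas ForAll only runs the action on whichever thread picks up the item, so the order recorded in the queue may differ from run to run.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class OrderDemo
{
    // The items may be processed in any order, but the merged
    // result array follows the order of the source sequence.
    public static int[] SquaresInSourceOrder(int count)
    {
        return Enumerable.Range(0, count)
                         .AsParallel()
                         .AsOrdered()
                         .Select(x => x * x)
                         .ToArray();
    }

    // ForAll returns nothing, so there is no result to put in order.
    // The queue records the order in which the actions happened to run.
    public static int[] ExecutionOrder(int count)
    {
        var seen = new ConcurrentQueue<int>();
        Enumerable.Range(0, count)
                  .AsParallel()
                  .AsOrdered()
                  .ForAll(x => seen.Enqueue(x));
        return seen.ToArray();
    }
}
```

Calling OrderDemo.ExecutionOrder(1000) should always return all 1000 values, but nothing guarantees they appear as 0, 1, 2, and so on.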

夏末 2024-10-31 00:52:34

As others have rightly answered, the ForAll method is never guaranteed to execute an action for enumerable elements in any particular order, and it silently ignores the AsOrdered() method call.

For readers who have a valid reason to execute an action for enumerable elements in a way that stays as close to the original order as is reasonable in a parallel processing context, the extension methods below might help.

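public static void ForAllInApproximateOrder<TSource>(this ParallelQuery<TSource> source, Action<TSource> action) {

    // Partitioner.Create over an IEnumerable hands out items in chunks taken
    // from the front of the sequence, so processing roughly follows source order.
    Partitioner.Create( source )
               .AsParallel()
               .AsOrdered()
               .ForAll( e => action( e ) );

}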

This can then be used as follows:

orderedElements.AsParallel()
               .ForAllInApproximateOrder( e => DoSomething( e ) );

Note that the extension method above uses PLINQ ForAll rather than Parallel.ForEach, and so inherits the threading model used internally by PLINQ (which differs from the one used by Parallel.ForEach and is, in my experience, less aggressive by default). A similar extension method using Parallel.ForEach is below.

public static void ForEachInApproximateOrder<TSource>(this ParallelQuery<TSource> source, Action<TSource> action) {

    source = Partitioner.Create( source )
                        .AsParallel()
                        .AsOrdered();

    Parallel.ForEach( source , e => action( e ) );

}

This can then be used as follows:

orderedElements.AsParallel()
               .ForEachInApproximateOrder( e => DoSomething( e ) );

There is no need to chain AsOrdered() to your query when using either of the above extension methods; it gets called internally anyway.

I have found these methods useful for processing elements whose order has coarse-grained significance. It can be useful, for example, to process records starting at the oldest and working towards the newest. In many cases the exact order of records isn't required, so long as older records generally get processed before newer ones. Similarly, records with low/medium/high priority levels can be processed such that high priority records are processed before lower priority records in the majority of cases, with the edge cases not far behind.

寄与心 2024-10-31 00:52:34

AsOrdered() wouldn't change anything. If you want to enforce order on the result of a parallel query, you can simply use foreach(). ForAll() is there to take advantage of parallelism, which means executing the side effect on more than one item in the collection at a time. Ordering only applies to the results of a query (the order of items in the result collection), and that has nothing to do with ForAll(), since ForAll() does not affect the order at all.

"In PLINQ, the goal is to maximize performance while maintaining correctness. A query should run as fast as possible but still produce the correct results. In some cases, correctness requires the order of the source sequence to be preserved"

Note that ForAll() does not transform the collection (i.e. it does not project to a new collection); it exists purely to execute side effects on the results of a PLINQ query.

百思不得你姐 2024-10-31 00:52:34

Does this mean that ordered execution of the ForAll method is never guaranteed?

Yes, order is not guaranteed.

Parallelisation means that the work is allocated to different threads and their separate outputs are combined later.

If you need ordered output, then don't use PLINQ, or add a later step that puts the ordering back in.


Also, if you access objects like a StringBuilder from within the PLINQ execution, make sure those objects are thread-safe, and be aware that this thread safety may in fact make the PLINQ query slower than the non-parallel LINQ one.
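
One way to sidestep the StringBuilder problem, sketched here under the assumption that the per-item work can be expressed as a function returning a string: do only that work in parallel, and append on a single thread. ReportBuilder and its format parameter are hypothetical names, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class ReportBuilder
{
    public static string Build<T>(IEnumerable<T> items, Func<T, string> format)
    {
        // Only the formatting runs in parallel; AsOrdered() makes the
        // query yield its results in source order.
        var lines = items.AsParallel()
                         .AsOrdered()
                         .Select(format);

        // The StringBuilder is touched by the calling thread only,
        // so it needs no locking.
        var sb = new StringBuilder();
        foreach (var line in lines)
        {
            sb.AppendLine(line);
        }
        return sb.ToString();
    }
}
```

Whether this beats a plain loop depends on how expensive the per-item function is; for cheap formatting the parallel overhead may well dominate.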

夏の忆 2024-10-31 00:52:34

Now as an extension method:

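It processes the items on multiple cores and then orders the results, so there is the overhead of ordering. Here's an answer on benchmarking simple for vs parallel.

// Requires System, System.Collections.Concurrent, System.Collections.Generic,
// System.Linq and System.Threading.Tasks; declare it inside a static class.
public static IEnumerable<T1> OrderedParallel<T, T1>(this IEnumerable<T> list, Func<T, T1> action)
{
    // Tag each result with its source index; the bag itself is unordered.
    var unorderedResult = new ConcurrentBag<(long, T1)>();
    Parallel.ForEach(list, (o, state, i) =>
    {
        unorderedResult.Add((i, action.Invoke(o)));
    });
    // Restore the source order by sorting on the recorded index.
    var ordered = unorderedResult.OrderBy(o => o.Item1);
    return ordered.Select(o => o.Item2);
}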

Use it like this:

var result = Events.OrderedParallel(eventItem => ...);

Hope this will save you some time.
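
If the source is already indexable, the sort can be avoided. The sketch below is a variation on the method above, not the answer's own code: it assumes the items are available as an IReadOnlyList, and lets each iteration write its result into the array slot matching its source index. ParallelMap and MapInOrder are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class ParallelMap
{
    public static TResult[] MapInOrder<T, TResult>(IReadOnlyList<T> items, Func<T, TResult> func)
    {
        var results = new TResult[items.Count];

        // Each iteration writes to its own array element, so no two
        // threads ever write to the same slot and no sort is needed.
        Parallel.For(0, items.Count, i =>
        {
            results[i] = func(items[i]);
        });

        return results;
    }
}
```

The calls to func still run in no particular order; only the position of each result in the array is fixed.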

执笔绘流年 2024-10-31 00:52:34

ForAll runs the action on multiple threads, in parallel. At any given moment multiple actions will be running concurrently, and in these circumstances the notion of "order" does not apply. To run the actions in order you must run them sequentially, and the simplest way to do that is to run them on a single thread. This can be achieved by just enumerating the result of the query in a standard foreach loop:

var query = Events.AsParallel().AsOrdered();
foreach (var eventItem in query)
{
    // do something with the eventItem
}

If you prefer the fluent ForAll syntax, you can add a static class to your project containing the ForEach extension method below:

public static void ForEach<TSource>(this IEnumerable<TSource> source,
    Action<TSource> action)
{
    foreach (TSource item in source)
    {
        action(item);
    }
}

And use it like this:

Events.AsParallel().AsOrdered().ForEach(eventItem =>
{
    // do something with the eventItem
});

It should be noted, though, that in the given example the use of Parallel LINQ is redundant. The query Events.AsParallel().AsOrdered() performs no transformation on the source enumerable, so no actual computation takes place. You could remove the .AsParallel().AsOrdered() part and get the same outcome.
