How do I refactor this ForEach(..) code to use Parallel.ForEach(..)?

Posted on 2024-11-02 18:01:17

I've got a list of objects that I want to copy from one source to another. It was suggested that I could speed things up by using Parallel.ForEach.

How can I refactor the following pseudocode to use Parallel.ForEach(..)?

var foos = GetFoos().ToList();
foreach(var foo in foos)
{
    CopyObjectFromOldBucketToNewBucket(foo, oldBucket, newBucket, 
        accessKeyId, secretAccessKey);
}

CopyObjectFromOldBucketToNewBucket uses the Amazon REST APIs to move items from one bucket to another.

Cheers :)

Comments (2)

灯角 2024-11-09 18:01:17

Parallel is actually not the best option here. Parallel will run your code in parallel, but it will still tie up a thread pool thread for each request to AWS. It would be a far better use of resources to use the BeginCopyObject method instead. That way no thread pool thread sits blocked waiting on a response; a thread is only used once the response has been received and needs to be processed.

Here's a simplified example of how to use the Begin/End methods. They are not specific to AWS; it is a pattern found throughout the .NET BCL.

public static void CopyFoos()
{
    var client = new AmazonS3Client(...);
    var foos = GetFoos().ToList();
    var asyncs = new List<IAsyncResult>();
    foreach (var foo in foos)
    {
        var request = new CopyObjectRequest { ... };

        // Start the copy without blocking; EndCopy runs when the response arrives.
        asyncs.Add(client.BeginCopyObject(request, EndCopy, client));
    }

    // Block until every outstanding request has completed.
    foreach (IAsyncResult ar in asyncs)
    {
        if (!ar.IsCompleted)
        {
            ar.AsyncWaitHandle.WaitOne();
        }
    }
}

private static void EndCopy(IAsyncResult ar)
{
    // Always pair BeginCopyObject with EndCopyObject so resources are released.
    ((AmazonS3Client)ar.AsyncState).EndCopyObject(ar);
}

For production code you may want to keep track of how many requests you've dispatched and only send out a limited number at any one time. Testing or AWS docs may tell you how many concurrent requests are optimal.
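
As a rough sketch of that throttling idea (not AWS-specific, and not the only way to do it), a SemaphoreSlim can cap how many requests are in flight at once. The names ThrottledCopy, CopyAllAsync, copyAsync and maxConcurrent are made up for illustration; copyAsync stands in for whatever asynchronous copy call your SDK version offers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledCopy
{
    // Runs copyAsync once per key, with at most maxConcurrent calls in flight.
    // copyAsync is a hypothetical stand-in for the real copy operation.
    public static async Task CopyAllAsync(
        IEnumerable<string> keys,
        Func<string, Task> copyAsync,
        int maxConcurrent)
    {
        using var gate = new SemaphoreSlim(maxConcurrent);

        var tasks = keys.Select(async key =>
        {
            await gate.WaitAsync();      // wait for a free slot
            try
            {
                await copyAsync(key);    // perform one copy
            }
            finally
            {
                gate.Release();          // free the slot even if the copy failed
            }
        }).ToList();                     // ToList starts all the tasks

        await Task.WhenAll(tasks);       // completes once every copy has finished
    }
}
```

The right value for maxConcurrent is something you would have to find by measuring; the sketch only enforces whatever limit you pass in.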

In this case we don't actually need to do anything when the requests complete, so you may be tempted to skip the EndCopy calls, but that would cause a resource leak. Whenever you call BeginXxx, you must call the corresponding EndXxx method.
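
One way to make sure the EndXxx call is never forgotten is to let TaskFactory.FromAsync do the pairing for you. The sketch below uses Stream.BeginRead/EndRead purely as a stand-in Begin/End pair so that it runs without the AWS SDK; the class name BeginEndPairing is made up, and applying the same wrapper to BeginCopyObject/EndCopyObject is an assumption you would need to check against your SDK version.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class BeginEndPairing
{
    // Wraps a Begin/End pair in a Task. FromAsync calls EndRead for us
    // when the operation completes, so the pairing cannot be skipped.
    public static Task<int> ReadAsync(Stream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead,   // the BeginXxx method
            stream.EndRead,     // the matching EndXxx method
            buffer, 0, buffer.Length,
            null);              // no state object needed here
    }
}
```

The returned task can then be awaited or combined with Task.WhenAll instead of waiting on each AsyncWaitHandle by hand.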

蓝眸 2024-11-09 18:01:17

Since your code has no dependencies other than foos, you can simply do:

Parallel.ForEach(foos, foo =>
{
    CopyObjectFromOldBucketToNewBucket(foo, oldBucket, newBucket,
                                       accessKeyId, secretAccessKey);
});

Keep in mind, though, that I/O can only be parallelized to a certain degree; beyond that point, performance might actually degrade.
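
If you do go with Parallel.ForEach, ParallelOptions.MaxDegreeOfParallelism lets you put an upper bound on that parallelism. A minimal sketch, where BoundedParallelCopy, CopyAll, copy and maxDegree are illustrative names and copy stands in for CopyObjectFromOldBucketToNewBucket with its other arguments already bound:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BoundedParallelCopy
{
    // Runs copy for every item, using at most maxDegree concurrent workers.
    // copy is a hypothetical stand-in for the real bucket-to-bucket copy.
    public static void CopyAll(
        IEnumerable<string> foos,
        Action<string> copy,
        int maxDegree)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        // Blocks until all items are processed; failures in copy surface
        // as an AggregateException thrown from Parallel.ForEach.
        Parallel.ForEach(foos, options, foo => copy(foo));
    }
}
```

You would call it with something like CopyAll(foos, foo => CopyObjectFromOldBucketToNewBucket(foo, oldBucket, newBucket, accessKeyId, secretAccessKey), 8), where 8 is only a placeholder to tune by measurement.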
