Can Parallel.ForEach be used safely with CloudTableQuery?

Posted 2024-09-18 08:04:33

I have a reasonable number of records in an Azure Table that I'm attempting to run a one-time data encryption on. I thought I could speed things up by using Parallel.ForEach. Also, because there are more than 1K records and I don't want to mess around with continuation tokens myself, I'm using a CloudTableQuery to get my enumerator.

My problem is that some of my records have been double encrypted and I realised that I'm not sure how thread safe the enumerator returned by CloudTableQuery.Execute() is. Has anyone else out there had any experience with this combination?

Comments (2)

メ斷腸人バ 2024-09-25 08:04:33

I would be willing to bet that it is highly unlikely Execute returns a thread-safe IEnumerator implementation. That said, this sounds like yet another case for the producer-consumer pattern.

In your specific scenario I would have the original thread that called Execute read the results off sequentially and stuff them into a BlockingCollection<T>. Before you start doing that, though, you want to start a separate Task that will control the consumption of those items using Parallel.ForEach. You will probably also want to look into using the GetConsumingPartitioner method of the ParallelExtensions library in order to be most efficient, since the default partitioner will create more overhead than you want in this case. You can read more about this from this blog post.

An added bonus of using BlockingCollection<T> over a raw ConcurrentQueue<T> is that it offers the ability to set bounds, which can help block the producer from adding more items to the collection than the consumers can keep up with. You will of course need to do some performance testing to find the sweet spot for your application.
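
The arrangement described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the source is a plain IEnumerable<string> standing in for the results of Execute, Encrypt is a hypothetical placeholder for the real per-record work, and the capacity of 100 is an arbitrary starting point to tune. Instead of GetConsumingPartitioner, the sketch swaps in the built-in Partitioner.Create overload with EnumerablePartitionerOptions.NoBuffering (available from .NET 4.5), which serves a similar purpose.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ProducerConsumerSketch
{
    // Hypothetical stand-in for the real per-record encryption step.
    static string Encrypt(string value) => "enc(" + value + ")";

    public static ConcurrentBag<string> Run(IEnumerable<string> source, int boundedCapacity)
    {
        var results = new ConcurrentBag<string>();

        // The bound makes Add block when the consumers fall behind.
        using (var queue = new BlockingCollection<string>(boundedCapacity))
        {
            // Start the consumer first, on its own task.
            Task consumer = Task.Run(() =>
            {
                // NoBuffering hands items out one at a time instead of in chunks.
                var partitioner = Partitioner.Create(
                    queue.GetConsumingEnumerable(),
                    EnumerablePartitionerOptions.NoBuffering);

                Parallel.ForEach(partitioner, item => results.Add(Encrypt(item)));
            });

            try
            {
                // Single producer: only this thread touches the source enumerator.
                foreach (var item in source)
                {
                    queue.Add(item);
                }
            }
            finally
            {
                // Tells the consumers that no more items are coming.
                queue.CompleteAdding();
            }

            consumer.Wait();
        }

        return results;
    }

    public static void Main()
    {
        // In-memory records standing in for the table query results (assumption).
        var records = Enumerable.Range(0, 5000).Select(i => "record-" + i);
        var encrypted = Run(records, boundedCapacity: 100);

        Console.WriteLine("processed: " + encrypted.Count);
        Console.WriteLine("distinct:  " + encrypted.Distinct().Count());
    }
}
```

Because the source enumerator is only ever read by the producing thread, its thread safety stops mattering; the parallel work happens entirely on items that have already been taken out of the queue.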

深海不蓝 2024-09-25 08:04:33

Despite my best efforts I've been unable to replicate my original problem. My conclusion is therefore that it is perfectly OK to use Parallel.ForEach loops with CloudTableQuery.Execute().
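
A check of this kind might look like the sketch below. It does not use CloudTableQuery at all: Records is a hypothetical in-memory iterator standing in for the query results, so the sketch says nothing about the Azure client itself. It only illustrates that Parallel.ForEach pulls items from an ordinary, non-thread-safe IEnumerable<T> through a partitioner that serializes access to the shared enumerator, so each item should be handed out once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ExactlyOnceCheck
{
    // Ordinary C# iterator; a stand-in for the query results (assumption).
    static IEnumerable<int> Records(int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return i;
        }
    }

    // Returns the number of records seen more than once plus the number never seen.
    public static int CountAnomalies(int count)
    {
        var seen = new ConcurrentDictionary<int, int>();

        // Record how many times each id is handed to the loop body.
        Parallel.ForEach(Records(count), id =>
            seen.AddOrUpdate(id, 1, (_, n) => n + 1));

        int duplicates = seen.Values.Count(n => n > 1);
        int missing = count - seen.Count;
        return duplicates + missing;
    }

    public static void Main()
    {
        Console.WriteLine("anomalies: " + CountAnomalies(100000));
    }
}
```

If a run like this reports no anomalies, a double encryption is more likely to come from somewhere else, such as the job being run twice over the same rows, than from the enumeration itself.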
