Can I use Scala's parallel collections when I want to invoke multiple expensive operations on the same input and then collect the results?
I found a similar question but it has what seems to be a simpler case, where the expensive operation is always the same. In my case, I want to collect a set of results of some expensive API calls that I'd like to execute in parallel.
Say I have:
def apiRequest1(q: Query): Option[Result]
def apiRequest2(q: Query): Option[Result]
where q is the same value.
I'd like a List[Result] or similar (obviously List[Option[Result]] is fine) and I'd like the two expensive operations to happen in parallel.
Naturally a simple List constructor doesn't execute in parallel:
List(apiRequest1(q), apiRequest2(q))
Can the parallel collections help? Or should I be looking to futures and the like instead? The only approach I can think of using parallel collections seems hacky:
List(q, q).par.zipWithIndex.flatMap((q) =>
if (q._2 % 2 == 0) apiRequest1(q._1) else apiRequest2(q._1)
)
Actually, all things being equal, maybe that isn't so bad...
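For comparison, here is a minimal sketch of the futures route the question mentions, using only the standard library. Query, Result, the bodies of apiRequest1 and apiRequest2, the ParallelCalls wrapper, and the fetchAll helper are all hypothetical stand-ins, since the question gives only the signatures. Whether the two calls really overlap depends on the execution context and the number of available cores.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelCalls {
  // Hypothetical stand-ins for the types in the question.
  final case class Query(text: String)
  final case class Result(value: String)

  // Hypothetical bodies: the sleeps simulate slow remote calls.
  def apiRequest1(q: Query): Option[Result] = {
    Thread.sleep(200)
    Some(Result(s"api1:${q.text}"))
  }

  def apiRequest2(q: Query): Option[Result] = {
    Thread.sleep(200)
    None // simulate a call that finds nothing
  }

  def fetchAll(q: Query): List[Result] = {
    // Create both futures before waiting on either, so the calls can overlap.
    val f1 = Future(apiRequest1(q))
    val f2 = Future(apiRequest2(q))
    val combined: Future[List[Option[Result]]] = Future.sequence(List(f1, f2))
    // Blocking with Await keeps the example small; flatten drops the None results.
    Await.result(combined, 5.seconds).flatten
  }
}
```

Calling ParallelCalls.fetchAll(ParallelCalls.Query("scala")) yields a List[Result]; dropping the final flatten would give the List[Option[Result]] form instead.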
Comments (3)
Why don’t you write
Quick and dirty solution:
I'm not sure it would actually run in parallel if you have only two calls, or a small number of them. There is a threshold for parallelization, and with so small a collection it would probably run sequentially, on the grounds that the parallelization overhead is not worth it. (The library can't know that for sure, since it depends on the operation you want to run, but having a threshold on collection operations is reasonable.)
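One way to settle this doubt is to measure it. The sketch below puts the two functions themselves into a parallel collection, which avoids the index trick from the question, and times the whole call. It assumes Scala 2.13 or later with the separate scala-parallel-collections module on the classpath; on Scala 2.12 and earlier .par is built in and the import should be dropped. Query, Result, the request bodies, the ParCheck wrapper, and runBoth are hypothetical stand-ins.

```scala
// Assumption: Scala 2.13+ with the scala-parallel-collections module available.
import scala.collection.parallel.CollectionConverters._

object ParCheck {
  // Hypothetical stand-ins for the types in the question.
  final case class Query(text: String)
  final case class Result(value: String)

  // Hypothetical bodies: each call sleeps 300 ms to simulate a slow API.
  def apiRequest1(q: Query): Option[Result] = {
    Thread.sleep(300)
    Some(Result(s"r1:${q.text}"))
  }

  def apiRequest2(q: Query): Option[Result] = {
    Thread.sleep(300)
    Some(Result(s"r2:${q.text}"))
  }

  // Returns the results in call order together with the elapsed wall-clock time in ms.
  def runBoth(q: Query): (List[Option[Result]], Long) = {
    val calls = List[Query => Option[Result]](apiRequest1(_), apiRequest2(_))
    val start = System.nanoTime()
    val results = calls.par.map(call => call(q)).toList
    val elapsedMs = (System.nanoTime() - start) / 1000000
    (results, elapsedMs)
  }
}
```

If the elapsed time is close to 300 ms, the two calls overlapped; if it is close to 600 ms, they ran one after the other. The outcome depends on the machine and the task support in use, so it is worth checking on the target environment rather than assuming either way.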