如何在scala中优化这个短阶乘函数? (创建 50000 个 BigInt)
我将 scala 版本
(BigInt(1) to BigInt(50000)).reduce(_ * _)
与 python 版本
reduce(lambda x,y: x*y, range(1,50000))
进行了比较,结果发现,scala 版本比 python 版本花费的时间大约是 python 版本的 10 倍。
我猜测,一个很大的区别是 python 可以使用其原生的 long 类型,而不是为每个数字创建新的 BigInt 对象。但是scala有解决方法吗?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
事实上,您的 Scala 代码创建了 50,000 个
BigInt
对象,这一事实不太可能在这里产生太大影响。一个更大的问题是乘法算法——Python 的long
使用 Karatsuba 乘法 和Java 的BigInteger
(BigInt
只是包装)没有。最简单的解决方法可能是切换到更好的任意精度数学库,例如 JScience 的:
这比我的机器上的Python解决方案。
更新:我已经使用 一些快速基准测试代码。 google.com/p/caliper/" rel="nofollow noreferrer">Caliper 响应 Luigi Plingi 的回答,这在我的(四核)机器上给出了以下结果:
我看不出
reduce
和fold
他所做的,但寓意很明确:如果您可以使用 Scala 2.9 的并行集合,它们会给您带来巨大的改进,但切换到LargeInteger
会有所帮助以及。The fact that your Scala code creates 50,000
BigInt
objects is unlikely to be making much of a difference here. A bigger issue is the multiplication algorithm—Python'slong
uses Karatsuba multiplication and Java'sBigInteger
(whichBigInt
just wraps) doesn't.The easiest workaround is probably to switch to a better arbitrary precision math library, like JScience's:
This is faster than the Python solution on my machine.
Update: I've written some quick benchmarking code using Caliper in response to Luigi Plingi's answer, which gives the following results on my (quad core) machine:
I don't see the difference between
reduce
andfold
that he does, but the moral is clear: if you can use Scala 2.9's parallel collections, they'll give you a huge improvement, but switching toLargeInteger
helps as well.我的机器上的 Python:
给出了 1219 毫秒
Scala:
这 689 毫秒和 361 毫秒是在几次热身运行之后。它们都在大约 1000 毫秒开始,但似乎预热的时间不同。并行集合似乎比非并行集合预热得更多:非并行操作与第一次运行相比并没有显着减少。
.par
(意思是,使用并行集合)似乎比reduce
更快地加快了fold
的速度。我只有 2 个核心,但更多数量的核心应该会带来更大的性能提升。因此,通过实验,优化此函数的方法是
a) 使用
fold
而不是reduce
b) 使用并行集合
update:
受到将计算分解为更小的块可以加快速度这一观察结果的启发,我设法让他在我的机器上以
215 ms
运行,这比标准并行算法提高了 40%。 (使用 BigInt,需要 615 毫秒。)此外,它不使用并行集合,但以某种方式使用 90% CPU(与 BigInt 不同)。Python on my machine:
gives
1219 ms
Scala:
This 689 ms and 361 ms were after a few warmup runs. They both started at about 1000 ms, but seem to warm up by different amounts. The parallel collections seem to warm up significantly more than the non-parallel: the non-parallel operations did not reduce significantly from their first runs.
The
.par
(meaning, use parallel collections) seemed to speed upfold
more thanreduce
. I only have 2 cores, but a greater number of cores should see a bigger performance gain.So, experimentally, the way to optimize this function is
a) Use
fold
rather thanreduce
b) Use parallel collections
update:
Inspired by the observation that breaking the calculation down into smaller chunks speeds things up, I managed to get he following to run in
215 ms
on my machine, which is a 40% improvement on the standard parallelized algorithm. (Using BigInt, it takes 615 ms.) Also, it doesn't use parallel collections, but somehow uses 90% CPU (unlike for BigInt).这里的另一个技巧可能是尝试
reduceLeft
和reduceRight
来看看哪个最快。在您的示例中,我的reduceRight
执行速度要快得多:foldLeft
和foldRight
之间的差异相同。猜猜从树的哪一侧开始减少很重要:)Another trick here could be to try both
reduceLeft
andreduceRight
to see what is fastest. On your example I get a much faster execution ofreduceRight
:Same difference between
foldLeft
andfoldRight
. Guess it matters what side of the tree you start reducing from :)在 Scala 中计算阶乘的最有效方法是使用分而治之的策略:
此外,为了获得更快的速度,请使用最新版本的 JDK 和以下 JVM 选项:
下面是 Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz 的结果(最大 3.50GHz)、RAM 12Gb DDR3-1333、Windows 7 sp1、Oracle JDK 1.8.0_25-b18 64 位:
顺便说一句,要提高大于 20000 的数字的阶乘计算效率,请使用以下实现 Schönhage-Strassen 算法或等待直到它被合并到 JDK 9 并且 Scala 将能够使用它
Most efficient way to calculate factorial in Scala is using of divide and conquer strategy:
Also to get more speed use latest version of JDK and following JVM options:
Bellow are results for Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz (max 3.50GHz), RAM 12Gb DDR3-1333, Windows 7 sp1, Oracle JDK 1.8.0_25-b18 64-bit:
BTW to improve efficience of factorial calculation for numbers that greater than 20000 use following implementation of Schönhage-Strassen algorithm or wait until it will be merged to JDK 9 and Scala will be able to use it