有效地计算组合和排列
我有一些代码来计算排列和组合,并且我正在努力使其更好地处理大量数字。
我找到了一种更好的排列算法,可以避免大量的中间结果,但我仍然认为我可以在组合方面做得更好。
到目前为止,我已经放置了一个特殊情况来反映 nCr 的对称性,但我仍然希望找到一种更好的算法来避免调用阶乘(r),这是一个不必要的大中间结果。如果没有这种优化,最后一个文档测试会花费太长时间来计算阶乘(99000)。
谁能建议一种更有效的方法来计算组合?
from math import factorial
def product(iterable):
prod = 1
for n in iterable:
prod *= n
return prod
def npr(n, r):
"""
Calculate the number of ordered permutations of r items taken from a
population of size n.
>>> npr(3, 2)
6
>>> npr(100, 20)
1303995018204712451095685346159820800000
"""
assert 0 <= r <= n
return product(range(n - r + 1, n + 1))
def ncr(n, r):
"""
Calculate the number of unordered combinations of r items taken from a
population of size n.
>>> ncr(3, 2)
3
>>> ncr(100, 20)
535983370403809682970
>>> ncr(100000, 1000) == ncr(100000, 99000)
True
"""
assert 0 <= r <= n
if r > n // 2:
r = n - r
return npr(n, r) // factorial(r)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(14)
如果 n 离 r 不远,那么使用组合的递归定义可能会更好,因为 xC0 == 1 你只会有几次迭代:
这里相关的递归定义是:
nCr = (n-1)C(r-1 ) * n/r
这可以使用尾递归通过以下列表很好地计算:
[(n - r, 0), (n - r + 1, 1), (n - r + 2, 2), ... , (n - 1, r - 1), (n, r)]
这当然很容易在 Python 中生成(我们省略了第一个条目,因为 nC0 = 1)通过 izip(xrange(n - r + 1, n+1), xrange(1, r+1)) 请注意,这假设 r <= n 您需要检查这一点,如果不是,则交换它们。如果 r < ,还可以优化使用n/2 则 r = n - r。
现在我们只需要使用带有reduce 的尾递归来应用递归步骤。我们从 1 开始,因为 nC0 是 1,然后将当前值与列表中的下一个条目相乘,如下所示。
if n is not far from r then using the recursive definition of combination is probably better, since xC0 == 1 you will only have a few iterations:
The relevant recursive definition here is:
nCr = (n-1)C(r-1) * n/r
This can be nicely computed using tail recursion with the following list:
[(n - r, 0), (n - r + 1, 1), (n - r + 2, 2), ..., (n - 1, r - 1), (n, r)]
which is of course easily generated in Python (we omit the first entry since nC0 = 1) by
izip(xrange(n - r + 1, n+1), xrange(1, r+1))
Note that this assumes r <= n you need to check for that and swap them if they are not. Also to optimize use if r < n/2 then r = n - r.Now we simply need to apply the recursion step using tail recursion with reduce. We start with 1 since nC0 is 1 and then multiply the current value with the next entry from the list as below.
两个相当简单的建议:
为了避免溢出,请在日志空间中执行所有操作。使用 log(a * b) = log(a) + log(b) 和 log(a / b) = log(a) - log(b) 的事实。这使得处理非常大的阶乘变得容易:log(n! / m!) = log(n!) - log(m!) 等。
使用 gamma 函数而不是阶乘。您可以在 scipy.stats.loggamma 中找到一个。这是一种比直接求和更有效的计算对数阶乘的方法。
loggamma(n) == log(factorial(n - 1))
,类似地,gamma(n) == Factorial(n - 1)
。Two fairly simple suggestions:
To avoid overflow, do everything in log space. Use the fact that log(a * b) = log(a) + log(b), and log(a / b) = log(a) - log(b). This makes it easy to work with very large factorials: log(n! / m!) = log(n!) - log(m!), etc.
Use the gamma function instead of factorial. You can find one in
scipy.stats.loggamma
. It's a much more efficient way to calculate log-factorials than direct summation.loggamma(n) == log(factorial(n - 1))
, and similarly,gamma(n) == factorial(n - 1)
.scipy 中有一个尚未提及的函数:scipy.special.comb。根据 doctest 的一些快速计时结果,它似乎很有效(
comb(100000, 1000, 1) == Comb(100000, 99000, 1)
的时间约为 0.004 秒)。[虽然这个具体问题似乎与算法有关,但问题 python 中是否有数学 ncr 函数 被标记为与此重复...]
There's a function for this in scipy which hasn't been mentioned yet: scipy.special.comb. It seems efficient based on some quick timing results for your doctest (~0.004 seconds for
comb(100000, 1000, 1) == comb(100000, 99000, 1)
).[While this specific question seems to be about algorithms the question is there a math ncr function in python is marked as a duplicate of this...]
对于 Python 3.7 之前的版本:
对于 Python 3.8+:
math.perm( )
math.comb()
有趣的是,组合函数的一些手动实现可能比
math.comb()
更快:所以实际上
comb_perm() (使用 math.perm() 和 math.factorial() 实现)实际上比大多数的 math.comb() 更快这些基准的时间,显示固定
n=256
和增加k
(直到k = n // 2
)的计算时间。请注意,
comb_reduce()
速度相当慢,但本质上与 @wich 的答案,而comb_iter()
也相对较慢,本质上与 @ZXX 的答案。此处进行部分分析(不含
comb_math()< /code> 和
comb_perm()
因为它们在 Colab 的 Python 版本(3.7)中不受支持(截至上次编辑)。For Python until 3.7:
For Python 3.8+:
math.perm()
math.comb()
Interestingly enough, some manual implementation of the combination function may be faster than
math.comb()
:so that actually
comb_perm()
(implemented withmath.perm()
andmath.factorial()
) is actually faster thanmath.comb()
most of the times for these benchamarks, which show the computation time for fixedn=256
and increasingk
(up untilk = n // 2
).Note that
comb_reduce()
, which is quite slow, is essentially the same approach as from @wich's answer, whilecomb_iter()
, also relatively slow, is essentially the same approach as @ZXX's answer.Partial analysis here (without
comb_math()
andcomb_perm()
since they are not supported in Python's version of Colab -- 3.7 -- as of last edit).如果您不需要纯 python 解决方案,gmpy2 可能会有所帮助(
gmpy2 .comb
非常快)。If you don't need a pure-python solution, gmpy2 might help (
gmpy2.comb
is very fast).更有效的 nCr 解决方案 - 空间方面和精度方面。
中介 (res) 保证始终为 int 并且永远不会大于结果。空间复杂度为 O(1)(无列表、无 zip、无堆栈),时间复杂度为 O(r) - 恰好是 r 次乘法和 r 次除法。
More efficient solution for nCr - space wise and precision wise.
The intermediary (res) is guaranteed to always be int and never larger than the result. Space complexity is O(1) (no lists, no zips, no stack), time complexity is O(r) - exactly r multiplications and r divisions.
应该允许你计算组合
should allow you to count combinations
如果您正在计算 N 选择 K (这就是我认为您正在使用 ncr 所做的事情),则有一个动态编程解决方案可能会快得多。这将避免阶乘,而且如果您想以后使用,您可以保留该表。
这是一个教学链接:
http://www .csc.liv.ac.uk/~ped/teachadmin/algor/dyprog.html
不过,我不确定如何更好地解决您的第一个问题。
编辑:这是模型。有一些非常搞笑的差一错误,所以它肯定可以承受更多的清理。
If you are computing N choose K (which is what I think you're doing with ncr), there is a dynamic programming solution that may be a lot faster. This will avoid factorial, plus you can keep the table if you want for later use.
Here is a teaching link for it:
http://www.csc.liv.ac.uk/~ped/teachadmin/algor/dyprog.html
I am unsure of how to better solve your first problem, though, sorry.
Edit: Here is the mock-up. There are some pretty hilarious off-by-one errors, so it can certainly stand some more clean up.
如果您的问题不需要知道排列或组合的确切数量,那么您可以使用斯特林近似< /a> 为阶乘。
这会导致这样的代码:
If your problem does not require knowing the exact number of permutations or combinations, then you could use Stirling's approximation for the factorial.
That would lead to code like this:
使用 xrange() 代替 range() 会稍微加快速度,因为不会创建、填充、迭代然后销毁中间列表。此外,还有
reduce()
和operator.mul
。Using
xrange()
instead ofrange()
will speed things up slightly due to the fact that no intermediate list is created, populated, iterated through, and then destroyed. Also,reduce()
withoperator.mul
.对于 N 选择 K,您可以使用帕斯卡三角形。基本上,您需要保留大小为 N 的数组来计算所有 N 个选择的 K 值。只需要添加即可。
For N choose K you could use Pascals triangle. Basically you would need to keep array of size N around to compute all the N choose K values. Only additions would be required.
您可以输入两个整数并导入数学库来查找阶乘,然后应用 nCr 公式
You could input two integers and import math library to find the factorial and then apply the nCr formula
很简单。只需从
math
模块导入comb
和perm
函数即可获得结果!完整代码如下:
Very simple. Just import
comb
andperm
functions frommath
module and get the result!!Complete code is below: