有效地计算组合和排列

发布于 2024-08-18 06:04:46 字数 1094 浏览 12 评论 0 原文

我有一些代码来计算排列和组合,并且我正在努力使其更好地处理大量数字。

我找到了一种更好的排列算法,可以避免大量的中间结果,但我仍然认为我可以在组合方面做得更好。

到目前为止,我已经放置了一个特殊情况来反映 nCr 的对称性,但我仍然希望找到一种更好的算法来避免调用阶乘(r),这是一个不必要的大中间结果。如果没有这种优化,最后一个文档测试会花费太长时间来计算阶乘(99000)。

谁能建议一种更有效的方法来计算组合?

from math import factorial

def product(iterable):
    prod = 1
    for n in iterable:
        prod *= n
    return prod

def npr(n, r):
    """
    Calculate the number of ordered permutations of r items taken from a
    population of size n.

    >>> npr(3, 2)
    6
    >>> npr(100, 20)
    1303995018204712451095685346159820800000
    """
    assert 0 <= r <= n
    return product(range(n - r + 1, n + 1))

def ncr(n, r):
    """
    Calculate the number of unordered combinations of r items taken from a
    population of size n.

    >>> ncr(3, 2)
    3
    >>> ncr(100, 20)
    535983370403809682970
    >>> ncr(100000, 1000) == ncr(100000, 99000)
    True
    """
    assert 0 <= r <= n
    if r > n // 2:
        r = n - r
    return npr(n, r) // factorial(r)

I have some code to count permutations and combinations, and I'm trying to make it work better for large numbers.

I've found a better algorithm for permutations that avoids large intermediate results, but I still think I can do better for combinations.

So far, I've put in a special case to reflect the symmetry of nCr, but I'd still like to find a better algorithm that avoids the call to factorial(r), which is an unnecessarily large intermediate result. Without this optimization, the last doctest takes too long trying to calculate factorial(99000).

Can anyone suggest a more efficient way to count combinations?

from math import factorial

def product(iterable):
    prod = 1
    for n in iterable:
        prod *= n
    return prod

def npr(n, r):
    """
    Calculate the number of ordered permutations of r items taken from a
    population of size n.

    >>> npr(3, 2)
    6
    >>> npr(100, 20)
    1303995018204712451095685346159820800000
    """
    assert 0 <= r <= n
    return product(range(n - r + 1, n + 1))

def ncr(n, r):
    """
    Calculate the number of unordered combinations of r items taken from a
    population of size n.

    >>> ncr(3, 2)
    3
    >>> ncr(100, 20)
    535983370403809682970
    >>> ncr(100000, 1000) == ncr(100000, 99000)
    True
    """
    assert 0 <= r <= n
    if r > n // 2:
        r = n - r
    return npr(n, r) // factorial(r)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(14

无妨# 2024-08-25 06:04:46

如果 n 离 r 不远,那么使用组合的递归定义可能会更好,因为 xC0 == 1 你只会有几次迭代:

这里相关的递归定义是:

nCr = (n-1)C(r-1 ) * n/r

这可以使用尾递归通过以下列表很好地计算:

[(n - r, 0), (n - r + 1, 1), (n - r + 2, 2), ... , (n - 1, r - 1), (n, r)]

这当然很容易在 Python 中生成(我们省略了第一个条目,因为 nC0 = 1)通过 izip(xrange(n - r + 1, n+1), xrange(1, r+1)) 请注意,这假设 r <= n 您需要检查这一点,如果不是,则交换它们。如果 r < ,还可以优化使用n/2 则 r = n - r。

现在我们只需要使用带有reduce 的尾递归来应用递归步骤。我们从 1 开始,因为 nC0 是 1,然后将当前值与列表中的下一个条目相乘,如下所示。

from itertools import izip

reduce(lambda x, y: x * y[0] / y[1], izip(xrange(n - r + 1, n+1), xrange(1, r+1)), 1)

if n is not far from r then using the recursive definition of combination is probably better, since xC0 == 1 you will only have a few iterations:

The relevant recursive definition here is:

nCr = (n-1)C(r-1) * n/r

This can be nicely computed using tail recursion with the following list:

[(n - r, 0), (n - r + 1, 1), (n - r + 2, 2), ..., (n - 1, r - 1), (n, r)]

which is of course easily generated in Python (we omit the first entry since nC0 = 1) by izip(xrange(n - r + 1, n+1), xrange(1, r+1)) Note that this assumes r <= n you need to check for that and swap them if they are not. Also to optimize use if r < n/2 then r = n - r.

Now we simply need to apply the recursion step using tail recursion with reduce. We start with 1 since nC0 is 1 and then multiply the current value with the next entry from the list as below.

from itertools import izip

reduce(lambda x, y: x * y[0] / y[1], izip(xrange(n - r + 1, n+1), xrange(1, r+1)), 1)
猫卆 2024-08-25 06:04:46

两个相当简单的建议:

  1. 为了避免溢出,请在日志空间中执行所有操作。使用 log(a * b) = log(a) + log(b) 和 log(a / b) = log(a) - log(b) 的事实。这使得处理非常大的阶乘变得容易:log(n! / m!) = log(n!) - log(m!) 等。

  2. 使用 gamma 函数而不是阶乘。您可以在 scipy.stats.loggamma 中找到一个。这是一种比直接求和更有效的计算对数阶乘的方法。 loggamma(n) == log(factorial(n - 1)),类似地,gamma(n) == Factorial(n - 1)

Two fairly simple suggestions:

  1. To avoid overflow, do everything in log space. Use the fact that log(a * b) = log(a) + log(b), and log(a / b) = log(a) - log(b). This makes it easy to work with very large factorials: log(n! / m!) = log(n!) - log(m!), etc.

  2. Use the gamma function instead of factorial. You can find one in scipy.stats.loggamma. It's a much more efficient way to calculate log-factorials than direct summation. loggamma(n) == log(factorial(n - 1)), and similarly, gamma(n) == factorial(n - 1).

半边脸i 2024-08-25 06:04:46

scipy 中有一个尚未提及的函数:scipy.special.comb。根据 doctest 的一些快速计时结果,它似乎很有效(comb(100000, 1000, 1) == Comb(100000, 99000, 1) 的时间约为 0.004 秒)。

[虽然这个具体问题似乎与算法有关,但问题 python 中是否有数学 ncr 函数 被标记为与此重复...]

There's a function for this in scipy which hasn't been mentioned yet: scipy.special.comb. It seems efficient based on some quick timing results for your doctest (~0.004 seconds for comb(100000, 1000, 1) == comb(100000, 99000, 1)).

[While this specific question seems to be about algorithms the question is there a math ncr function in python is marked as a duplicate of this...]

陈独秀 2024-08-25 06:04:46

对于 Python 3.7 之前的版本:

def prod(items, start=1):
    for item in items:
        start *= item
    return start


def perm(n, k):
    if not 0 <= k <= n:
        raise ValueError(
            'Values must be non-negative and n >= k in perm(n, k)')
    else:
        return prod(range(n - k + 1, n + 1))


def comb(n, k):
    if not 0 <= k <= n:
        raise ValueError(
            'Values must be non-negative and n >= k in comb(n, k)')
    else:
        k = k if k < n - k else n - k
        return prod(range(n - k + 1, n + 1)) // math.factorial(k)

对于 Python 3.8+:


有趣的是,组合函数的一些手动实现可能比 math.comb() 更快:

def math_comb(n, k):
    return math.comb(n, k)


def comb_perm(n, k):
    k = k if k < n - k else n - k
    return math.perm(n, k) // math.factorial(k)


def comb(n, k):
    k = k if k < n - k else n - k
    return prod(range(n - k + 1, n + 1)) // math.factorial(k)


def comb_other(n, k):
    k = k if k > n - k else n - k
    return prod(range(n - k + 1, n + 1)) // math.factorial(k)


def comb_reduce(n, k):
    k = k if k < n - k else n - k
    return functools.reduce(
        lambda x, y: x * y[0] // y[1],
        zip(range(n - k + 1, n + 1), range(1, k + 1)),
        1)


def comb_iter(n, k):
    k = k if k < n - k else n - k
    result = 1
    for i in range(1, k + 1):
        result = result * (n - i + 1) // i
    return result


def comb_iterdiv(n, k):
    k = k if k < n - k else n - k
    result = divider = 1
    for i in range(1, k + 1):
        result *= (n - i + 1)
        divider *= i
    return result // divider


def comb_fact(n, k):
    k = k if k < n - k else n - k
    return math.factorial(n) // math.factorial(n - k) // math.factorial(k)

bm

所以实际上 comb_perm() (使用 math.perm() 和 math.factorial() 实现)实际上比大多数的 math.comb() 更快这些基准的时间,显示固定 n=256 和增加 k(直到 k = n // 2)的计算时间。

请注意,comb_reduce() 速度相当慢,但本质上与 @wich 的答案,而 comb_iter() 也相对较慢,本质上与 @ZXX 的答案

此处进行部分分析(不含 comb_math()< /code> 和 comb_perm() 因为它们在 Colab 的 Python 版本(3.7)中不受支持(截至上次编辑)。

For Python until 3.7:

def prod(items, start=1):
    for item in items:
        start *= item
    return start


def perm(n, k):
    if not 0 <= k <= n:
        raise ValueError(
            'Values must be non-negative and n >= k in perm(n, k)')
    else:
        return prod(range(n - k + 1, n + 1))


def comb(n, k):
    if not 0 <= k <= n:
        raise ValueError(
            'Values must be non-negative and n >= k in comb(n, k)')
    else:
        k = k if k < n - k else n - k
        return prod(range(n - k + 1, n + 1)) // math.factorial(k)

For Python 3.8+:


Interestingly enough, some manual implementation of the combination function may be faster than math.comb():

def math_comb(n, k):
    return math.comb(n, k)


def comb_perm(n, k):
    k = k if k < n - k else n - k
    return math.perm(n, k) // math.factorial(k)


def comb(n, k):
    k = k if k < n - k else n - k
    return prod(range(n - k + 1, n + 1)) // math.factorial(k)


def comb_other(n, k):
    k = k if k > n - k else n - k
    return prod(range(n - k + 1, n + 1)) // math.factorial(k)


def comb_reduce(n, k):
    k = k if k < n - k else n - k
    return functools.reduce(
        lambda x, y: x * y[0] // y[1],
        zip(range(n - k + 1, n + 1), range(1, k + 1)),
        1)


def comb_iter(n, k):
    k = k if k < n - k else n - k
    result = 1
    for i in range(1, k + 1):
        result = result * (n - i + 1) // i
    return result


def comb_iterdiv(n, k):
    k = k if k < n - k else n - k
    result = divider = 1
    for i in range(1, k + 1):
        result *= (n - i + 1)
        divider *= i
    return result // divider


def comb_fact(n, k):
    k = k if k < n - k else n - k
    return math.factorial(n) // math.factorial(n - k) // math.factorial(k)

bm

so that actually comb_perm() (implemented with math.perm() and math.factorial()) is actually faster than math.comb() most of the times for these benchamarks, which show the computation time for fixed n=256 and increasing k (up until k = n // 2).

Note that comb_reduce(), which is quite slow, is essentially the same approach as from @wich's answer, while comb_iter(), also relatively slow, is essentially the same approach as @ZXX's answer.

Partial analysis here (without comb_math() and comb_perm() since they are not supported in Python's version of Colab -- 3.7 -- as of last edit).

稚气少女 2024-08-25 06:04:46

如果您不需要纯 python 解决方案,gmpy2 可能会有所帮助(gmpy2 .comb 非常快)。

If you don't need a pure-python solution, gmpy2 might help (gmpy2.comb is very fast).

甲如呢乙后呢 2024-08-25 06:04:46

更有效的 nCr 解决方案 - 空间方面和精度方面。

中介 (res) 保证始终为 int 并且永远不会大于结果。空间复杂度为 O(1)(无列表、无 zip、无堆栈),时间复杂度为 O(r) - 恰好是 r 次乘法和 r 次除法。

def ncr(n, r):
    r = min(r, n-r)
    if r == 0: return 1
    res = 1
    for k in range(1,r+1):
        res = res*(n-k+1)/k
    return res

More efficient solution for nCr - space wise and precision wise.

The intermediary (res) is guaranteed to always be int and never larger than the result. Space complexity is O(1) (no lists, no zips, no stack), time complexity is O(r) - exactly r multiplications and r divisions.

def ncr(n, r):
    r = min(r, n-r)
    if r == 0: return 1
    res = 1
    for k in range(1,r+1):
        res = res*(n-k+1)/k
    return res
浅浅 2024-08-25 06:04:46
from scipy import misc
misc.comb(n, k)

应该允许你计算组合

from scipy import misc
misc.comb(n, k)

should allow you to count combinations

つ可否回来 2024-08-25 06:04:46

如果您正在计算 N 选择 K (这就是我认为您正在使用 ncr 所做的事情),则有一个动态编程解决方案可能会快得多。这将避免阶乘,而且如果您想以后使用,您可以保留该表。

这是一个教学链接:

http://www .csc.liv.ac.uk/~ped/teachadmin/algor/dyprog.html

不过,我不确定如何更好地解决您的第一个问题。

编辑:这是模型。有一些非常搞笑的差一错误,所以它肯定可以承受更多的清理。

import sys
n = int(sys.argv[1])+2#100
k = int(sys.argv[2])+1#20
table = [[0]*(n+2)]*(n+2)

for i in range(1,n):
    table[i][i] = 1
for i in range(1,n):
    for j in range(1,n-i):
        x = i+j
        if j == 1: table[x][j] = 1
        else: table[x][j] = table[x-1][j-1] + table[x-1][j]

print table[n][k]

If you are computing N choose K (which is what I think you're doing with ncr), there is a dynamic programming solution that may be a lot faster. This will avoid factorial, plus you can keep the table if you want for later use.

Here is a teaching link for it:

http://www.csc.liv.ac.uk/~ped/teachadmin/algor/dyprog.html

I am unsure of how to better solve your first problem, though, sorry.

Edit: Here is the mock-up. There are some pretty hilarious off-by-one errors, so it can certainly stand some more clean up.

import sys
n = int(sys.argv[1])+2#100
k = int(sys.argv[2])+1#20
table = [[0]*(n+2)]*(n+2)

for i in range(1,n):
    table[i][i] = 1
for i in range(1,n):
    for j in range(1,n-i):
        x = i+j
        if j == 1: table[x][j] = 1
        else: table[x][j] = table[x-1][j-1] + table[x-1][j]

print table[n][k]
_蜘蛛 2024-08-25 06:04:46

如果您的问题不需要知道排列或组合的确切数量,那么您可以使用斯特林近似< /a> 为阶乘。

这会导致这样的代码:

import math

def stirling(n):
    # http://en.wikipedia.org/wiki/Stirling%27s_approximation
    return math.sqrt(2*math.pi*n)*(n/math.e)**n

def npr(n,r):
    return (stirling(n)/stirling(n-r) if n>20 else
            math.factorial(n)/math.factorial(n-r))

def ncr(n,r):    
    return (stirling(n)/stirling(r)/stirling(n-r) if n>20 else
            math.factorial(n)/math.factorial(r)/math.factorial(n-r))

print(npr(3,2))
# 6
print(npr(100,20))
# 1.30426670868e+39
print(ncr(3,2))
# 3
print(ncr(100,20))
# 5.38333246453e+20

If your problem does not require knowing the exact number of permutations or combinations, then you could use Stirling's approximation for the factorial.

That would lead to code like this:

import math

def stirling(n):
    # http://en.wikipedia.org/wiki/Stirling%27s_approximation
    return math.sqrt(2*math.pi*n)*(n/math.e)**n

def npr(n,r):
    return (stirling(n)/stirling(n-r) if n>20 else
            math.factorial(n)/math.factorial(n-r))

def ncr(n,r):    
    return (stirling(n)/stirling(r)/stirling(n-r) if n>20 else
            math.factorial(n)/math.factorial(r)/math.factorial(n-r))

print(npr(3,2))
# 6
print(npr(100,20))
# 1.30426670868e+39
print(ncr(3,2))
# 3
print(ncr(100,20))
# 5.38333246453e+20
愛放△進行李 2024-08-25 06:04:46

使用 xrange() 代替 range() 会稍微加快速度,因为不会创建、填充、迭代然后销毁中间列表。此外,还有 reduce()operator.mul

Using xrange() instead of range() will speed things up slightly due to the fact that no intermediate list is created, populated, iterated through, and then destroyed. Also, reduce() with operator.mul.

坚持沉默 2024-08-25 06:04:46

对于 N 选择 K,您可以使用帕斯卡三角形。基本上,您需要保留大小为 N 的数组来计算所有 N 个选择的 K 值。只需要添加即可。

For N choose K you could use Pascals triangle. Basically you would need to keep array of size N around to compute all the N choose K values. Only additions would be required.

挽容 2024-08-25 06:04:46

您可以输入两个整数并导入数学库来查找阶乘,然后应用 nCr 公式

import math
n,r=[int(_)for _ in raw_input().split()]
f=math.factorial
print f(n)/f(r)/f(n-r)

You could input two integers and import math library to find the factorial and then apply the nCr formula

import math
n,r=[int(_)for _ in raw_input().split()]
f=math.factorial
print f(n)/f(r)/f(n-r)
‘画卷フ 2024-08-25 06:04:46
from numpy import prod

def nCr(n,r):
    numerator = range(n, max(n-r,r),-1)
    denominator = range(1, min(n-r,r) +1,1)
    return int(prod(numerator)/prod(denominator))
from numpy import prod

def nCr(n,r):
    numerator = range(n, max(n-r,r),-1)
    denominator = range(1, min(n-r,r) +1,1)
    return int(prod(numerator)/prod(denominator))
谁人与我共长歌 2024-08-25 06:04:46

很简单。只需从 math 模块导入 combperm 函数即可获得结果!

完整代码如下:

from math import comb, perm
n, r =12, 3
print(comb(n,r), perm(n,r))

Very simple. Just import comb and perm functions from math module and get the result!!

Complete code is below:

from math import comb, perm
n, r =12, 3
print(comb(n,r), perm(n,r))
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文