通过递归公式改进纯Python素数筛

发布于 2024-09-11 00:36:24 字数 2603 浏览 7 评论 0原文

我试图通过取出子列表长度的复杂公式来进一步优化素数线程中的冠军解决方案。相同子序列的 len() 太慢，因为 len 很昂贵并且生成子序列也很昂贵。这看起来稍微加快了函数的速度，但我还无法取消除法，尽管我只在条件语句内进行除法。当然，我可以尝试通过取消对 n 而不是 n*n 的起始标记的优化来简化长度计算...

我将除法 / 替换为整数除法 // 以与 Python 3 兼容，或者

from __future__ import division

如果这个递归公式可以帮助加速 numpy 解决方案，但我没有太多使用 numpy 的经验。

如果您为代码启用 psyco，情况会变得完全不同，但是阿特金斯筛代码变得比这种特殊的切片技术更快。

import cProfile

def rwh_primes1(n):
    # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * (n//2)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = [False] * ((n-i*i-1)//(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]

def primes(n):
    # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    # recurrence formula for length by amount1 and amount2 Tony Veijalainen 2010
    """ Returns  a list of primes < n """
    sieve = [True] * (n//2)
    amount1 = n-10
    amount2 = 6

    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i//2]:
             ## can you make recurrence formula for whole reciprocal?
            sieve[i*i//2::i] = [False] * (amount1//amount2+1)
        amount1-=4*i+4
        amount2+=4

    return [2] + [2*i+1 for i in xrange(1,n//2) if sieve[i]]

numprimes=1000000
print('Profiling')
cProfile.Profile.bias = 4e-6
for test in (rwh_primes1, primes):
    cProfile.run("test(numprimes)")

分析（版本之间差异不大）

         3 function calls in 0.191 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.191    0.191 <string>:1(<module>)
        1    0.185    0.185    0.185    0.185 myprimes.py:3(rwh_primes1)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


         3 function calls in 0.192 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.192    0.192 <string>:1(<module>)
        1    0.186    0.186    0.186    0.186 myprimes.py:12(primes)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

有趣的是，通过将限制增加到 10**8 并将计时装饰器放入函数中以删除分析：

rwh_primes1 took 23.670 s
primes took 22.792 s
primesieve took 10.850 s

有趣的是，如果您不生成素数列表但返回筛子本身，则时间大约是数字的一半列表版本。

原文

I am trying to optimize further the champion solution in prime number thread by taking out the complex formula for sub-list length. len() of the same subsequence is too slow as len is expensive and generating the subsequence is expensive. This looks to slightly speed up the function but I could not yet take away the division, though I do the division only inside the condition statement. Of course I could try to simplify the length calculation by taking out the optimization of starting marking for n instead of n*n...

I replaced division / by integer division // to be compatible with Python 3 or

from __future__ import division

Also I would be interesting if this recurrence formula could help to speed up the numpy solution, but I have not experience of using numpy much.

If you enable psyco for the code, the story becomes completely different, however and Atkins sieve code becomes faster than this special slicing technique.

import cProfile

def rwh_primes1(n):
    # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * (n//2)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = [False] * ((n-i*i-1)//(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]

def primes(n):
    # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    # recurrence formula for length by amount1 and amount2 Tony Veijalainen 2010
    """ Returns  a list of primes < n """
    sieve = [True] * (n//2)
    amount1 = n-10
    amount2 = 6

    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i//2]:
             ## can you make recurrence formula for whole reciprocal?
            sieve[i*i//2::i] = [False] * (amount1//amount2+1)
        amount1-=4*i+4
        amount2+=4

    return [2] + [2*i+1 for i in xrange(1,n//2) if sieve[i]]

numprimes=1000000
print('Profiling')
cProfile.Profile.bias = 4e-6
for test in (rwh_primes1, primes):
    cProfile.run("test(numprimes)")

Profiling (not so much difference between versions)

         3 function calls in 0.191 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.191    0.191 <string>:1(<module>)
        1    0.185    0.185    0.185    0.185 myprimes.py:3(rwh_primes1)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


         3 function calls in 0.192 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.192    0.192 <string>:1(<module>)
        1    0.186    0.186    0.186    0.186 myprimes.py:12(primes)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Interestingly by increasing the limit to 10**8 and putting timing decorator to functions removing the profiling:

rwh_primes1 took 23.670 s
primes took 22.792 s
primesieve took 10.850 s

Interestingly if you do not produce list of primes but return the sieve itself the time is around half from number list version.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

烟沫凡尘 2024-09-18 00:36:24

您可以进行车轮优化。 2 和 3 的倍数不是素数，所以根本不要存储它们。然后，您可以从 5 开始，通过以 2、4、2、4、2、4 等步长递增来跳过 2 和 3 的倍数。

下面是它的 C++ 代码。希望这有帮助。

void sieve23()
{
    int lim=sqrt(MAX);
    for(int i=5,bit1=0;i<=lim;i+=(bit1?4:2),bit1^=1)
    {
        if(!isComp[i/3])
        {
            for(int j=i,bit2=1;;)
            {
                j+=(bit2?4*i:2*i);
                bit2=!bit2;
                if(j>=MAX)break;
                isComp[j/3]=1;
            }
        }
    }
}

You could do a wheel optimisation. Multiples of 2 and 3 aren't primes, so dont store them at all. You could then start from 5 and skip multiples of 2 and 3 by incrementing in steps of 2,4,2,4,2,4 etc..

Below is a C++ code for it. Hope this helps.

void sieve23()
{
    int lim=sqrt(MAX);
    for(int i=5,bit1=0;i<=lim;i+=(bit1?4:2),bit1^=1)
    {
        if(!isComp[i/3])
        {
            for(int j=i,bit2=1;;)
            {
                j+=(bit2?4*i:2*i);
                bit2=!bit2;
                if(j>=MAX)break;
                isComp[j/3]=1;
            }
        }
    }
}

回复收藏 0 原文

停顿的约定 2024-09-18 00:36:24

如果您决定使用 C++ 来提高速度，我将 Python sieve 移植到了 C++。完整的讨论可以在这里找到：将优化的埃拉托斯特尼筛从Python 到 C++。

在 Intel Q6600、Ubuntu 10.10 上，使用 g++ -O3 编译且 N=100000000，这需要 415 毫秒。

#include <vector>
#include <boost/dynamic_bitset.hpp>

// http://vault.embedded.com/98/9802fe2.htm - integer square root
unsigned short isqrt(unsigned long a) {
    unsigned long rem = 0;
    unsigned long root = 0;

    for (short i = 0; i < 16; i++) {
        root <<= 1;
        rem = ((rem << 2) + (a >> 30));
        a <<= 2;
        root++;

        if (root <= rem) {
            rem -= root;
            root++;
        } else root--;

    }

    return static_cast<unsigned short> (root >> 1);
}

// https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
// https://stackoverflow.com/questions/5293238/porting-optimized-sieve-of-eratosthenes-from-python-to-c/5293492
template <class T>
void primesbelow(T N, std::vector<T> &primes) {
    T i, j, k, sievemax, sievemaxroot;

    sievemax = N/3;
    if ((N % 6) == 2) sievemax++;

    sievemaxroot = isqrt(N)/3;

    boost::dynamic_bitset<> sieve(sievemax);
    sieve.set();
    sieve[0] = 0;

    for (i = 0; i <= sievemaxroot; i++) {
        if (sieve[i]) {
            k = (3*i + 1) | 1;
            for (j = k*k/3; j < sievemax; j += 2*k) sieve[j] = 0;
            for (j = (k*k+4*k-2*k*(i&1))/3; j < sievemax; j += 2*k) sieve[j] = 0;
        }
    }

    primes.push_back(2);
    primes.push_back(3);

    for (i = 0; i < sievemax; i++) {
        if (sieve[i]) primes.push_back((3*i+1)|1);
    }

}

If you may decide you are going to C++ to improve the speed, I ported the Python sieve to C++. The full discussion can be found here: Porting optimized Sieve of Eratosthenes from Python to C++.

On Intel Q6600, Ubuntu 10.10, compiled with g++ -O3 and with N=100000000 this takes 415 ms.

#include <vector>
#include <boost/dynamic_bitset.hpp>

// http://vault.embedded.com/98/9802fe2.htm - integer square root
unsigned short isqrt(unsigned long a) {
    unsigned long rem = 0;
    unsigned long root = 0;

    for (short i = 0; i < 16; i++) {
        root <<= 1;
        rem = ((rem << 2) + (a >> 30));
        a <<= 2;
        root++;

        if (root <= rem) {
            rem -= root;
            root++;
        } else root--;

    }

    return static_cast<unsigned short> (root >> 1);
}

// https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
// https://stackoverflow.com/questions/5293238/porting-optimized-sieve-of-eratosthenes-from-python-to-c/5293492
template <class T>
void primesbelow(T N, std::vector<T> &primes) {
    T i, j, k, sievemax, sievemaxroot;

    sievemax = N/3;
    if ((N % 6) == 2) sievemax++;

    sievemaxroot = isqrt(N)/3;

    boost::dynamic_bitset<> sieve(sievemax);
    sieve.set();
    sieve[0] = 0;

    for (i = 0; i <= sievemaxroot; i++) {
        if (sieve[i]) {
            k = (3*i + 1) | 1;
            for (j = k*k/3; j < sievemax; j += 2*k) sieve[j] = 0;
            for (j = (k*k+4*k-2*k*(i&1))/3; j < sievemax; j += 2*k) sieve[j] = 0;
        }
    }

    primes.push_back(2);
    primes.push_back(3);

    for (i = 0; i < sievemax; i++) {
        if (sieve[i]) primes.push_back((3*i+1)|1);
    }

}

回复收藏 0 原文

~没有更多了~