Sieve of Eratosthenes - Finding Primes Python
Just to clarify, this is not a homework problem :)
I wanted to find primes for a math application I am building & came across the Sieve of Eratosthenes approach.
I have written an implementation of it in Python. But it's terribly slow. Say I want to find all primes less than 2 million: it takes more than 20 minutes (I stopped it at that point). How can I speed this up?
def primes_sieve(limit):
    limitn = limit + 1
    primes = range(2, limitn)

    for i in primes:
        factors = range(i, limitn, i)
        for f in factors[1:]:
            if f in primes:
                primes.remove(f)
    return primes

print primes_sieve(2000)
UPDATE:
I ended up doing profiling on this code & found that quite a lot of time was spent on removing an element from the list. Quite understandable, considering it has to traverse the entire list (worst case) to find the element, then remove it, then readjust the list (maybe some copying goes on?). Anyway, I chucked the list in favour of a dictionary. My new implementation -
def primes_sieve1(limit):
    limitn = limit + 1
    primes = dict()
    for i in range(2, limitn):
        primes[i] = True

    for i in primes:
        factors = range(i, limitn, i)
        for f in factors[1:]:
            primes[f] = False
    return [i for i in primes if primes[i] == True]

print primes_sieve1(2000000)
You're not quite implementing the correct algorithm:
In your first example, primes_sieve doesn't maintain a list of primality flags to strike/unset (as in the algorithm), but instead resizes a list of integers continuously, which is very expensive: removing an item from a list requires shifting all subsequent items down by one.

In the second example, primes_sieve1 maintains a dictionary of primality flags, which is a step in the right direction, but it iterates over the dictionary in undefined order and redundantly strikes out factors of factors (instead of only factors of primes, as in the algorithm). You could fix this by sorting the keys and skipping non-primes (which already makes it an order of magnitude faster), but it's still much more efficient to just use a list directly.

The correct algorithm (with a list instead of a dictionary) looks something like:
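For example, as a generator (a sketch; the name primes_sieve2 is arbitrary):

def primes_sieve2(limit):
    # A list of primality flags; strike multiples starting at i*i.
    a = [True] * limit
    a[0] = a[1] = False
    for i, isprime in enumerate(a):
        if isprime:
            yield i
            for n in range(i * i, limit, i):
                a[n] = False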
(Note that this also includes the algorithmic optimization of starting the non-prime marking at the prime's square (
i*i
) instead of its double.)从数组(列表)的开头删除需要将其后面的所有项目向下移动。这意味着以这种方式从列表中从前面开始删除每个元素是一个 O(n^2) 操作。
Removing from the beginning of an array (list) requires moving all of the items after it down. That means that removing every element from a list in this way starting from the front is an O(n^2) operation.
You can do this much more efficiently with sets:
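For example, a sketch with a set (set.discard is O(1), so nothing has to be shifted; not necessarily the answer's exact code):

def primes_sieve_set(limit):
    # Discard composites from a set instead of removing from a list.
    candidates = set(range(2, limit + 1))
    for i in range(2, int(limit ** 0.5) + 1):
        if i in candidates:
            for f in range(i * i, limit + 1, i):
                candidates.discard(f)
    return sorted(candidates)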
Or, alternatively, avoid having to rearrange the list:
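For example, marking flags in place instead of removing elements (again a sketch):

def primes_sieve_flags(limit):
    # Mark composites in a flag list; nothing is ever removed.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for f in range(i * i, limit + 1, i):
                is_prime[f] = False
    return [i for i, flag in enumerate(is_prime) if flag]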
Much faster:
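A sketch of one such faster variant, using a bytearray with bulk slice assignment (assumed, not the poster's exact code):

def primes_fast(limit):
    # One byte per candidate; slice assignment clears whole runs at once.
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b'\x00\x00'
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytes(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]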
Using a bit of numpy, I could find all primes below 100 million in a little over 2 seconds. There are two key features one should note:

- Strike out multiples of i only for i up to the root of n.
- Setting the multiples of i to False using x[2*i::i] = False is much faster than an explicit Python for loop.

These two significantly speed up your code. For limits below one million, there is no perceptible running time.
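A sketch of such a NumPy sieve (a reconstruction; the slice-assignment line is the trick the answer names):

import numpy as np

def primes_below(n):
    # Boolean flags; each slice assignment strikes a whole run at C speed.
    x = np.ones(n, dtype=bool)
    x[:2] = False
    for i in range(2, int(n ** 0.5) + 1):
        if x[i]:
            x[2 * i::i] = False
    return np.flatnonzero(x)

print(len(primes_below(100_000_000)))    # 5761455 primes below 100 million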
I realise this isn't really answering the question of how to generate primes quickly, but perhaps some will find this alternative interesting: because Python provides lazy evaluation via generators, the Sieve of Eratosthenes can be implemented exactly as stated:
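A sketch of that generator formulation (a reconstruction; each discovered prime wraps the stream in one more filter):

def intsfrom(n):
    # n, n+1, n+2, ... forever
    while True:
        yield n
        n += 1

def sieve(stream):
    p = next(stream)
    yield p
    # everything not divisible by p, sieved recursively
    yield from sieve(c for c in stream if c % p != 0)

try:
    for p in sieve(intsfrom(2)):
        print(p)
except RecursionError:
    pass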
The try block is there because the algorithm runs until it blows the stack, and without the try block the backtrace is displayed, pushing the actual output you want to see off the screen.
By combining contributions from many enthusiasts (including Glenn Maynard and MrHIDEn from the comments above), I came up with the following piece of code in Python 2:
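A sketch of such a combined version (the name get_primes is assumed; this runs under Python 2 or 3):

def get_primes(n):
    # Flag list over 0..n; strike multiples from i*i upward.
    m = n + 1
    numbers = [True] * m
    for i in range(2, int(n ** 0.5) + 1):
        if numbers[i]:
            for j in range(i * i, m, i):
                numbers[j] = False
    return [i for i in range(2, m) if numbers[i]]

print(get_primes(100))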
Time taken for computation on my machine, for different inputs in powers of 10, is:
A simple speed hack: when you define the variable "primes," set the step to 2 to skip all even numbers automatically, and set the starting point to 1.
Then you can further optimize: instead of for i in primes, use for i in primes[:round(len(primes) ** 0.5)]. That will dramatically increase performance. In addition, you can eliminate numbers ending in 5 to further increase speed.
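A sketch of that hack applied to the dictionary version from the question (a hypothetical adaptation; it starts at 3 rather than 1 so that 1 is not reported as prime):

def primes_sieve_odds(limit):
    # Odd candidates only, step 2; the prime 2 is appended at the end.
    odds = list(range(3, limit + 1, 2))
    is_prime = {p: True for p in odds}
    for i in odds[:round(len(odds) ** 0.5)]:
        if is_prime[i]:
            for f in range(i * i, limit + 1, 2 * i):   # odd multiples only
                is_prime[f] = False
    return [2] + [p for p in odds if is_prime[p]]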
My implementation:
Here's a version that's a bit more memory-efficient (and: a proper sieve, not trial division). Basically, instead of keeping an array of all the numbers and crossing out those that aren't prime, this keeps an array of counters - one for each prime it's discovered - and leapfrogs them ahead of the putative prime. That way, it uses storage proportional to the number of primes, not up to the highest prime.
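A sketch reconstructed from that description (the dictionary maps each upcoming composite to the primes whose counters sit on it):

def primes():
    # Infinite generator. For each discovered prime we keep a "counter":
    # the next odd multiple of it that lies ahead of the current candidate.
    yield 2
    multiples = {}                      # next composite -> primes striking it
    n = 3
    while True:
        if n in multiples:
            for p in multiples.pop(n):  # leapfrog each counter past n
                multiples.setdefault(n + 2 * p, []).append(p)
        else:
            yield n                     # hit by no counter: n is prime
            multiples[n * n] = [n]      # the first composite it must strike
        n += 2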
You'll note that primes() is a generator, so you can keep the results in a list or you can use them directly. Here are the first n primes:
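A hypothetical usage sketch (any small n works):

from itertools import islice

print(list(islice(primes(), 10)))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

And, for completeness, here's a timer to measure the performance - sketched here:

import time
from itertools import islice

start = time.time()
count = sum(1 for _ in islice(primes(), 100000))   # hypothetical workload
print('%d primes in %.2f s' % (count, time.time() - start))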
Just in case you're wondering, I also wrote primes() as a simple iterator (using __iter__ and __next__), and it ran at almost the same speed. Surprised me too!
I prefer NumPy because of speed.
Check the output:
Comparing the speed of the Sieve of Eratosthenes and brute force on a Jupyter Notebook: the Sieve of Eratosthenes is 539 times faster than brute force for a million elements.
I figured it must be possible to simply use the empty list as the terminating condition for the loop and came up with this:
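One plausible shape for that idea (a hypothetical reconstruction; note that the filtering step makes it closer to trial division than a true sieve):

def primes_sieve_shrinking(limit):
    # The smallest remaining candidate is always prime; filter out its
    # multiples, and stop once the candidate list is empty.
    candidates = list(range(2, limit + 1))
    found = []
    while candidates:
        p = candidates[0]
        found.append(p)
        candidates = [c for c in candidates if c % p != 0]
    return found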
I think this is the shortest code for finding primes with the Eratosthenes method:
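A short sketch in that spirit (not necessarily the poster's exact code):

limit = 100    # hypothetical bound
composites = {c for i in range(2, int(limit ** 0.5) + 1)
              for c in range(i * i, limit + 1, i)}
print(sorted(set(range(2, limit + 1)) - composites))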
The fastest implementation I could come up with:
I just came up with this. It may not be the fastest, but I'm not using anything other than straight additions and comparisons. Of course, what stops you here is the recursion limit.
I made a one-liner version of the Sieve of Eratosthenes:
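A hypothetical one-liner in the spirit described (it prints as a side effect, so the expression itself evaluates to a list of Nones, as the EDIT below notes):

n = 100  # hypothetical bound
[print(x) for x in range(2, n + 1) if x not in {m for i in range(2, int(n ** 0.5) + 1) for m in range(i * i, n + 1, i)}]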
In terms of performance, I am pretty sure this isn't the fastest thing by any means, and in terms of readability / following PEP8, this is pretty terrible, but it's more the novelty of the length than anything.
EDIT: Note that this simply prints the sieve and does not return it (if you attempt to print the result, you will get a list of Nones); if you want it to return values, change the print(x) in the list comprehension to just x.
Empirical Analysis and Visualization of Various Approaches to the Sieve of Eratosthenes
I discovered this algorithm at the end of the 10th chapter (Maps, Hash Tables, and Skip Lists)
of "Data Structures & Algorithms in Python". I ended up writing three versions:
Naive SOE
Out of all the issues, the additional space usage in the return statement takes the cake as the worst.
Suboptimal SOE
A drastic improvement, but still lacking on space usage, and an unfriendly inner-loop interval progression.
Fast SOE
Easy to understand (note the usage of a fractional exponent, to be explicitly consistent with the mathematical definition of roots), and just as performant as @Pi Delport's answer.
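A sketch of what the fast version might look like (a hypothetical reconstruction; note the n ** 0.5 fractional exponent):

def fast_soe(n):
    # Flags for 0..n; the outer loop runs only up to n ** 0.5.
    is_prime = [False, False] + [True] * (n - 1)
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]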
Empirical Analysis
To compare all three implementations, along with the selected answer from @Pi Delport,
I ran it through 45 iterations, from primes in range 100 up to primes in range 4500, at intervals of 100 (that's the sweet spot for the visualization because, despite the consistency of the shape of the graphs, the growth of the naive implementation dwarfs the visibility of the other three).
You can tweak the visualization code on the GitHub gist, but here's one sample output:
Bitarray and 6k±1 for size and speed
1 billion in 5 seconds; 2^24 in 3 milliseconds.
I used bitarray for a smaller footprint. bitarray also allows statements like:
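For example (a small assumed illustration of bulk slice assignment):

from bitarray import bitarray

flags = bitarray(100)
flags.setall(True)
flags[4::2] = False    # clear every second bit from index 4 in one statement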
The sieve code will do a 1-billion-sized sieve in about 5 seconds on my old i7.
Here is my sieve:
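What follows is a reconstructed sketch in the spirit described - bitarray flags plus 6k±1 stepping - rather than the poster's exact code:

from bitarray import bitarray

def sieve_bitarray(limit):
    # One bit per number; bulk slice assignment does all the marking.
    flags = bitarray(limit + 1)
    flags.setall(True)
    flags[0:2] = False
    flags[4::2] = False            # even numbers above 2
    flags[9::6] = False            # odd multiples of 3 above 3
    i = 5
    while i * i <= limit:          # candidates of the form 6k - 1 and 6k + 1
        if flags[i]:
            flags[i * i::2 * i] = False
        j = i + 2
        if j * j <= limit and flags[j]:
            flags[j * j::2 * j] = False
        i += 6
    return flags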
To just return primes:
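A small assumed helper that converts the flag array into a list of primes:

def primes_from(flags):
    # Indices of set bits are the primes.
    return [i for i, bit in enumerate(flags) if bit]

print(primes_from(sieve_bitarray(100)))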
Output: a 2**24 search takes about 3 milliseconds:
Not sure if my code is efficient - anyone care to comment?
Probably the quickest way to get prime numbers is the following:
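For example (the bound here is arbitrary):

from sympy import primerange

print(list(primerange(2, 2000000)))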
In case you don't need to store them, just use the code above without conversion to a list. sympy.primerange is a generator, so it does not consume memory.
Using recursion and walrus operator:
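A hypothetical sketch of that idea (Python 3.8+; the head of the remaining list is always prime):

def sieve(nums):
    # Bind the head as a prime with :=, then recurse on its non-multiples.
    return [] if not nums else [p := nums[0]] + sieve([n for n in nums[1:] if n % p])

print(sieve(list(range(2, 100))))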
A basic sieve with numpy is amazingly fast - it may be the fastest implementation.

Segmented sieve (uses less memory):
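A sketch of a segmented sieve (hypothetical helper name and segment size):

import numpy as np

def segmented_sieve(limit, segment_size=1 << 18):
    # Base primes up to sqrt(limit), then one fixed-size window at a time,
    # so only O(segment_size) flags are alive at once.
    root = int(limit ** 0.5) + 1
    base = np.ones(root + 1, dtype=bool)
    base[:2] = False
    for i in range(2, int(root ** 0.5) + 1):
        if base[i]:
            base[i * i::i] = False
    base_primes = np.flatnonzero(base)

    primes = [int(p) for p in base_primes if p <= limit]
    for low in range(root + 1, limit + 1, segment_size):
        high = min(low + segment_size, limit + 1)
        seg = np.ones(high - low, dtype=bool)
        for p in base_primes:
            start = (low + p - 1) // p * p       # first multiple of p >= low
            seg[start - low::p] = False
        primes.extend((np.flatnonzero(seg) + low).tolist())
    return primes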
Here is my solution, the same as Wikipedia's:
Thanks for the interesting question!
Right now I have written from scratch two versions of the classical Sieve of Eratosthenes: one single-core (one CPU core), the other multi-core (using all CPU cores).
The main speedup of both (single- and multi-core) versions comes from using NumPy, an official Python package; install it once through the command python -m pip install numpy. Another great speedup of the multi-core version comes from using all CPU cores, which gives an almost N-times speedup for N cores compared to the single-core version.

Right now I have only a two-core laptop. My program produced the following timings:
Original above means the code from your question's body - the second one, which you had already optimized. Simple is my single-core version. Multi is my multi-core version.

As you can see above, computing primes less than 8 million took 72 seconds with your original code; my single-core version took 5.7 seconds, which is 12.7x faster than your code; and my 2-core version took 2.6 seconds, which is 2.1x faster than my single-core version and 27x faster than your original code.
In other words, I got a 27x speedup with my multi-core code compared to yours - really a lot! And this is only on a 2-core laptop; if you have 8 cores or more, you'll get a much bigger speedup. But remember that the real speedup on a multi-core machine shows up only for quite a large prime limit; try a limit of 64 million or bigger, and for that modify the line primes_limit_M = 8 in my code, which sets the number of millions.

Just to dwell on the details: my single-core version is almost like your code, but it uses NumPy, which makes any array operation very fast, instead of pure Pythonic loops over lists.
The multi-core version also uses NumPy, but splits the array holding the sieved range into as many pieces as there are CPU cores on your machine, each piece of equal size. Every CPU core then sets boolean flags in its own part of the array. This technique gives a speedup only until you hit the speed limit of your memory (RAM), so past some point, adding CPU cores buys no extra speedup.
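A hypothetical sketch of such a multi-core layout (a reconstruction, not the linked code; names like primes_limit_M and cpu_cores follow the text):

import multiprocessing as mp
import numpy as np

primes_limit_M = 8          # limit in millions, as in the text above

def mark_segment(args):
    # Worker: flag composites inside [lo, hi) using the shared base primes.
    lo, hi, base_primes = args
    seg = np.ones(hi - lo, dtype=bool)
    for p in base_primes:
        start = max(p * p, (lo + p - 1) // p * p)   # first target >= lo
        seg[start - lo::p] = False
    return lo, seg

def primes_multi(limit, cpu_cores=None):
    cpu_cores = cpu_cores or mp.cpu_count()
    root = int(limit ** 0.5) + 1
    base = np.ones(root + 1, dtype=bool)
    base[:2] = False
    for i in range(2, int(root ** 0.5) + 1):
        if base[i]:
            base[i * i::i] = False
    base_primes = np.flatnonzero(base)
    # Equal-sized chunks, one per core; workers only read base_primes.
    bounds = np.linspace(2, limit + 1, cpu_cores + 1, dtype=np.int64)
    tasks = [(int(bounds[k]), int(bounds[k + 1]), base_primes)
             for k in range(cpu_cores)]
    with mp.Pool(cpu_cores) as pool:
        parts = [np.flatnonzero(seg) + lo
                 for lo, seg in pool.map(mark_segment, tasks)]
    return np.concatenate(parts)

if __name__ == '__main__':
    print(len(primes_multi(primes_limit_M * 1000000)))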
By default I use all CPU cores in the multi-core version, but you may experiment with setting fewer cores than your machine has; that may give an even better speedup, because it is not guaranteed that the most cores give exactly the fastest result. Tweak the number of cores by changing the line cpu_cores = mp.cpu_count() to something like cpu_cores = 3.

Try it online!
If you're looking for something even faster, you can use numba and CUDA as well, if you have an Nvidia GPU. Feel free to optimize as needed. We use 2**24, which is ~16 million numbers, at 239 ms on an Nvidia 3070.
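A hypothetical numba/CUDA sketch along those lines (assumes a CUDA-capable GPU and numba installed; it marks multiples of every candidate divisor in parallel, accepting redundant work in exchange for simplicity):

import math
import numpy as np
from numba import cuda

@cuda.jit
def mark_multiples(flags, limit):
    # One thread per candidate divisor i; strike all multiples of i.
    i = cuda.grid(1) + 2
    if i * i <= limit:
        for m in range(i * i, limit + 1, i):
            flags[m] = False

def primes_cuda(limit):
    flags = np.ones(limit + 1, dtype=np.bool_)
    flags[:2] = False
    d_flags = cuda.to_device(flags)
    divisors = math.isqrt(limit) - 1          # threads cover i = 2..sqrt(limit)
    threads = 256
    blocks = (divisors + threads - 1) // threads
    mark_multiples[blocks, threads](d_flags, limit)
    return np.flatnonzero(d_flags.copy_to_host())

print(len(primes_cuda(2 ** 24)))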