Fastest prime test for small-ish numbers
I'm playing through project Euler in my spare time, and it's come to the point where I need to do some refactoring. I've implemented Miller-Rabin, as well as a few sieves. I've heard before that sieves are actually faster for small-ish numbers, as in under a few million. Does anybody have any information on this? Google wasn't very helpful.
5 Answers
I recommend a tiered approach. First, make sure there are no small prime factors. Trial-dividing by the first 20 or 30 primes works, though if you use a clever approach you can reduce the number of divisions needed by using gcds. This step filters out about 90% of the composites.
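For instance, a couple of gcds against precomputed products of small primes can stand in for a dozen or more trial divisions; a minimal sketch in C (the grouping of the primes into two constants is an illustrative choice):

```c
#include <stdint.h>

/* Greatest common divisor by Euclid's algorithm. */
static uint64_t gcd64(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Returns 1 if n shares a factor with a prime <= 53, 0 otherwise.
 * Two gcds replace sixteen trial divisions.  Note that primes
 * n <= 53 also trigger this, so handle tiny n separately. */
static int has_small_factor(uint64_t n)
{
    const uint64_t P1 = 3ULL * 5 * 7 * 11 * 13 * 17 * 19 * 23;
    const uint64_t P2 = 29ULL * 31 * 37 * 41 * 43 * 47 * 53;
    if ((n & 1) == 0) return 1;
    return gcd64(n, P1) != 1 || gcd64(n, P2) != 1;
}
```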
Next, test if the number is a strong probable prime (Miller-Rabin test) to base 2. This step removes almost all remaining composites, but some rare composites can pass.
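For reference, a strong-probable-prime check to a single base might be sketched like this, assuming odd n > 2 that fits in 32 bits so the modular products fit in 64-bit arithmetic:

```c
#include <stdint.h>

/* (a * b) mod m without overflow; safe because a, b < m < 2^32. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m)
{
    return (a * b) % m;
}

/* Strong probable prime test: returns 1 if n is a strong probable
 * prime to the given base, 0 if n is proven composite. */
int is_sprp(uint32_t n, uint32_t base)
{
    uint64_t d = n - 1;
    int s = 0;
    while ((d & 1) == 0) { d >>= 1; s++; }   /* n-1 = d * 2^s, d odd */

    uint64_t x = 1, b = base % n;
    for (uint64_t e = d; e; e >>= 1) {        /* x = base^d mod n */
        if (e & 1) x = mulmod(x, b, n);
        b = mulmod(b, b, n);
    }
    if (x == 1 || x == n - 1) return 1;
    for (int r = 1; r < s; r++) {             /* square up to s-1 times */
        x = mulmod(x, x, n);
        if (x == n - 1) return 1;
    }
    return 0;
}
```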
The final proving step depends on how large you want to go. If you are willing to work in a small range, do a binary search on a list of 2-pseudoprimes up to the largest you allow. If that's 2^32, your list will have only 10,403 members, so the lookup should take only 14 queries.
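The lookup step then reduces to a binary search; a sketch assuming the 10,403 strong base-2 pseudoprimes below 2^32 have been loaded into a sorted array (is_sprp is the helper sketched above; the parameter names are illustrative):

```c
#include <stdint.h>
#include <stddef.h>

int is_sprp(uint32_t n, uint32_t base);   /* sketched above */

/* Standard binary search over a sorted array; returns 1 if found. */
static int in_sorted_list(uint32_t n, const uint32_t *list, size_t len)
{
    size_t lo = 0, hi = len;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] < n)      lo = mid + 1;
        else if (list[mid] > n) hi = mid;
        else                    return 1;
    }
    return 0;
}

/* Deterministic for n < 2^32, given the sorted list of the 10,403
 * strong base-2 pseudoprimes below 2^32 (loaded elsewhere): a strong
 * probable prime to base 2 is prime unless it is on the list. */
int is_prime32(uint32_t n, const uint32_t *pseudo2, size_t len)
{
    if (n < 2) return 0;
    if (n == 2) return 1;
    if ((n & 1) == 0) return 0;
    return is_sprp(n, 2) && !in_sorted_list(n, pseudo2, len);
}
```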
If you want to go up to 2^64, it now suffices (thanks to the work of Jan Feitsma) to check if the number is a BPSW pseudoprime. (You could also download the 3 GB list of all exceptions, remove those which trial division would remove, and write a disk-based binary search.) T. R. Nicely has a nice page explaining how to implement this reasonably efficiently.
If you need to go higher, implement the above method and use it as a subroutine for a Pocklington-style test. This stretches the definition of "small-ish"; if you want more information on these methods, just ask.
As a variant on the notion of pre-computation, you can first cheaply check whether the candidate number p is divisible by 2, 3, 5, 7, or 11. If not, then declare p prime if 2^(p-1) ≡ 1 (mod p). This will fail at some point, but it works up to 100 million because I tested it (pre-computation).

In other words, all the small-ish Fermat pseudoprimes to the base 2 are divisible by one of 3, 5, 7, or 11.
EDIT:
As correctly noted by @starblue, the above is simply wrong. I had a bug in my program. The best I can do is amend the above to:
If candidate p is divisible by 2, 3, 5, 7, or 11, declare it composite;
else if p is one of {4181921, 4469471, 5256091, 9006401, 9863461}, declare it composite;
else if p passes the Miller-Rabin test for bases 2 and 5, declare it prime;
else declare it composite.
This I tested for integers less than 10,000,000. Perhaps a different pair of bases would do even better.
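Transcribed into C, the amended recipe might look like this (a sketch only, valid for the tested range n < 10,000,000; is_sprp is assumed to be a strong-probable-prime helper like the one sketched in the first answer, and the small-prime guard at the top is added so that 2, 3, 5, 7, and 11 themselves are not misclassified):

```c
#include <stdint.h>

int is_sprp(uint32_t n, uint32_t base);  /* as sketched in the first answer */

/* Amended test, for n < 10,000,000 only. */
int is_prime_small(uint32_t n)
{
    static const uint32_t exceptions[] =
        { 4181921, 4469471, 5256091, 9006401, 9863461 };

    /* Guard so the small primes themselves are not declared composite. */
    if (n == 2 || n == 3 || n == 5 || n == 7 || n == 11) return 1;

    if (n < 2 || n % 2 == 0 || n % 3 == 0 || n % 5 == 0 ||
        n % 7 == 0 || n % 11 == 0)
        return 0;

    for (int i = 0; i < 5; i++)
        if (n == exceptions[i]) return 0;

    return is_sprp(n, 2) && is_sprp(n, 5);
}
```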
Please accept my apologies for my mistakes.
EDIT 2:
Well, it appears that the information I was after is already on the Wikipedia page for the Miller-Rabin algorithm, the section titled "Deterministic variants of the test".
The only way to know is to benchmark it yourself. When you do, write it up and post it online somewhere.
Assuming n < 4669921, it would be very fast:
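One possibility (a sketch, not necessarily the test originally meant here) is deterministic Miller-Rabin with bases 2, 7, and 61, which is known to be correct for all n < 4,759,123,141, so it certainly covers n < 4669921:

```c
#include <stdint.h>

/* base^e mod m, with m < 2^32 so 64-bit products cannot overflow. */
static uint64_t powmod(uint64_t b, uint64_t e, uint64_t m)
{
    uint64_t x = 1;
    b %= m;
    while (e) {
        if (e & 1) x = x * b % m;
        b = b * b % m;
        e >>= 1;
    }
    return x;
}

/* Strong-probable-prime check to a single base. */
static int sprp(uint32_t n, uint32_t a)
{
    uint32_t d = n - 1;
    int s = 0;
    while ((d & 1) == 0) { d >>= 1; s++; }   /* n-1 = d * 2^s */
    uint64_t x = powmod(a, d, n);
    if (x == 1 || x == n - 1) return 1;
    while (--s > 0) {
        x = x * x % n;
        if (x == n - 1) return 1;
    }
    return 0;
}

/* Bases 2, 7, 61 are deterministic for n < 4,759,123,141,
 * which covers n < 4669921 (and indeed every 32-bit n). */
int is_prime(uint32_t n)
{
    static const uint32_t bases[] = { 2, 7, 61 };
    if (n < 2) return 0;
    for (int i = 0; i < 3; i++) {
        if (n == bases[i]) return 1;
        if (n % bases[i] == 0) return 0;
        if (!sprp(n, bases[i])) return 0;
    }
    return 1;
}
```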
Yes, you'll find with most algorithms that you can trade space for time. In other words, by allowing the use of more memory, the speed is greatly increased *a.
I don't actually know the Miller-Rabin algorithm but, unless it's simpler than a single shift-left/add and memory extraction, it will be blown out of the water by a pre-calculated sieve.
The important thing here is pre-calculated. It's a good idea, in terms of performance, to pre-calculate things like this since the first million primes will be unlikely to change in the near future :-)
In other words, create your sieve with something like:
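Something like the following sketch captures the idea (the table values shown are just the first twenty; isPrimeFn is an illustrative name for the computational fallback):

```c
/* Pre-computed table: primeTbl[i] is 1 if i is prime, else 0.
 * Generated offline by a program like the one further down. */
static const unsigned char primeTbl[] = {
    0, 0, 1, 1, 0, 1, 0, 1, 0, 0,
    0, 1, 0, 1, 0, 0, 0, 1, 0, 1,
    /* ... continues up to four million entries ... */
};

/* Slower computational fallback for values beyond the table. */
int isPrimeFn(unsigned long n);

/* Table lookup for small values, computation otherwise. */
#define isPrime(x) \
    ((unsigned long)(x) < sizeof(primeTbl) ? primeTbl[(x)] : isPrimeFn(x))
```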
with all the usual caveats about not passing things like a++ into macros. This gives you the best of both worlds: a blindingly fast table lookup for "small-ish" primes, dropping back to a calculation method for those outside the range.

Obviously you would write a program using one of the other methods to generate that lookup table - you don't really want to have to type it all in by hand.
But, as with all optimisation questions, measure, don't guess!
*a A classic case of this was some trig functions I once had to write for an embedded system. This was a competitive contract bid and the system had a little more storage than CPU grunt.
We actually won the contract since our benchmark figures for the functions blew the competition away.
Why? Because we pre-calculated the values into a lookup table originally calculated on another machine. By judicious use of reduction (bringing the input values down below 90 degrees) and trig properties (the fact that cosine is just a phase shift of sine and that the other three quadrants are related to the first), we got the lookup table down to 180 entries (one per half degree).
The best solutions are those that are elegant and devious :-)
For what it's worth, the following C code will generate such a table for you, all the primes below four million (283,000 of them).
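A sieve of Eratosthenes that emits such a table might look like this (a reconstruction in the spirit of the answer, not the author's original listing):

```c
#include <stdio.h>
#include <string.h>

#define LIMIT 4000000

static unsigned char primeTbl[LIMIT];

int main(void)
{
    /* Sieve of Eratosthenes: start with everything >= 2 marked prime,
     * then knock out multiples of each prime found. */
    memset(primeTbl, 1, sizeof(primeTbl));
    primeTbl[0] = primeTbl[1] = 0;
    for (long i = 2; i * i < LIMIT; i++)
        if (primeTbl[i])
            for (long j = i * i; j < LIMIT; j += i)
                primeTbl[j] = 0;

    /* Emit the table as a C initializer, ten entries per line. */
    printf("static const unsigned char primeTbl[] = {\n");
    for (long i = 0; i < LIMIT; i++)
        printf("%d%s", primeTbl[i],
               (i == LIMIT - 1) ? "\n};\n" : (i % 10 == 9) ? ",\n" : ",");
    return 0;
}
```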
If you can bump up the primeTbl table to sixteen million entries (16M), you'll find that's enough to keep the prime count above a million (the first 1,031,130 primes).

Now there are ways to make that take less storage, such as only storing odd numbers and adjusting the macro to take care of that, or using a bit mask instead of unsigned characters. I prefer simplicity of algorithms myself if the memory is available.