DJ Bernstein 的《primegen》中 10^15 的限制在哪里？程序从何而来？

发布于 2024-12-06 22:48:17 字数 1014 浏览 4 评论 0原文

在 http://cr.yp.to/primegen.html 您可以找到以下程序的来源：使用阿特金筛来生成素数。正如作者所说，可能需要几个月的时间才能回复发送给他的电子邮件（我明白，他确实是一个忙碌的人！）我发布了这个问题。

该页面指出“primegen 可以生成最多 1000000000000000 个素数”。我试图理解为什么会这样。当然，最大限制是 2^64 ~ 2 * 10^19（long unsigned int 的大小），因为这就是数字的表示方式。我确信如果存在巨大的素数间隙（> 2^31），那么数字的打印将会失败。然而，在这个范围内，我认为不存在这样的素数差距。

要么是作者高估了界限（实际上大约是 10^19），要么是源代码中的某个地方算术运算可能会溢出或类似的情况。

有趣的是，你实际上可以运行它来获取数字 > 。 10^15：

./primes 10000000000000000 10000000000000100
10000000000000061
10000000000000069
10000000000000079
10000000000000099

如果您相信 Wolfram Alpha，那么它是正确的。

我进行了“逆向工程”的一些事实：

数字按 1,920 * PRIMEGEN_WORDS = 3,932,160 个数字的批次进行筛选（请参阅 primegen_next.c 中的 primegen_fill 函数）
PRIMEGEN_WORDS 控制单次筛选的大小 - 您可以在 primegen_impl.h 中调整它以适应你的CPU缓存，
筛子本身的实现在primegen.c文件中 - 我假设这是正确的；你得到的是 pg->buf 中素数的位掩码（参见 primegen_fill 函数）。
分析位掩码并将素数存储在 pg->p 数组中。

我认为没有任何可能发生溢出的地方。

原文

At http://cr.yp.to/primegen.html you can find sources of program that uses Atkin's sieve to generate primes. As the author says that it may take few months to answer an e-mail sent to him (I understand that, he sure is an occupied man!) I'm posting this question.

The page states that 'primegen can generate primes up to 1000000000000000'. I am trying to understand why it is so. There is of course a limitation up to 2^64 ~ 2 * 10^19 (size of long unsigned int) because this is how the numbers are represented. I know for sure that if there would be a huge prime gap (> 2^31) then printing of numbers would fail. However in this range I think there is no such prime gap.

Either the author overestimated the bound (and really it is around 10^19) or there is a place in the source code where the arithmetic operation can overflow or something like that.

The funny thing is that you actually MAY run it for numbers > 10^15:

./primes 10000000000000000 10000000000000100
10000000000000061
10000000000000069
10000000000000079
10000000000000099

and if you believe Wolfram Alpha, it is correct.

Some facts I had "reverse-engineered":

numbers are sifted in batches of 1,920 * PRIMEGEN_WORDS = 3,932,160 numbers (see primegen_fill function in primegen_next.c)
PRIMEGEN_WORDS controls how big a single sifting is - you can adjust it in primegen_impl.h to fit your CPU cache,
the implementation of the sieve itself is in primegen.c file - I assume it is correct; what you get is a bitmask of primes in pg->buf (see primegen_fill function)
The bitmask is analyzed and primes are stored in pg->p array.

I see no point where the overflow may happen.

分享到QQ

分享到微博