BigInteger.isProbablePrime 似乎比它所说的更确定

发布于 2025-01-15 21:06:46 字数 1634 浏览 4 评论 0原文

我理解 确定性 参数的意思是:

确定性 - 调用者愿意容忍的不确定性的度量:如果调用返回 true,则此 BigInteger 为素数的概率超过 (1 - 1/2确定性)

从我的实验来看,似乎超过它相当多!下面的代码查找 2 到 100 万之间的“可能的素数”,并检查一组确定的素数以查看是否为误报。

我使用的确定性参数为 2。因此,我预计只有 75% 的“可能素数”是实际素数。 (1 - 1/22 = 0.75 = 75%。)

实际上,99.9% 的情况下它都是正确的。

我对“确定性”含义的理解是否正确?我怀疑,如果我在实验中看到的确定性超出了我的预期,那么情况可能并非如此。

import java.math.BigInteger;
import java.util.BitSet;

import static java.lang.Math.sqrt;

public class PrimesCalculator {
    public final int max;
    private final BitSet sieve;  // Set of all non-primes from 2 to max.

    public PrimesCalculator(int max) {
        this.max = max;
        sieve = new BitSet(max+1);
        for (int n = 2, sqrtMax = (int) sqrt(max); n < sqrtMax; n++)
            for (int i = n * 2; i < max; i += n)
                sieve.set(i);
    }

    public boolean isPrime(int n) {
        return !sieve.get(n);
    }

    public static void main(String[] args) {
        PrimesCalculator calc = new PrimesCalculator(1_000_000);
        int numPrimes = 0;
        int numProbablePrimes = 0;
        for (int i = 2; i < calc.max; i++)
            if (BigInteger.valueOf(i).isProbablePrime(2)) {
                numProbablePrimes++;
                if (calc.isPrime(i))
                    numPrimes++;
            }
        System.out.printf("%s/%s (%s%%)%n", numPrimes, numProbablePrimes, numPrimes / (numProbablePrimes / 100.0));
    }
}

I understand the certainty argument to mean:

certainty - a measure of the uncertainty that the caller is willing to tolerate: if the call returns true the probability that this BigInteger is prime exceeds (1 - 1/2certainty)

From my experiments, it seems to exceed it by quite a lot! The code below finds "probable primes" between 2 and 1 million and checks against a set of definite primes to see if it was a false positive.

I'm using a certainty argument of 2. I therefore expect that only 75% of "probable primes" will be actual primes. (1 - 1/22 = 0.75 = 75%.)

Actually, it gets it right 99.9% of the time.

Is my understanding of the meaning of "certainty" correct? I suspect it might not be if the certainty I've seen experimentally exceeds my expectation by so much.

import java.math.BigInteger;
import java.util.BitSet;

import static java.lang.Math.sqrt;

public class PrimesCalculator {
    public final int max;
    private final BitSet sieve;  // Set of all non-primes from 2 to max.

    public PrimesCalculator(int max) {
        this.max = max;
        sieve = new BitSet(max+1);
        for (int n = 2, sqrtMax = (int) sqrt(max); n < sqrtMax; n++)
            for (int i = n * 2; i < max; i += n)
                sieve.set(i);
    }

    public boolean isPrime(int n) {
        return !sieve.get(n);
    }

    public static void main(String[] args) {
        PrimesCalculator calc = new PrimesCalculator(1_000_000);
        int numPrimes = 0;
        int numProbablePrimes = 0;
        for (int i = 2; i < calc.max; i++)
            if (BigInteger.valueOf(i).isProbablePrime(2)) {
                numProbablePrimes++;
                if (calc.isPrime(i))
                    numPrimes++;
            }
        System.out.printf("%s/%s (%s%%)%n", numPrimes, numProbablePrimes, numPrimes / (numProbablePrimes / 100.0));
    }
}

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

惯饮孤独 2025-01-22 21:06:46

这些陈述经常引起混乱。我将尝试在这里更好地解释它。

Java 的 BigInteger.isProbablePrime() 根本不包含对小整数的优化,除了 0、1 和 2 被视为特殊情况,并且除 2 之外的所有偶数都立即声明为复合整数。从撰写本文时起,这将是正确的。

使用 Miller-Rabin (MR) 素性测试检查所有其他奇数整数的素性。此外,如果整数为 100 位或更大,还会使用称为 Lucas-Lehmer 测试的方法进行检查。

MR 是一个复杂的素性测试,其解释和分析超出了 SO 答案的范围。关键是,就 MR 而言,并非所有复合材料都是一样的。非常非常小的一部分是顽固的,这意味着要发现它们的复合性要困难得多。举几个例子:在小奇数组合中,91、703 和 1891 是顽固的。 MR 通过尝试多次随机尝试来发现整数的复合性,从而克服了这个问题。 Rabin 的分析表明,对于最顽固的复合材料,单次随机尝试仍有至少 75% (3/4) 的概率揭示其复合性。

该算法无法知道哪些复合材料是顽固的。对于密码学来说,得到一个你被告知是质数但实际上是复合数的数字可能会带来灾难性的后果。因此,为了安全起见,确定性公式假设它测试素数的所有数字都是顽固的合数或实素数。 certainty 参数几乎等同于指定 MR 算法需要执行的随机尝试次数。实际上,Java 实现中的确定性参数与随机尝试次数之间的关系似乎更复杂,仅通过阅读源代码我自己并不能完全理解它。

但请记住我说过,顽固的复合材料的比例非常非常小。您测试的几乎每个复合材料都不是顽固的,并且随着素数变大,仅在一次随机尝试后就会失败 MR。这就是为什么你的成功率看起来是 99.9%。

作为了解不同复合材料表现如何的实验,请将程序更改为重复尝试确认(例如 1891 年)的复合性。您将看到接近仅 75% 成功的结果。

此处列出了相对较小的 MR 顽固复合材料列表。

These statements frequently cause confusion. I will attempt to explain it a little better here.

Java's BigInteger.isProbablePrime() contains no optimizations for small integers at all, except 0, 1, and 2 are treated as special cases and all even integers except 2 are immediately declared composite. This will be true as of the time this was written.

All other odd integers are checked for primality using the Miller-Rabin (MR) primality test. In addition, if the integer is 100 bits or larger it is also checked with something called the Lucas-Lehmer test.

MR is a complicated primality test whose explanation and analysis are beyond the scope of an SO answer. The key point is that, as far as MR is concerned, not all composites are created equal. A very, very tiny fraction are stubborn, which means it's much, much harder to discover their compositeness. A few examples: amongst the small odd composites, 91, 703, and 1891 are stubborn. MR overcomes this by trying multiple randomized attempts to discover the compositeness of an integer. Rabin's analysis shows that, for the most stubborn composites, a single randomized attempt still has at least a 75% (3/4) probability of revealing its compositeness.

The algorithm has no way to know which composites are stubborn. And for cryptography, getting a number you're told is prime but is actually composite can be potentially disastrous. So, for safety, the formula for certainty assumes that all the numbers it tests for primality are are stubborn composites or real primes. The certainty argument is almost equivalent to specifying the number of randomized attempts that the MR algorithm needs to perform. In reality, the relationship between the certainty argument and the number of randomized attempts in the Java implementation seems more complicated and I don't completely understand it myself just from reading the source code.

But remember I said that the fraction of composites that are stubborn is very, very small. Almost every composite you test is not stubborn, and will fail MR after only a single randomized attempt as primes get larger. That's why your success rate appears to be 99.9%.

As an experiment to see how different composites perform, change your program to instead just repeatedly try confirming the compositeness of, say, 1891. You will see something closer to only 75% success.

A list of relatively small MR-stubborn composites is here.

鸩远一方 2025-01-22 21:06:46

您引用的文档是正确的。

确定性 - 调用者愿意容忍的不确定性的度量:如果调用返回 true,则此 BigInteger 为素数的概率超过 (1 - 1/2确定性)

99.9% 确实超过 1 - 1 /(22) = 3/4,所以您向我们展示的内容没有任何问题。

该实现不保证这完全概率,它只是提供了一个实现,其错误明确地受到该确定性的限制。

大多数质量素性测试人员都会对小素数进行大量优化,或者更确切地说,对除数是小合数的数字进行大量优化。这些可能在算法的随机方面之前启动,从而导致小素数的准确度高于平常。

The documentation you've cited is correct.

certainty - a measure of the uncertainty that the caller is willing to tolerate: if the call returns true the probability that this BigInteger is prime exceeds (1 - 1/2certainty)

99.9% indeed exceeds 1 - 1/(22) = 3/4, so there's nothing wrong with what you've shown us.

The implementation makes no guarantees that that's exactly the probability, it just provides an implementation whose error is definitely bounded by that certainty.

Most quality primality testers will have lots of optimizations for small primes, or rather, numbers whose divisors are small composite numbers. These likely kick in before the random aspects of the algorithm, resulting in higher-than-usual accuracy for small primes.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文