使用 gmp 有效地分解大数

发布于 2024-10-04 17:11:30 字数 350 浏览 5 评论 0原文

我需要获取可以轻松达到 1k 位的大数的所有素因数。 这些数字实际上是随机的,所以应该不难。 我如何高效地做到这一点?我使用带有 GMP 库的 C++。

编辑: 我想你们都误解了我的意思。
我所说的素数的意思是得到该数的所有素因数。
对不起我的英语,在我的语言中,质数和因数是相同的:)


澄清(来自OP的其他帖子):

我需要的是一种有效因式分解(找到数字的质因数)大数(可能达到2048位)的方法使用 C++ 和 GMP(Gnu Multiple Precession lib)或不太优选的任何其他方式。 这些数字实际上是随机的,因此几乎不可能难以分解,即使数字难以分解,我也可以重新滚动该数字(但不能选择)。

I need to get all the prime factors of large numbers that can easily get to 1k bits.
The numbers are practically random so it shouldn't be hard.
How do I do it efficiently? I use C++ with GMP library.

EDIT:
I guess you all misunderstood me.
What I mean by prime a number is to get all prime factors of the number.
Sorry for my english, in my language prime and factor are the same :)


clarification (from OP's other post):

What I need is a way to efficiently factor(find prime factors of a number) large numbers(may get to 2048 bits) using C++ and GMP(Gnu Multiple Precession lib) or less preferably any other way.
The numbers are practically random so there is little chance it will be hard to factor, and even if the number is hard to factor, I can re-roll the number(can't choose though).

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

乜一 2024-10-11 17:11:30

一个好的开始是对小素数进行一些预过滤,例如所有低于 100 000 左右的素数。只需尝试除以其中的每一个(创建一个表,然后在运行时加载该表或将其作为代码中的静态数据)。它可能看起来很慢而且很愚蠢,但如果数字是完全随机的,这将以很大的概率非常快地给你一些因素。然后查看剩余的数量并决定下一步做什么。如果它很小(“小”意味着什么取决于你),你可以尝试素数测试(我认为 GMP 中有一些东西),如果它给出它是素数,那么在大多数情况下你可以信任它。否则你必须进一步考虑它。

如果您的数字确实很大并且您关心性能,那么您肯定需要实现一些比愚蠢的除法更复杂的东西。看看二次筛(尝试维基百科)。它非常简单但非常强大。如果您能够应对挑战,请尝试 MPQS,它是二次筛算法的一种变体。 此论坛是一个很好的信息来源。甚至还有您需要的工具的现有实现 - 参见示例

请注意,1k 位的数字无论如何都是巨大的。如果您幸运的话,对这样一个数字进行因式分解(即使使用 MPQS 或其他方法)可能需要数年时间,否则可能会花费一辈子。我认为 MPQS 对于大约 100-400 位的数字表现良好(如果它们由两个几乎同样大的素数组成,这当然是最难的情况)。

A good start would be some pre-filtering with small primes, say about all primes lower than 100 000 or so. Simply try to divide by every single one of them (create a table which you then load at runtime or have it as static data in your code). It might seem slow and stupid, but if the number is totally random, this will give you some factors very fast with a huge probability. Then look at the remaining number and decide what to do next. If it is quite small (what "small" means is up to you) you could try a primality test (there is something in GMP i think) and if it gives it is a prime, you can in most of the cases trust it. Otherwise you have to factor it further.

If your numbers are really huge and you care about performance, then you definitely need to implement something more sophisticated than just a stupid division. Look at Quadratic Sieve (try wikipedia). It is quite simple but very powerful. If you are up to the chalenge, try MPQS, a variant of the quadratic sieve algorithm. This forum is a good source of information. There are even existing implementations of a tool you need - see for example this.

Note though that numbers with 1k bits are huge by all means. Factoring such a number (even with MPQS or others) might take years if you are lucky and forever if not. I think that MPQS performs well with numbers of about 100-400 bits (if they are composed of two primes almost equally large, which is the hardest case of course).

雨的味道风的声音 2024-10-11 17:11:30

下面是 Java 中的示例算法(它不是带有 GMP 的 C++,但转换应该非常简单):

  1. 生成位长 Nbits 的随机数 x
  2. 尝试分解所有质因数 < 100,保留除 x 的质因数列表。
  3. 使用 Java 的 isProbablePrime 方法
  4. 如果剩余的因式乘积以足够的概率为素数,则我们成功分解了 x。 (STOP)
  5. 否则,剩余的因子乘积肯定是复合的(请参阅 isProbablePrime 文档)。
  6. 当我们还有时间时,我们运行 Pollard rho 算法,直到找到除数 d。
  7. 如果我们没有时间,我们就失败了。 (STOP)
  8. 我们找到了除数 d。所以我们分解出 d,将 d 的质因数添加到 x 的质因数列表中,然后转到步骤 4。

该算法的所有参数都在程序列表的开头附近。我寻找 1024 位随机数,超时为 250 毫秒,然后继续运行程序,直到得到一个至少有 4 个质因数的数字 x(有时程序会找到一个有 1、2 或 3 个质因数的数字)第一的)。使用此参数设置,在我的 2.66Ghz iMac 上通常需要大约 15-20 秒。

Pollard 的 rho 算法并不是那么高效,但与二次筛 (QS )或通用数域筛(GNFS)——我只是想看看简单的算法如何工作了。


为什么这有效:(尽管你们中的许多人声称这是一个难题)

简单的事实是,素数并不罕见。对于 1024 位数字,素数定理表示每 1024 ln 2 中大约有 1 个(= 大约 710)
数字是素数。

因此,如果我生成一个随机数 x,它是素数,并且我接受概率素数检测,那么我就成功分解了 x。

如果它不是素数,但我很快分解出了几个小因子,而剩下的因子(概率上)是素数,那么我就成功分解了 x 。

否则我就会放弃并生成一个新的随机数。 (OP说这是可以接受的)

大多数成功分解的数字将有1个大质因数和一些小质因数。

难以分解的数字是那些具有不小的质因数和至少 2 个大质因数的数字(其中包括作为两个大数乘积的加密密钥;OP 没有提到密码学),我可以当我没时间的时候跳过它们。

package com.example;

import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class FindLargeRandomComposite {
    final static private int[] smallPrimes = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 
        31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 
        73, 79, 83, 89, 97};

    final static private int maxTime = 250;
    final static private int Nbits = 1024;
    final static private int minFactors = 4;
    final static private int NCERTAINTY = 4096;

    private interface Predicate { public boolean isTrue(); }

    static public void main(String[] args)
    {
        Random r = new Random();
        boolean found = false;
        BigInteger x=null;
        List<BigInteger> factors=null;
        long startTime = System.currentTimeMillis();
        while (!found)
        {
            x = new BigInteger(Nbits, r);
            factors = new ArrayList<BigInteger>();
            Predicate keepRunning = new Predicate() {
                final private long stopTime = System.currentTimeMillis() + maxTime;
                public boolean isTrue() {
                    return System.currentTimeMillis() < stopTime;
                }
            };
            found = factor(x, factors, keepRunning);
            System.out.println((found?(factors.size()+" factors "):"not factored ")+x+"= product: "+factors);
            if (factors.size() < minFactors)
                found = false;
        }
        long stopTime = System.currentTimeMillis();
        System.out.println("Product verification: "+(x.equals(product(factors))?"passed":"failed"));
        System.out.println("elapsed time: "+(stopTime-startTime)+" msec");
    }

    private static BigInteger product(List<BigInteger> factors) {
        BigInteger result = BigInteger.ONE;
        for (BigInteger f : factors)
            result = result.multiply(f);
        return result;
    }

    private static BigInteger findFactor(BigInteger x, List<BigInteger> factors,
            BigInteger divisor)
    {
        BigInteger[] qr = x.divideAndRemainder(divisor);
        if (qr[1].equals(BigInteger.ZERO))
        {
            factors.add(divisor);
            return qr[0];
        }
        else
            return x;
    }

    private static BigInteger findRepeatedFactor(BigInteger x,
            List<BigInteger> factors, BigInteger p) {
        BigInteger xprev = null;
        while (xprev != x)
        {
            xprev = x;
            x = findFactor(x, factors, p);
        }
        return x;
    }

    private static BigInteger f(BigInteger x, BigInteger n)
    {
        return x.multiply(x).add(BigInteger.ONE).mod(n);
    }

    private static BigInteger gcd(BigInteger a, BigInteger b) {
        while (!b.equals(BigInteger.ZERO))
        {
            BigInteger nextb = a.mod(b);
            a = b;
            b = nextb;
        }
        return a;
    }
    private static BigInteger tryPollardRho(BigInteger n,
            List<BigInteger> factors, Predicate keepRunning) {
        BigInteger x = new BigInteger("2");
        BigInteger y = x;
        BigInteger d = BigInteger.ONE;
        while (d.equals(BigInteger.ONE) && keepRunning.isTrue())
        {
            x = f(x,n);
            y = f(f(y,n),n);
            d = gcd(x.subtract(y).abs(), n);
        }
        if (d.equals(n))
            return x;
        BigInteger[] qr = n.divideAndRemainder(d);
        if (!qr[1].equals(BigInteger.ZERO))
            throw new IllegalStateException("Huh?");
        // d is a factor of x. But it may not be prime, so run it through the factoring algorithm.
        factor(d, factors, keepRunning);
        return qr[0];
    }

    private static boolean factor(BigInteger x0, List<BigInteger> factors,
            Predicate keepRunning) {

        BigInteger x = x0;
        for (int p0 : smallPrimes)
        {
            BigInteger p = new BigInteger(Integer.toString(p0));
            x = findRepeatedFactor(x, factors, p);          
        }
        boolean done = false;
        while (!done && keepRunning.isTrue())
        {
            done = x.equals(BigInteger.ONE) || x.isProbablePrime(NCERTAINTY);
            if (!done)
            {
                x = tryPollardRho(x, factors, keepRunning);
            }
        }
        if (!x.equals(BigInteger.ONE))
            factors.add(x);
        return done;
    }
}

Below is a sample algorithm in Java (it's not C++ with GMP, but converting should be pretty straightforward) that:

  1. generates a random number x of bitlength Nbits
  2. tries to factor out all prime factors < 100, keeping a list of prime factors that divide x.
  3. tests to see if the remaining factor is prime using Java's isProbablePrime method
  4. If the remaining factor product is prime with sufficient probability, we have succeeded in factoring x. (STOP)
  5. Otherwise the remaining factor product is definitely composite (see the isProbablePrime docs).
  6. While we still have time, we run the Pollard rho algorithm until we find a divisor d.
  7. If we run out of time, we have failed. (STOP)
  8. We have found a divisor d. So we factor out d, add the prime factors of d to the list of prime factors of x, and go to step 4.

All the parameters of this algorithm are near the beginning of the program listing. I looked for 1024-bit random numbers, with a timeout of 250 milliseconds, and I keep running the program until I get a number x with at least 4 prime factors (sometimes the program finds a number with 1, 2, or 3 prime factors first). With this parameter set, it usually takes about 15-20 seconds on my 2.66Ghz iMac.

Pollard's rho algorithm isn't really that efficient, but it's simple, compared to the quadratic sieve (QS) or the general number field sieve (GNFS) -- I just wanted to see how the simple algorithm worked.


Why this works: (despite the claim of many of you that this is a hard problem)

The plain fact of it is, that prime numbers aren't that rare. For 1024-bit numbers, the Prime Number Theorem says that about 1 in every 1024 ln 2 (= about 710)
numbers is prime.

So if I generate a random number x that is prime, and I accept probabilistic prime detection, I've successfully factored x.

If it's not prime, but I quickly factor out a few small factors, and the remaining factor is (probabilistically) prime, then I've successfully factored x.

Otherwise I just give up and generate a new random number. (which the OP says is acceptible)

Most of the numbers successfully factored will have 1 large prime factor and a few small prime factors.

The numbers that are hard to factor are the ones that have no small prime factors and at least 2 large prime factors (these include cryptographic keys that are the product of two large numbers; the OP has said nothing about cryptography), and I can just skip them when I run out of time.

package com.example;

import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class FindLargeRandomComposite {
    final static private int[] smallPrimes = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 
        31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 
        73, 79, 83, 89, 97};

    final static private int maxTime = 250;
    final static private int Nbits = 1024;
    final static private int minFactors = 4;
    final static private int NCERTAINTY = 4096;

    private interface Predicate { public boolean isTrue(); }

    static public void main(String[] args)
    {
        Random r = new Random();
        boolean found = false;
        BigInteger x=null;
        List<BigInteger> factors=null;
        long startTime = System.currentTimeMillis();
        while (!found)
        {
            x = new BigInteger(Nbits, r);
            factors = new ArrayList<BigInteger>();
            Predicate keepRunning = new Predicate() {
                final private long stopTime = System.currentTimeMillis() + maxTime;
                public boolean isTrue() {
                    return System.currentTimeMillis() < stopTime;
                }
            };
            found = factor(x, factors, keepRunning);
            System.out.println((found?(factors.size()+" factors "):"not factored ")+x+"= product: "+factors);
            if (factors.size() < minFactors)
                found = false;
        }
        long stopTime = System.currentTimeMillis();
        System.out.println("Product verification: "+(x.equals(product(factors))?"passed":"failed"));
        System.out.println("elapsed time: "+(stopTime-startTime)+" msec");
    }

    private static BigInteger product(List<BigInteger> factors) {
        BigInteger result = BigInteger.ONE;
        for (BigInteger f : factors)
            result = result.multiply(f);
        return result;
    }

    private static BigInteger findFactor(BigInteger x, List<BigInteger> factors,
            BigInteger divisor)
    {
        BigInteger[] qr = x.divideAndRemainder(divisor);
        if (qr[1].equals(BigInteger.ZERO))
        {
            factors.add(divisor);
            return qr[0];
        }
        else
            return x;
    }

    private static BigInteger findRepeatedFactor(BigInteger x,
            List<BigInteger> factors, BigInteger p) {
        BigInteger xprev = null;
        while (xprev != x)
        {
            xprev = x;
            x = findFactor(x, factors, p);
        }
        return x;
    }

    private static BigInteger f(BigInteger x, BigInteger n)
    {
        return x.multiply(x).add(BigInteger.ONE).mod(n);
    }

    private static BigInteger gcd(BigInteger a, BigInteger b) {
        while (!b.equals(BigInteger.ZERO))
        {
            BigInteger nextb = a.mod(b);
            a = b;
            b = nextb;
        }
        return a;
    }
    private static BigInteger tryPollardRho(BigInteger n,
            List<BigInteger> factors, Predicate keepRunning) {
        BigInteger x = new BigInteger("2");
        BigInteger y = x;
        BigInteger d = BigInteger.ONE;
        while (d.equals(BigInteger.ONE) && keepRunning.isTrue())
        {
            x = f(x,n);
            y = f(f(y,n),n);
            d = gcd(x.subtract(y).abs(), n);
        }
        if (d.equals(n))
            return x;
        BigInteger[] qr = n.divideAndRemainder(d);
        if (!qr[1].equals(BigInteger.ZERO))
            throw new IllegalStateException("Huh?");
        // d is a factor of x. But it may not be prime, so run it through the factoring algorithm.
        factor(d, factors, keepRunning);
        return qr[0];
    }

    private static boolean factor(BigInteger x0, List<BigInteger> factors,
            Predicate keepRunning) {

        BigInteger x = x0;
        for (int p0 : smallPrimes)
        {
            BigInteger p = new BigInteger(Integer.toString(p0));
            x = findRepeatedFactor(x, factors, p);          
        }
        boolean done = false;
        while (!done && keepRunning.isTrue())
        {
            done = x.equals(BigInteger.ONE) || x.isProbablePrime(NCERTAINTY);
            if (!done)
            {
                x = tryPollardRho(x, factors, keepRunning);
            }
        }
        if (!x.equals(BigInteger.ONE))
            factors.add(x);
        return done;
    }
}
在梵高的星空下 2024-10-11 17:11:30

如果您要分解的数字具有较小的质因数,您可以使用 Pollard p-1 分解算法。它分解出了数字 2 ^ 740 + 1 的 30 位质因数。 ECM 是一种类似的次指数算法,但实现起来更加困难。算法的时间量基于边界 b 的设置。它将对任何具有因子 p 的数字进行因式分解,其中 p - 1 是 b 平滑的。

//Pollard p - 1 factorization algorithm

void factor(mpz_t g, mpz_t n, long b)
{
    //sieve for primes
    std::vector<bool> r;

    for(int i = 0; i < b; i++)
        r.push_back(true);


    for(int i = 2; i < ceil(sqrt(b - 1)); i++)
        if(r.at(i) == true)
            for(int j = i * i; j < b; j += i)
                r.at(j) = false;

    std::vector<long> p;
    std::vector<long> a;
    for(int i = 2; i < b; i++)
        if(r[i] == true)
        {
            p.push_back(i);//Append the prime on to the vector
            int temp = floor(log(b) / log(i)); //temp = logb(i)

            // put primes in to sieve
            // a = the maximum power for p ^ a < bound b
            if(temp == 0)
                a.push_back(1);
            else
                a.push_back(temp);                
        }

    int m = p.size();//m = number of primes under bound b

    mpz_t c;// c is the number Which will be exponated
    mpz_init(c);
    long two = 2;
    mpz_set_ui(c, two);// set c to 2

    int z = 0;
    long x = 2;

    // loop c until a factor is found
    for(;;)
    {
    mpz_set_si( c, x);

    //powering ladder
    for(long i = 0; i < m; i++)
        for(long j = 0; j < a[i]; j++)
            mpz_powm_ui(c , c, (p[i]), n);

    //check if a factor has been found;
    mpz_sub_ui(c ,c,1);
    mpz_gcd(g ,c, n);
    mpz_add_ui(c , c, 1);

    //if g is a factor return else increment c
    if((mpz_cmp_si(g,1)) > 0 && (mpz_cmp(g,n)) < 0)
        return;
    else if (x > b)
        break;
    else
        x++;
    }

}


int main()
{
    mpz_t x;
    mpz_t g;

    //intialize g and x
    mpz_init(g);
    mpz_init_set_str(x,"167698757698757868925234234253423534235342655234234235342353423546435347",10);

    //p-1 will factor x as long as it has a factor p where p - 1 is b-smooth(has all prime factors less than bound b)
    factor(g , x, 1000);

    //output the factor, it will output 1 if algorithm fails
    mpz_out_str(NULL, 10, g);

    return 0;
}

输出 - 7465647
执行时间 - 0.003 秒

J.Pollard 创建的另一种因式分解算法是 Pollards Rho 算法,该算法不是那么快,但需要很少的空间。它们也是平行化它的方法。其复杂度为 O(n^1/4)

//Pollard rho factoring algorithm
void rho(mpz_t g, mpz_t n)
{
    mpz_t x;
    mpz_t y;
    mpz_init_set_ui(x ,2);
    mpz_init_set_ui(y ,2);//initialize x and y as 2
    mpz_set_ui(g , 1);
    mpz_t temp;
    mpz_init(temp);

    if(mpz_probab_prime_p(n,25) != 0)
        return;//test if n is prime with miller rabin test

    int count;
    int t1 = 0;
    int t2 = 1;
    int nextTerm = t1 + t2;
    while(mpz_cmp_ui(g,1) < 1)
    {
        f(x,n);//x is changed
        f(y,n);//y is going through the sequence twice as fast
        f(y,n);

        if(count == nextTerm)//calculate gcd every fibonacci number
        {
            mpz_sub(temp,x,y);
            mpz_gcd(g , temp, n);

            t1 = t2;
            t2 = nextTerm;
            nextTerm = t1 + t2;//calculate next fibonacci number
        }

        count ++;
    }

    return;
}

int main()
{
    mpz_t x;
    mpz_t g;

    //intialize g and x
    mpz_init(g);
    mpz_init_set_str(x,"167698757698757868925234234253423",10);


    rho(g , x);

    //output the factor, it will output 1 if algorithm fails
    mpz_out_str(NULL, 10, g);

    return 0;
}

输出 - 353
执行时间 - 0.003s

You could use Pollard p-1 factorization algorithm if the number you want to factor has small prime factors. It has factored out a 30 digit prime factor of the number 2 ^ 740 + 1. ECM is a similar but sub-exponetial algorithm but implementation is more difficult. The amount of time the algorithm is based on what the bound b is set as. It will factor any number which has a factor p where p - 1 is b-smooth.

//Pollard p - 1 factorization algorithm

void factor(mpz_t g, mpz_t n, long b)
{
    //sieve for primes
    std::vector<bool> r;

    for(int i = 0; i < b; i++)
        r.push_back(true);


    for(int i = 2; i < ceil(sqrt(b - 1)); i++)
        if(r.at(i) == true)
            for(int j = i * i; j < b; j += i)
                r.at(j) = false;

    std::vector<long> p;
    std::vector<long> a;
    for(int i = 2; i < b; i++)
        if(r[i] == true)
        {
            p.push_back(i);//Append the prime on to the vector
            int temp = floor(log(b) / log(i)); //temp = logb(i)

            // put primes in to sieve
            // a = the maximum power for p ^ a < bound b
            if(temp == 0)
                a.push_back(1);
            else
                a.push_back(temp);                
        }

    int m = p.size();//m = number of primes under bound b

    mpz_t c;// c is the number Which will be exponated
    mpz_init(c);
    long two = 2;
    mpz_set_ui(c, two);// set c to 2

    int z = 0;
    long x = 2;

    // loop c until a factor is found
    for(;;)
    {
    mpz_set_si( c, x);

    //powering ladder
    for(long i = 0; i < m; i++)
        for(long j = 0; j < a[i]; j++)
            mpz_powm_ui(c , c, (p[i]), n);

    //check if a factor has been found;
    mpz_sub_ui(c ,c,1);
    mpz_gcd(g ,c, n);
    mpz_add_ui(c , c, 1);

    //if g is a factor return else increment c
    if((mpz_cmp_si(g,1)) > 0 && (mpz_cmp(g,n)) < 0)
        return;
    else if (x > b)
        break;
    else
        x++;
    }

}


int main()
{
    mpz_t x;
    mpz_t g;

    //intialize g and x
    mpz_init(g);
    mpz_init_set_str(x,"167698757698757868925234234253423534235342655234234235342353423546435347",10);

    //p-1 will factor x as long as it has a factor p where p - 1 is b-smooth(has all prime factors less than bound b)
    factor(g , x, 1000);

    //output the factor, it will output 1 if algorithm fails
    mpz_out_str(NULL, 10, g);

    return 0;
}

Outputs - 7465647
Execution time - 0.003 seconds

Another Factoring algorithm created by J.Pollard was Pollards Rho algorithm which is not that quick but requires very little space. Their are also ways to parrelize it. Its complexity is O(n^1/4)

//Pollard rho factoring algorithm
void rho(mpz_t g, mpz_t n)
{
    mpz_t x;
    mpz_t y;
    mpz_init_set_ui(x ,2);
    mpz_init_set_ui(y ,2);//initialize x and y as 2
    mpz_set_ui(g , 1);
    mpz_t temp;
    mpz_init(temp);

    if(mpz_probab_prime_p(n,25) != 0)
        return;//test if n is prime with miller rabin test

    int count;
    int t1 = 0;
    int t2 = 1;
    int nextTerm = t1 + t2;
    while(mpz_cmp_ui(g,1) < 1)
    {
        f(x,n);//x is changed
        f(y,n);//y is going through the sequence twice as fast
        f(y,n);

        if(count == nextTerm)//calculate gcd every fibonacci number
        {
            mpz_sub(temp,x,y);
            mpz_gcd(g , temp, n);

            t1 = t2;
            t2 = nextTerm;
            nextTerm = t1 + t2;//calculate next fibonacci number
        }

        count ++;
    }

    return;
}

int main()
{
    mpz_t x;
    mpz_t g;

    //intialize g and x
    mpz_init(g);
    mpz_init_set_str(x,"167698757698757868925234234253423",10);


    rho(g , x);

    //output the factor, it will output 1 if algorithm fails
    mpz_out_str(NULL, 10, g);

    return 0;
}

Outputs - 353
Execution time - 0.003s

风情万种。 2024-10-11 17:11:30

你的任务很有趣!谢谢!

我很高兴花了几乎整整两天的时间来编写非常先进的解决方案。我从头开始实现了三种分解算法:Trial DivisionPollard 的 RhoLenstra 椭圆曲线法 (ECM)

众所周知,ECM 方法(使用椭圆曲线)是处理中等范围因子最快的方法之一。虽然试除法适用于非常小的因子(每秒最多 2^20 个因子),但 Pollard Rho 适合较大但较小的因子(每秒最多 2^40 个因子),而 ECM 则适用于中等范围的因子(最多 2^40 个因子)。到每 10 秒 2^60)。

还有一些非常高级的方法,例如 通用数域筛选 (GNFS)(因子高达 2^每月700),但实施起来非常困难。另外 Quadratic Sieve 方法是先进的(可能每月最多 2^400),我也实现了这是从头开始的,但它有非常大的代码,但易于理解,但由于其大小,我不将其附在此处。 ECM方法是先进方法中唯一比较容易实现的方法。

除了我实现的上述 3 种因式分解方法之外,我还在代码中使用了以下算法: 模幂费马素性测试Barrett 约简, 欧几里得算法扩展欧几里得算法模乘逆, 椭圆曲线点加法和乘法

实际上,您的任务很容易快速解决:众所周知,对于位大小 2048,大约每个 ln(2048) = 1420 数字都会出现一次素数,因此您只需快速生成大约 1500检查数字是否为素数,例如使用 费马素性测试,速度非常快。如果一个数是素数,那么根据定义,它已经被分解了。

我在脑海中进一步扩展了你的任务,让它变得更有趣。我不搜索素数,而是尝试找到这样的 2048 位随机生成的数字,使得它们至少有几个大素数因子。我将这种数字称为“有趣”。当然,如果一个数字有几个微小的因数和一个大素数,那么它就没那么有趣了。但如果它有 60 位质因数,那么捕获这样的数字会很有趣,这就是我在代码中所做的。

您可以在我的代码中看到我将其用于两种库, 增强多精度GMP。两者都包含在我的代码中(请参阅 #include#include),因此您应该安装并链接两者。在 Linux 下,通过 sudo apt install libboost-all-dev libgmp-dev 非常容易安装。但是 Windows 有点棘手,先安装 Chocolatey Packet Manager 然后执行命令 choco install boost-msvc- 14.3。对于 GMP,按照此处所述安装 VCPKG,然后vcpkg install gmp.如果您愿意,也可以通过 VCPKG 安装 Boost:vcpkg install boost

ECM(椭圆曲线)方法非常有趣且简单:

  1. 您生成许多随机曲线,具有随机 X、Y、A、B、N 参数,其中 N 是需要分解的输入数字,其他参数是拟合曲线方程Y^2 = X^3 + A * x + B (mod N)的随机数。

  2. 将每条曲线乘以所有增长的素数(具有较小的幂)。

  3. 在某些时候,您将获得第一条曲线的多个曲线阶数,并且在这种情况下,通过曲线阶数的属性,您将获得所谓的曲线上的无限点。

  4. 如果您查看点加法公式,那么您可能会发现分母为公式 lambda = (y_q - y_p) / (x_q - x_p)。该分母计算为模乘逆模数N。对于无限点,它变得不可逆,并且根据反数的性质,只有当 GCD(N, x_q - x_p ) != 1,在这种情况下,GCD 给出了一些重要的因子(有时也是 N),因此我们的 N 通过给出第一个因子成功分解了因子。

  5. 如果我们没有得到无限点,那么我们将继续生成更多曲线并除以更多(越来越大)的素数。我们生成的曲线越多,相乘的素数越多,因式分解的成功率就越高。

在线尝试!

源代码在这里。由于 StackOverflow 每个帖子的符号数限制为 30K,而我的代码大约有 44K 字节,因此我无法在此处内联它,而是通过 Github Gist(下面的链接)共享它。上面的代码也可以通过 GodBolt 服务器上的 Try it online! 链接获得。

GitHub Gist 代码

示例控制台输出:

TrialDiv time 8.230 sec
Num: 4343663370925180057127849780941698665126534031938076094687921681578209757551374613160773985436765755919464255163981465381983273353052491 (2^453.90)
Factored: 13 (2^3.70, prime), 167 (2^7.38, prime), 3853 (2^11.91, prime), 53831 (2^15.72, prime), 916471 (2^19.81, prime), 9255383 (2^23.14, prime), 
UnFactored: 11372390351822722497588418782148940973499109818654526670537593527638523385195910987808859992169924704037636069779 (2^372.24, composite), 
PollardRho time 8.51 sec
Num: 11372390351822722497588418782148940973499109818654526670537593527638523385195910987808859992169924704037636069779 (2^372.24)
Factored: 189379811 (2^27.50, prime), 2315962907 (2^31.11, prime), 50213994043 (2^35.55, prime), 
UnFactored: 5163708449171395447719565208770850251589387410889704005960043195676697732937073689 (2^278.09, composite), 
Curves   1, Ops   12.965 Ki, ExtraPrimes 512, Primes  0.500 Ki, IterTime   0.410 sec
Curves   2, Ops   50.912 Ki, ExtraPrimes 512, Primes  1.000 Ki, IterTime   8.062 sec
Curves   3, Ops  112.586 Ki, ExtraPrimes 464, Primes  1.453 Ki, IterTime  15.093 sec
Curves   4, Ops  162.931 Ki, ExtraPrimes 120, Primes  1.570 Ki, IterTime   4.853 sec
Curves   5, Ops  193.699 Ki, ExtraPrimes  80, Primes  1.648 Ki, IterTime   4.201 sec
ECM time 32.624 sec
Num: 5163708449171395447719565208770850251589387410889704005960043195676697732937073689 (2^278.09)
Factored: 955928964443 (2^39.80, prime), 
UnFactored: 540177004907359979630305557131905764121354872876442621652476639261690523 (2^238.29, composite), 
Final time 49.385 sec
Num: 4343663370925180057127849780941698665126534031938076094687921681578209757551374613160773985436765755919464255163981465381983273353052491 (2^453.90)
Factored: 13 (2^3.70, prime), 167 (2^7.38, prime), 3853 (2^11.91, prime), 53831 (2^15.72, prime), 916471 (2^19.81, prime), 9255383 (2^23.14, prime), 189379811 (2^27.50, prime), 2315962907 (2^31.11, prime), 50213994043 (2^35.55, prime), 955928964443 (2^39.80, prime), 
UnFactored: 540177004907359979630305557131905764121354872876442621652476639261690523 (2^238.29, composite), 

Interesting task you have! Thanks!

It was a pleasure for me to spend two almost whole days to write very advanced solution. I implemented from scratch three factorization algorithms: Trial Division, Pollard's Rho, Lenstra Elliptic Curve Method (ECM).

It is well known that ECM method (with elliptic curves) is one of the fastest methods for mid-range factors. While trial division is good for very small factors (up to 2^20 factor per second), Pollard Rho is good for bigger yet small factors (up to 2^40 per second), while ECM is good for mid-range factors (up to 2^60 per 10 seconds).

There are also very advanced methods like General Number Field Sieve (GNFS) (factors up to 2^700 per month), but they are very difficult to implement. Also Quadratic Sieve method is advanced (probably up to 2^400 per month), I also implemented this from sratch but it has very big code, yet manageable to understand, but due to its size I don't attach it here. ECM method was the only method quite easy to implement among advanced methods.

Besides mentioned above 3 methods of factorization that I implemented, I also used following algorithms inside code: Modular Exponentiation, Fermat Primality Test, Barrett Reduction, Euclidean Algorithm, Extended Euclidean Algorithm, Modular Multiplicative Inverse, Elliptic Curve Point Addition and Multiplication.

Actually your task as it is is very easy to solve fast: it is known that for bit size 2048 there appears a prime once approximately every ln(2048) = 1420 number, so you just generate fast around 1500 numbers while checking if they are prime, for example using Fermat Primality Test which is very fast. And if a number is prime then by definition it is already factored as it is.

I extended in my mind your task further, to make it more interesting. I don't search for prime number, but instead trying to find such 2048-bit random-generated numbers such that they have at least several big prime factors. This kind of numbers I will call "interesting". Of course if a number has several tiny factors and one large prime then it is not that interesting. But if it has 60-bit prime factor then it is interesting to catch such number, that's what I do in my code.

You can see in my code that I adopted it for two kinds of libraries, Boost Multiprecision and GMP. Both are included in my code (see #include <boost/multiprecision/cpp_int.hpp> and #include <gmpxx.h>), so you should install and link both. Under Linux it is very easy to install both through sudo apt install libboost-all-dev libgmp-dev. But Windows is a bit tricky, install Chocolatey Packet Manager first and then do command choco install boost-msvc-14.3. And for GMP install VCPKG as described here and then vcpkg install gmp. If you want you may install Boost through VCPKG too: vcpkg install boost.

ECM (elliptic curve) method is very interesting and simple:

  1. You generate many random curves, with random X, Y, A, B, N params, where N is your input number that needs to be factored, and other params are random that fit curve equation Y^2 = X^3 + A * x + B (mod N).

  2. You multiply each curve by all growing prime numbers (with some small power).

  3. At some point you'll get a multiple of curve order for some first curve, and by a property of curve order in such a case you'll get so-called Infinite point on curve.

  4. If you look at Point Addition formula then you may see that there is a denominator in formula lambda = (y_q - y_p) / (x_q - x_p). This denominator is computed as Modular Multiplicative Inverse modulus N. For Infinite point it becomes non-invertible, and by property of inverse number non-invertibility is only possible when GCD(N, x_q - x_p) != 1, in which case GCD gives some non-trivial factor (sometimes also N), hence our N is factored successfully by giving first factor.

  5. If we don't get Infinite point, then we continue to generate more curves and to divide by more (bigger and bigger) prime numbers. More curves we generate and more primes we multiply by, the higher is success of factorization.

Try it online!

SOURCE CODE HERE. As StackOverflow has limit of 30K symbols per post and my code alone is about 44K bytes, I couldn't inline it here, but instead sharing it through Github Gist (link below). Also same code is above available through Try it online! link on GodBolt server.

GitHub Gist code

Example console output:

TrialDiv time 8.230 sec
Num: 4343663370925180057127849780941698665126534031938076094687921681578209757551374613160773985436765755919464255163981465381983273353052491 (2^453.90)
Factored: 13 (2^3.70, prime), 167 (2^7.38, prime), 3853 (2^11.91, prime), 53831 (2^15.72, prime), 916471 (2^19.81, prime), 9255383 (2^23.14, prime), 
UnFactored: 11372390351822722497588418782148940973499109818654526670537593527638523385195910987808859992169924704037636069779 (2^372.24, composite), 
PollardRho time 8.51 sec
Num: 11372390351822722497588418782148940973499109818654526670537593527638523385195910987808859992169924704037636069779 (2^372.24)
Factored: 189379811 (2^27.50, prime), 2315962907 (2^31.11, prime), 50213994043 (2^35.55, prime), 
UnFactored: 5163708449171395447719565208770850251589387410889704005960043195676697732937073689 (2^278.09, composite), 
Curves   1, Ops   12.965 Ki, ExtraPrimes 512, Primes  0.500 Ki, IterTime   0.410 sec
Curves   2, Ops   50.912 Ki, ExtraPrimes 512, Primes  1.000 Ki, IterTime   8.062 sec
Curves   3, Ops  112.586 Ki, ExtraPrimes 464, Primes  1.453 Ki, IterTime  15.093 sec
Curves   4, Ops  162.931 Ki, ExtraPrimes 120, Primes  1.570 Ki, IterTime   4.853 sec
Curves   5, Ops  193.699 Ki, ExtraPrimes  80, Primes  1.648 Ki, IterTime   4.201 sec
ECM time 32.624 sec
Num: 5163708449171395447719565208770850251589387410889704005960043195676697732937073689 (2^278.09)
Factored: 955928964443 (2^39.80, prime), 
UnFactored: 540177004907359979630305557131905764121354872876442621652476639261690523 (2^238.29, composite), 
Final time 49.385 sec
Num: 4343663370925180057127849780941698665126534031938076094687921681578209757551374613160773985436765755919464255163981465381983273353052491 (2^453.90)
Factored: 13 (2^3.70, prime), 167 (2^7.38, prime), 3853 (2^11.91, prime), 53831 (2^15.72, prime), 916471 (2^19.81, prime), 9255383 (2^23.14, prime), 189379811 (2^27.50, prime), 2315962907 (2^31.11, prime), 50213994043 (2^35.55, prime), 955928964443 (2^39.80, prime), 
UnFactored: 540177004907359979630305557131905764121354872876442621652476639261690523 (2^238.29, composite), 
烟凡古楼 2024-10-11 17:11:30

目前您无法将 bigint 与 GMP 因式分解。您可以将 bigint 转换为其他库并使用它们的因式分解算法。请注意,≥20 位整数的因式分解需要专门的算法,并且速度几乎呈指数级缓慢。

查看:

At the moment you cannot factor a bigint with GMP. You can convert your bigint to other libraries and use their factoring algorithms. Note that factoring of integers with >>20 digits needs specialized algorithms and is near exponentially slow.

Check out:

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文