Why can't genetic algorithms solve problems like RSA factorization?
Some time ago I was pretty interested in GAs and I studied them quite a bit. I used the C++ GAlib to write some programs and I was quite amazed by their ability to solve otherwise difficult-to-compute problems in a matter of seconds. They seemed like a great brute-forcing technique that works really smart and adapts.
I was reading a book by Michalewicz, if I remember the name correctly, and it all seemed to be based on the Schema Theorem, due to John Holland.
I've also heard that GAs cannot really be used to approach problems like factoring RSA private keys.
Could anybody explain why this is the case?
Genetic algorithms are not smart at all; they are very greedy optimization algorithms. They all work around the same idea: you have a group of points ('a population of individuals'), and you transform that group into another one with stochastic operators, with a bias in the direction of best improvement ('mutation + crossover + selection'). Repeat until it converges or you are tired of it. Nothing smart there.
For a genetic algorithm to work, a new population of points should perform close to the previous one: a little perturbation should create only a little change. If, after a small perturbation of a point, you obtain a point that represents a solution with completely different performance, then the algorithm is no better than random search, which is usually a poor optimization algorithm. In the RSA case, if your points are directly the numbers, a candidate is either right or wrong, and flipping a single bit changes everything. Thus, if you represent the RSA problem without much thought ("let's encode the search points as the bits of the numbers"), a genetic algorithm is no better than random search.
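A small sketch of that last point, under an assumed encoding (candidates are integers, and the hypothetical fitness of a candidate divisor p of N is `-(N % p)`, which is 0 exactly when p divides N):

```python
# Assumed toy setup: N is a small semiprime, candidates are integers,
# and fitness(p) = -(N % p) is a made-up "closeness to a factor" score.
N = 97 * 89  # 8633

def fitness(p):
    return -(N % p)

p = 97  # a true factor: fitness(p) == 0
for bit in range(7):
    q = p ^ (1 << bit)  # flip a single bit of the perfect candidate
    print(bit, q, fitness(q))
# Every one-bit neighbour of the optimum scores erratically
# (-89, -20, -48, ...): there is no gradient for a GA to follow.
```

The one-bit neighbours of a perfect solution are essentially random from the fitness function's point of view, which is exactly why bit-level encodings of factoring degrade the GA to random search.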
I would say it is because factorization of keys is not an optimization problem but an exact problem. This distinction is not very precise, so here are the details.
Genetic algorithms are great for problems that have minima (local/global) to descend towards, but there aren't any in the factoring problem. Methods such as DCA or simulated annealing need a measure of "how close I am to the solution", and you can't define one for this problem.
For an example of a problem genetic algorithms handle well, consider hill climbing.
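As a contrast, a minimal hill-climbing sketch (on a made-up toy objective) shows what a usable "how close am I" measure buys you:

```python
import random

# Toy objective where distance-to-solution is meaningful: every step
# towards x = 42 improves the score, so greedy local search converges.
def f(x):
    return -(x - 42) ** 2

def hill_climb(x, steps=1000):
    for _ in range(steps):
        step = random.choice([-1, 1])
        if f(x + step) > f(x):  # accept only improvements
            x += step
    return x

random.seed(1)
print(hill_climb(0))  # climbs to 42
```

Factoring offers no such landscape: a candidate either divides N or it does not, so there is nothing for the greedy steps to climb.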
GAs are based on the fitness evaluation of candidate solutions.
You basically have a fitness function that takes a candidate solution as input and gives you back a scalar telling you how good that candidate is. You then allow the best individuals of a given generation to mate with higher probability than the rest, so that the offspring will (hopefully) be more 'fit' overall, and so on.
There is no way to evaluate fitness (how good a candidate solution is compared to the rest) in the RSA factorization scenario, so that's why you can't use them.
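The mating rule described above (fitter individuals reproduce with higher probability) is commonly implemented as fitness-proportionate selection; a minimal sketch with a made-up toy fitness:

```python
import random

# Fitness-proportionate ("roulette wheel") selection: each individual
# is drawn as a parent with probability proportional to its fitness.
def select_parent(population, fitness):
    weights = [fitness(ind) for ind in population]
    return random.choices(population, weights=weights, k=1)[0]

population = ["aa", "ab", "bb"]
fitness = lambda ind: 1 + ind.count("b")  # toy fitness: scores 1, 2, 3

random.seed(0)
draws = [select_parent(population, fitness) for _ in range(600)]
# "bb" (fitness 3) is picked roughly 3x as often as "aa" (fitness 1)
print(draws.count("aa"), draws.count("ab"), draws.count("bb"))
```

The whole mechanism presupposes a scalar fitness that ranks candidates meaningfully, which is exactly the ingredient missing for RSA factoring.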
GAs are not brute-forcing; they're just a search algorithm. Each GA essentially runs the same loop: generate a population of candidates, then repeatedly replace it with a mutated and recombined version of its best members, until some candidate is good enough. Both the stopping test (good_enough) and the selection (best_of) are defined in terms of a fitness function. A fitness function says how well a given candidate solves the problem. That seems to be the core issue here: how would you write a fitness function for factorization? For example, 20 = 2*10 or 4*5. The tuples (2,10) and (4,5) are clearly winners, but what about the others? How "fit" is (1,9) or (3,4)?
Indirectly, you can use a genetic algorithm to factor an integer N. Dixon's integer factorization method uses equations involving powers of the first k primes, modulo N. These products of powers of small primes are called "smooth". If we are using the first k=4 primes - {2,3,5,7} - 42=2x3x7 is smooth and 11 is not (for lack of a better term, 11 is "rough"). Dixon's method requires an invertible k x k matrix consisting of the exponents that define these smooth numbers. For more on Dixon's method see https://en.wikipedia.org/wiki/Dixon%27s_factorization_method.
Now, back to the original question: There is a genetic algorithm for finding equations for Dixon's method.
Every time you combine two pairs - say [v,w] and [x,y] - the smooth parts of the four numbers are obliterated, except for the factors the smooth parts of v and x share, and the factors the smooth parts of w and y share. So we choose parents that share smooth parts to the greatest possible extent. To make this precise, write
g = gcd(smooth part of v, smooth part of x)
h = gcd(smooth part of w, smooth part of y)
[v,w], [x,y] = [g v/g, h w/h], [g x/g, h y/h].
The hard-won smooth factors g and h will be preserved into the next generation, but the smooth parts of v/g, w/h, x/g and y/h will be sacrificed in order to combine [v,w] and [x,y]. So we choose parents for which v/g, w/h, x/g and y/h have the smallest smooth parts. In this way we really do drive down the rough parts of our solutions to rx = sy mod N from one generation to the next.
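A sketch of the smooth-part bookkeeping this scheme relies on (the prime set matches the k=4 example above; the pair values v, x are my own):

```python
from math import gcd

PRIMES = [2, 3, 5, 7]  # the first k = 4 primes, as in the example above

def smooth_part(n):
    # Largest divisor of n built only from the primes in PRIMES.
    s = 1
    for p in PRIMES:
        while n % p == 0:
            n //= p
            s *= p
    return s

assert smooth_part(42) == 42  # 42 = 2*3*7 is smooth
assert smooth_part(11) == 1   # 11 is "rough"

# Scoring a pair of parents: g is the smooth factor that survives
# when v and x are combined.
v, x = 84, 126  # 84 = 2^2 * 3 * 7, 126 = 2 * 3^2 * 7
g = gcd(smooth_part(v), smooth_part(x))
print(g)  # 42
```

Parents maximizing g and h preserve the most hard-won smoothness into the next generation, which is the selection criterion described above.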
On further thought, the best way to make your way towards smooth coefficients x, y in the lattice ax = by mod N is with regression, not a genetic algorithm.
Two regressions are performed, one with response vector R0 consisting of x-values from randomly chosen solutions of ax = by mod N; and the other with response vector R1 consisting of y-values from the same solutions. Both regressions use the same explanatory matrix X. In X are columns consisting of the remainders of the x-values modulo smooth divisors, and other columns consisting of the remainders of the y-values modulo other smooth divisors.
The best choice of smooth divisors is the one that minimizes the errors from each regression:
E0 = R0 - X (inverse of (X-transpose)(X)) (X-transpose) (R0)
E1 = R1 - X (inverse of (X-transpose)(X)) (X-transpose) (R1)
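In matrix terms these are ordinary least-squares residuals; a sketch with NumPy and made-up data:

```python
import numpy as np

# E = R - X (X^T X)^{-1} X^T R is the least-squares residual of R on X.
rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(6, 3)).astype(float)  # explanatory matrix
R0 = rng.integers(0, 100, size=6).astype(float)     # response vector

beta, *_ = np.linalg.lstsq(X, R0, rcond=None)  # solves for (X^T X)^{-1} X^T R0
E0 = R0 - X @ beta                             # the residual E0
# The residual is orthogonal to the column space of X:
print(np.allclose(X.T @ E0, 0.0, atol=1e-8))   # prints True
```

The orthogonality of the residual to X's columns is what makes the next step work: any row operation z that annihilates X maps R0 to E0.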
What follows is row operations to annihilate X. Then apply a result z of those row operations to the x- and y-values from the original solutions from which X was formed. Because z annihilates X, z R0 = z E0; similarly, z R1 = z E1.
Three properties are now combined in z R0 and z R1:
A relatively small multiple of a large smooth number might just be the smooth number itself. Having a smooth solution of ax = by mod N yields an input to Dixon's method.
Two optimizations make this particularly fast: