遗传算法中交叉的效率
我已经实现了许多遗传算法来解决各种问题。然而,我仍然对交叉/重组的有用性持怀疑态度。
我通常在实施交叉之前首先实施变异。在我实施交叉之后,与简单地使用突变并在每一代中引入一些随机个体以确保遗传相比,我通常不会看到生成良好候选解决方案的速度显着加快。
当然,这可能归因于交叉函数和/或概率的选择不当,但我想得到一些具体的解释/证据来说明为什么/是否交叉可以改善遗传算法。有这方面的研究吗?
我明白其中的道理:交叉可以将两个人的优点结合到一个人身上。但对我来说,这就像说我们可以将科学家和美洲虎交配以获得聪明而快速的混合体。
编辑:在 mcdowella 的回答中,他提到如何找到一种交叉可以改进从多个起点爬山的案例并非易事。有人可以详细说明这一点吗?
I've implemented a number of genetic algorithms to solve a variety of a problems. However I'm still skeptical of the usefulness of crossover/recombination.
I usually first implement mutation before implementing crossover. And after I implement crossover, I don't typically see a significant speed-up in the rate at which a good candidate solution is generated compared to simply using mutation and introducing a few random individuals in each generation to ensure genetic .
Of course, this may be attributed to poor choices of the crossover function and/or the probabilities, but I'd like to get some concrete explanation/evidence as to why/whether or not crossover improves GAs. Have there been any studies regarding this?
I understand the reasoning behind it: crossover allows the strengths of two individuals to be combined into one individual. But to me that's like saying we can mate a scientist and a jaguar to get a smart and fast hybrid.
EDIT: In mcdowella's answer, he mentioned how finding a case where cross-over can improve upon hill-climbing from multiple start points is non-trivial. Could someone elaborate upon this point?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
它强烈取决于您的搜索空间的平滑度。反常的例子是,如果每个“基因组”在用于生成“现象”之前都经过哈希处理,那么您只是在进行随机搜索。
不太极端的情况,这就是为什么我们经常在 GA 中对整数进行格雷编码。
您需要根据编码定制交叉和变异函数。如果你对 GA 进行无情的计算,那么 GA 很容易衰减。如果 A 和 B 的交叉不能产生既类似于 A 又类似于 B 的东西,那么它就没用。
示例:
基因组有 3 位长,第 0 位决定它是陆地生物还是海洋生物。第 1-2 位描述了陆地生物的消化功能和海洋生物的视觉能力。
考虑两种陆地生物。
它们可能会在第 1 位和第 2 位之间交叉,产生一个消化功能介于妈妈和爸爸之间的消化功能的孩子。伟大的。
只要位 0 没有改变,这种交叉似乎是明智的。如果是的话,那么你的交叉函数已经将某种胆量变成了某种眼睛。呃……呃?这也可能是随机突变。
这就引出了 DNA 如何解决这个问题的问题。嗯,它既是模式的又是层次的。有一些大的部分可能会发生很大的变化而没有太大的影响,而在其他部分中,单个突变可能会产生巨大的影响(如上面的位 0)。有时,X 的值会影响 Y 引发的行为,并且 X 的所有值都是合法的并且可以探索,而对 Y 的修改会使动物出现段错误。
遗传算法的理论分析通常使用极其粗糙的编码,并且它们受到数值问题的困扰,而不是语义问题。
It strongly depends on the smoothness of your search space. Perverse example if every "geneome" was hashed before being used to generate "phenomes" then you would just be doing random search.
Less extreme case, this is why we often gray-code integers in GAs.
You need to tailor your crossover and mutation functions to the encoding. GAs decay quite easily if you throw unsympathetic calculations at them. If the crossover of A and B doesn't yield something that's both A-like and B-like then it's useless.
Example:
The genome is 3 bits long, bit 0 determines whether it's land-dwelling or sea-dwelling. Bits 1-2 describe digestive functions for land-dwelling creatures and visual capabilities for sea-dwelling creatures.
Consider two land-dwelling creatures.
They might crossover between bits 1 and 2 yielding a child whose digestive function is some compromise between Mum's and Dad's. Great.
This crossover seems sensible provided that bit 0 hasn't changed. If is does then your crossover function has turned some kind of guts into some kind of eyes. Er... Wut? It might as well have been a random mutations.
Begs the question how DNA gets around this problem. Well, it's both modal and hierarchial. There are large sections which can change a lot without much effect, in others a single mutation can have drastic effects (like bit 0 above). Sometimes the value of X affects the behaviour tiggered by Y, and all values of X are legal and can be explored whereas a modification to Y makes the animal segfault.
Theoretical analyses of GAs often use extremely crude encodings and they suffer more from numerical issues than semantic ones.
您对交叉操作的怀疑是正确的。有一篇论文叫
“关于模拟进化优化中交叉的有效性”(Fogel 和 Stayton,Biosystems 1994)。可在 免费获取1.。
顺便说一句,如果您还没有,我建议您研究一种称为“差异进化”的技术。它可以非常擅长解决许多优化问题。
You are correct in being skeptical about the cross-over operation. There is a paper called
"On the effectiveness of crossover in simulated evolutionary optimization" (Fogel and Stayton, Biosystems 1994). It is available for free at 1.
By the way, if you haven't already I recommend looking into a technique called "Differential Evolution". It can be very good at solving many optimization problems.
我的印象是,从多个随机开始爬山是非常有效的,但试图找到交叉可以改进这一点的情况并非易事。一个参考文献是 David Icl˘anzan 的《Crossover: The Heaven Afflatus in Search》,其中指出
相关论文是“Overcoming Hierarchical Difficulty by Hill-Climbing the
David Iclănzan 和 Dan Dumitrescu 的《积木结构》,其中指出
My impression is that hill-climbing from multiple random starts is very effective, but that trying to find a case where cross-over can improve on this is non-trivial. One reference is "Crossover: The Divine Afflatus in Search" by David Icl˘anzan, which states
A related paper is "Overcoming Hierarchical Difficulty by Hill-Climbing the
Building Block Structure" by David Iclănzan and Dan Dumitrescu, which states
约翰·霍兰德的两部开创性著作《自然和人工系统的适应》和《隐藏秩序》(不太正式)深入讨论了交叉理论。在我看来,戈德堡的《搜索、优化和机器学习中的遗传算法》中有一个关于数学基础的非常平易近人的章节,其中包括以下结论:
另一个很好的参考文献可能是 Ankenbrandt 的“收敛理论的扩展和遗传算法时间复杂性的证明”(罗林斯的“遗传算法基础”)。
我很惊讶你在你的作品中并没有体现出跨界的力量;当我开始使用遗传算法并看到“定向”交叉似乎有多么强大时,我觉得我获得了对进化的洞察力,颠覆了我在学校所学到的知识。所有关于“突变如何导致这个或那个?”的问题。而“嗯,经过这么多代人的努力……”似乎从根本上被误导了。
John Holland's two seminal works "Adaptation in Natural and Artificial Systems" and "Hidden Order" (less formal) discuss the theory of crossover in depth. IMO, Goldberg's "Genetic Algorithms in Search, Optimization, and Machine Learning" has a very approachable chapter on mathematical foundations which includes such conclusions as:
Another good reference might be Ankenbrandt's "An Extension to the Theory of Convergence and a Proof of the Time Complexity of Genetic Algorithms" (in "Foundations of Genetic Algorithms" by Rawlins).
I'm surprised that the power of crossover has not been apparent to you in your work; when I began using genetic algorithms and saw how powerfully "directed" crossover seemed, I felt I gained an insight into evolution that overturned what I had been taught in school. All the questions about "how could mutation lead to this and that?" and "Well, over the course of so many generations..." came to seem fundamentally misguided.
交叉和变异!!其实两者都是必要的。
交叉是一种探索性算子,而变异是一种利用性算子。考虑到解的结构、问题和优化率的可能性,选择正确的 Pc 和 Pm(交叉和变异概率)值非常重要。
检查这个GA-TSP-Solver,它使用了很多交叉和变异方法。您可以以给定的概率测试任何交叉和突变。
The crossover and mutation!! Actually both of them are necessary.
Crossover is an explorative operator, but the mutation is an exploitive one. Considering the structure of solutions, problem, and the likelihood of optimization rate, its very important to select a correct value for Pc and Pm (probability of crossover and mutation).
Check this GA-TSP-Solver, it uses many crossover and mutation methods. You can test any crossover alongside mutations with given probabilities.
这主要取决于搜索空间和您使用的交叉类型。对于某些问题,我发现在开始时使用交叉然后变异,它将加快寻找解决方案的过程,但这不是很好的方法,因为我最终会找到类似的解决方案。如果我们同时使用交叉和变异,我通常会得到更好的优化解决方案。然而,对于某些问题,交叉可能具有很大的破坏性。
此外,仅靠遗传算子还不足以解决大型/复杂的问题。当你的操作员没有改进你的解决方案时(因此当他们没有增加适应度的价值时),你应该开始考虑其他解决方案,例如增量进化等。
it mainly depends on the search space and the type of crossover you are using. For some problems I found that using crossover at the beginning and then mutation, it will speed up the process for finding a solution, however this is not very good approach since I will end up on finding similar solutions. If we use both crossover and mutation I usually get better optimized solutions. However for some problems crossover can be very destructive.
Also genetic operators alone are not enough to solve large/complex problems. When your operators don't improve your solution (so when they don't increase the value of fitness), you should start considering other solutions such as incremental evolution, etc..