缺乏多样化,真的是遗传算法的缺点吗?
我们知道遗传算法(或进化计算)使用解空间 Ω 中的点编码,而不是直接对这些点进行编码。在文献中,我们经常发现GA有一个缺点:(1)由于许多染色体被编码成相似的Ω点或者相似的染色体有非常不同的点,所以效率相当低。您认为这真的是一个缺点吗?因为这类算法在每次迭代中使用变异算子来使候选解决方案多样化。为了增加更多的多样化,我们只需增加交叉的概率。我们不能忘记我们的初始种群(染色体)是随机生成的(另一个更加多样化的现象)。问题是,如果您认为(1)是 GA 的缺点,您能否提供更多详细信息?谢谢。
We know that Genetic Algorithms (or evolutionary computation) work with an encoding of the points in our solution space Ω rather than these points directly. In the literature, we often find that GAs have the drawback : (1) since many chromosomes are coded into a similar point of Ω or similar chromosomes have very different points, the efficiency is quite low. Do you think that is really a drawback ? because these kind of algorithms uses the mutation operator in each iteration to diversify the candidate solutions. To add more diversivication we simply increase the probability of crossover. And we mustn't forget that our initial population ( of chromosones ) is randomly generated ( another more diversification). The question is, if you think that (1) is a drawback of GAs, can you provide more details ? Thank you.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
变异和随机初始化不足以解决遗传算法的主要问题“遗传漂移”。遗传漂变意味着遗传算法可能会很快失去大部分遗传多样性,并且搜索会以不利于交叉的方式进行。这是因为随机初始种群很快收敛。变异是另一回事,如果它很高,它就会多样化,这是事实,但同时它会阻止收敛,并且解将以更高的概率与最优解保持一定距离。您需要在搜索过程中调整突变概率(而不是交叉概率)。以类似的方式,类似于 GA 的进化策略在搜索过程中调整突变强度。
我们开发了一种 GA 变体,称为 OffspringSelection GA (OSGA),它在交叉后引入了另一个选择步骤。只有那些超越父母适应度的孩子才会被接受(更好、更差或任何线性插值)。通过这种方式,您甚至可以使用随机亲本选择并将偏差放在后代的质量上。研究表明,这可以减缓遗传漂变。该算法在我们的框架 HeuristicLab 中实现。它具有 GUI,因此您可以下载并尝试解决一些问题。
其他对抗遗传漂变的技术包括利基和拥挤,它们让多样性流入选择中,从而引入另一种但可能不同的偏见。
编辑:我想补充一点,具有相同质量的多个解决方案的情况当然可能会带来问题,因为它会在搜索空间中创建中性区域。不过,我认为你并不是这个意思。主要问题是遗传漂变,即。 (重要)遗传信息的丢失。
Mutation and random initialization are not enough to combat the problem that is known as genetic drift which is the major problem of genetic algorithms. Genetic drift means that the GA may quickly lose most of its genetic diversity and the search proceeds in a way that is not beneficial for crossover. This is because the random initial population quickly converges. Mutation is a different thing, if it is high it will diversify, true, but at the same time it will prevent convergence and the solutions will remain at a certain distance to the optimum with higher probability. You will need to adapt the mutation probability (not the crossover probability) during the search. In a similar manner the Evolution Strategy, which is similar to a GA, adapts the mutation strength during the search.
We have developed a variant of the GA that is called OffspringSelection GA (OSGA) which introduces another selection step after crossover. Only those children will be accepted that surpass their parents' fitness (the better, the worse or any linearly interpolated value). This way you can even use random parent selection and put the bias on the quality of the offspring. It has been shown that this slows the genetic drift. The algorithm is implemented in our framework HeuristicLab. It features a GUI so you can download and try it on some problems.
Other techniques that combat genetic drift are niching and crowding which let the diversity flow into the selection and thus introduce another, but likely different bias.
EDIT: I want to add that the situation of having multiple solutions with equal quality might of course pose a problem as it creates neutral areas in the search space. However, I think you didn't really mean that. The primary problem is genetic drift, ie. the loss of (important) genetic information.
作为旁注,你(OP)说:
这并不总是正确的。个体被编码为基因型,它可以具有任何形状,例如字符串(遗传算法)或实数向量(进化策略)。在评估个体时,即计算其适应度时,每个基因型都会转化为表型。在某些情况下,表型与基因型相同:称为直接编码。否则,编码称为间接。 (您可以在此处找到更多定义(第 2.2.1 节))
直接编码示例:
http://en.wikipedia.org/wiki/Neuroevolution#Direct_and_Indirect_Encoding_of_Networks
间接编码示例:
假设您要优化由长、高和宽定义的长方体的尺寸。为了简化示例,假设这三个量是 0 到 15 之间的整数。然后我们可以使用 4 位二进制数来描述它们中的每一个。潜在解决方案的一个例子可能是基因型 0001 0111 01010。相应的表型是长 1、高 7 和宽 10 的平行六面体。
现在回到关于多样性的最初问题,除了 DonAndre 所说的之外,您还可以阅读优秀书籍的第 9 章“多模态问题和空间分布”进化计算简介,由 AE Eiben 和 JE Smith 撰写。以及关于此问题的研究论文,例如 鼓励进化机器人中的行为多样性:经验主义研究。 总而言之,多样性并不是 GA 的缺点,它“只是”一个问题。
As a sidenote, you (the OP) said:
This is not always true. An individual is coded as a genotype, which can have any shape, such as a string (genetic algorithms) or a vector of real (evolution strategies). Each genotype is transformed into a phenotype when assessing the individual, i.e. when its fitness is calculated. In some cases, the phenotype is identical to the genotype: it is called direct coding. Otherwise, the coding is called indirect. (you may find more definitions here (section 2.2.1))
Example of direct encoding:
http://en.wikipedia.org/wiki/Neuroevolution#Direct_and_Indirect_Encoding_of_Networks
Example of indirect encoding:
Suppose you want to optimize the size of a rectangular parallelepiped dened by its length, height and width. To simplify the example, assume that these three quantities are integers between 0 and 15. We can then describe each of them using a 4-bit binary number. An example of a potential solution may be to genotype 0001 0111 01010. The corresponding phenotype is a parallelepiped of length 1, height 7 and width 10.
Now back to the original question on diversity, in addition to what DonAndre said you could read you read chapter 9 "Multi-Modal Problems and Spatial Distribution" of the excellent book Introduction to Evolutionary Computing written by A. E. Eiben and J. E. Smith. as well as a research paper on that matter such as Encouraging Behavioral Diversity in Evolutionary Robotics: an Empirical Study. In a word, diversity is not a drawback of GA, it is "just" an issue.