Applying a Genetic Algorithm to Curve Fitting
Let's imagine I have an unknown function that I want to approximate via a genetic algorithm. For this case, I'll assume it is y = 2x.
I'd have a DNA composed of 5 elements, one y for each x from x = 0 to x = 4, and after a lot of trials and computation I'd arrive near something of the form:
best_adn = [ 0, 2, 4, 6, 8 ]
Keep in mind I don't know beforehand whether it is a linear function, a polynomial, or something far uglier. Also, my goal is not to infer from best_adn what type of function it is; I just want those points so I can use them later.
This was just an example problem. In my case, instead of having only 5 points in the DNA, I have something like 50 or 100. What is the best approach with a GA to find the best set of points?
- Generate a population of 100 and discard the worst 20%?
- Recombine the remaining 80%? How? By cutting them at a random point and then putting together the first part of the father's DNA with the second part of the mother's DNA?
- Mutation: how should I define mutation for this kind of problem?
- Is it worth using elitism?
- Any other simple ideas worth using?
Thanks
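For concreteness, the procedure sketched in the bullets above can be written as a minimal GA loop. This is only a sketch: it assumes the unknown function can be sampled at fitness-evaluation time (here y = 2x), and all names, rates, and bounds are illustrative choices, not a definitive recipe.

```python
import random

# Illustrative target: samples of the unknown function (here y = 2x) at 5 points.
TARGET = [2 * x for x in range(5)]

def fitness(dna):
    # Lower is better: sum of squared errors against the sampled target.
    return sum((a - b) ** 2 for a, b in zip(dna, TARGET))

def crossover(father, mother):
    # One-point crossover: first part of the father, second part of the mother.
    cut = random.randrange(1, len(father))
    return father[:cut] + mother[cut:]

def mutate(dna, rate=0.05, scale=1.0):
    # Perturb each gene with a small probability (rate and scale are assumptions).
    return [g + random.uniform(-scale, scale) if random.random() < rate else g
            for g in dna]

def evolve(pop_size=100, generations=200):
    pop = [[random.uniform(0, 10) for _ in range(5)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: int(pop_size * 0.8)]   # discard the worst 20%
        children = [mutate(crossover(random.choice(survivors),
                                     random.choice(survivors)))
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return min(pop, key=fitness)

best = evolve()
```

Because the sorted survivors always include the current best individual, this loop is implicitly elitist; whether explicit elitism helps on top of that is one of the questions above.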
Usually you only find these out by experimentation... perhaps by writing a GA to tune your GA.
But that aside, I don't understand what you're asking. If you don't know what the function is, and you also don't know the points to begin with, how do you determine fitness?
From my current understanding of the problem, this is better suited to a neural network.
Edit:
2. Recombine the remaining 80%? How? By cutting them at a random point and then putting together the first part of the father's DNA with the second part of the mother's DNA?
This is called crossover. If you want to get fancy, do something like picking a random starting point and swapping a random length. For instance, suppose you have 10 elements in an object: randomly choose a spot x between 1 and 10, then swap a random-length segment starting at x. You get the picture... spice it up a little.
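A minimal sketch of that random-start, random-length variant (names and the two-child convention are illustrative, not prescribed by the answer):

```python
import random

def segment_swap(parent_a, parent_b):
    # Pick a random start and a random length, then exchange that slice
    # between the two parents, producing two children.
    n = len(parent_a)
    start = random.randrange(n)
    length = random.randrange(1, n - start + 1)
    child_a = parent_a[:start] + parent_b[start:start + length] + parent_a[start + length:]
    child_b = parent_b[:start] + parent_a[start:start + length] + parent_b[start + length:]
    return child_a, child_b
```

Note that the two children together contain exactly the genes of the two parents; only their grouping changes.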
3. Mutation, how should I define mutation in this kind of problem?
Usually that depends more on what is defined as a legal solution than anything else. You can do mutation the same way you do crossover, except you fill the segment with random data (that is legal) rather than swapping with another specimen... and you do it at a MUCH lower rate.
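A sketch of that idea, refilling genes with fresh random legal values at a low per-gene rate; the bounds defining "legal" here are assumptions for illustration:

```python
import random

def mutate(dna, rate=0.01, low=0.0, high=10.0):
    # Replace each gene with fresh random (legal) data at a much lower
    # rate than crossover; [low, high] stands in for the legal range.
    return [random.uniform(low, high) if random.random() < rate else g
            for g in dna]
```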
4. Is it worth using elitism?
Experiment and find out.
Gaussian adaptation usually outperforms standard genetic algorithms. If you don't want to write your own package from scratch, the Mathematica Global Optimization package is EXCELLENT -- I used it to fit a really nasty nonlinear function where standard fitters failed miserably.
Edit:
Wikipedia Article
If you hunt down prints of the papers listed in the article, you can find whitepapers and implementations. In general, though, you should have some idea of what the solution space for maximizing the fitness function looks like. If the number of variables is small, or the number of local maxima is small or they are connected/slope down to a global maximum, simple least squares works fine. If the area around each local maximum is small (i.e., you have to find a very good solution to hit the best one, otherwise you hit a bad one), then fancier algorithms are needed.
Choosing variables for a genetic algorithm depends on what the solution space will look like.
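To make the "simple least squares" case concrete, here is the closed-form simple linear regression applied to the y = 2x example from the question; a pure-Python sketch, not a recommendation over a proper fitting library:

```python
def least_squares_line(xs, ys):
    # Closed-form simple linear regression: minimizes the sum of
    # squared residuals for a line y = slope * x + intercept.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# For the y = 2x example from the question:
slope, intercept = least_squares_line([0, 1, 2, 3, 4], [0, 2, 4, 6, 8])
# slope → 2.0, intercept → 0.0
```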