How should I test a genetic algorithm?
I have written quite a few genetic algorithms; they work (they find a reasonable solution quickly). But I have now discovered TDD. Is there a way to write a genetic algorithm (which relies heavily on random numbers) in a TDD way?
To pose the question more generally: how do you test a non-deterministic method/function? Here is what I have thought of:
Use a specific seed. This won't help if I make a mistake in the code in the first place, but it will help in finding bugs when refactoring.
Use a known list of numbers. Similar to the above, but I could follow the code through by hand (which would be very tedious).
Use a constant. At least I know what to expect. It would be good to ensure that a die always reads 6 when RandomFloat(0,1) always returns 1.
Try to move as much of the non-deterministic code out of the GA as possible. This seems silly, as randomness is the core of its purpose.
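The "constant" option from the list above can be sketched like this (a minimal Python sketch; `roll_die` and its contract are hypothetical, not from my actual code):

```python
def roll_die(random_float):
    """Map a float in [0, 1] to a die face 1-6.

    The random value is injected as a parameter, so a test can pass
    a constant instead of a real RNG.
    """
    # Clamp so that random_float == 1.0 maps to 6 rather than 7.
    return min(int(random_float * 6) + 1, 6)

# With a constant "RNG" the behaviour is fully predictable:
assert roll_die(1.0) == 6   # RandomFloat(0,1) always returning 1 -> always 6
assert roll_die(0.0) == 1
```

Because the random value is an argument rather than an internal call, the test needs no seed at all.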
Links to very good books on testing would be appreciated too.
Seems to me that the only way to test its consistent logic is to apply consistent input, ... or treat each iteration as a single automaton whose state is tested before and after that iteration, turning the overall nondeterministic system into testable components based on deterministic iteration values.
For variations/breeding/attribute inheritance in iterations, test those values on the boundaries of each iteration and test the global output of all iterations based on known input/output from successful iteration-subtests ...
Because the algorithm is iterative, you can use induction in your testing: show it works for 1 iteration, then that it works for n+1 iterations given n, to prove it will produce correct results (regardless of data determinism) for a given input range/domain and the constraints on possible values in the input.
Edit: I found an article on strategies for testing nondeterministic systems which might provide some insight. It might be helpful for statistical analysis of live results once the TDD/development process proves the logic is sound.
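The iteration-as-automaton idea might look like this (a toy sketch; `evolve_step`, its population format, and its "mutation" are illustrative assumptions, not a real GA):

```python
def evolve_step(population, random_values):
    """One GA iteration with all randomness supplied from outside.

    population:    list of (genome, fitness) tuples
    random_values: pre-drawn numbers used for mutation, so the
                   step itself is deterministic and testable.
    """
    # Trivial 'selection': keep the fitter half of the population.
    ranked = sorted(population, key=lambda p: p[1], reverse=True)
    survivors = ranked[: len(ranked) // 2]
    # Trivial 'mutation': perturb each survivor by an injected value.
    children = [(g + r, f) for (g, f), r in zip(survivors, random_values)]
    return survivors + children

# State before and after the iteration is fully checkable:
pop = [(1.0, 0.2), (2.0, 0.9), (3.0, 0.5), (4.0, 0.1)]
nxt = evolve_step(pop, [0.1, 0.1])
assert len(nxt) == len(pop)    # population size preserved
assert nxt[0] == (2.0, 0.9)    # fittest individual survived
```

Each iteration is an automaton whose pre- and post-state can be asserted exactly, which is the base case of the induction; the n+1 case follows by feeding one iteration's output into the next.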
I would test random functions by running them a number of times and analyzing whether the distribution of return values meets statistical expectations (this involves some statistical knowledge).
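A distribution check of that kind might be sketched as follows (Python; the trial count and tolerance are arbitrary choices, and the RNG is seeded only to keep the test repeatable):

```python
import random

def test_uniform_distribution(trials=10_000, bins=6, tolerance=0.25):
    """Roll a fair die many times and check that every face appears
    with roughly equal frequency (a crude uniformity sanity check)."""
    rng = random.Random(42)          # seeded so the test is repeatable
    counts = [0] * bins
    for _ in range(trials):
        counts[rng.randrange(bins)] += 1
    expected = trials / bins
    for c in counts:
        # Each bin should be within `tolerance` of the expected count.
        assert abs(c - expected) / expected < tolerance
    return counts

counts = test_uniform_distribution()
assert sum(counts) == 10_000
```

A real statistical test (e.g. chi-square with a chosen significance level) would be more rigorous; the point is that such tests make probabilistic claims, not exact ones.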
If you're talking TDD, I would say definitely start out by picking a constant number and growing your test suite from there. I've done TDD on a few highly mathematical problems and it helps to have a few constant cases you know and have worked out by hand to run with from the beginning.
W/R/T your 4th point, moving nondeterministic code out of the GA, I think this is probably an approach worth considering. If you can decompose the algorithm and separate the nondeterministic concerns, it should make testing the deterministic parts straightforward. As long as you're careful about how you name things I don't think that you're sacrificing much here. Unless I am misunderstanding you, the GA will still delegate to this code, but it lives somewhere else.
As far as links to very good books on (developer) testing go, my favorites are:
One way I unit-test the non-deterministic functions of a GA is to separate the selection of random numbers from the logic that uses those numbers.
For example, if you have a function that takes a gene (a vector of something) and picks two random points of the gene to do something with them (mutation or whatever), you can put the generation of the random numbers in one function, and then pass them along with the gene to another function that contains the logic for those given numbers.
This way you can do TDD on the logic function: pass it certain genes and certain numbers, knowing exactly what the logic should do to the gene given those numbers, and write asserts on the modified gene.
Another way to test the generation of random numbers is to externalize it to another class, which could be accessed via a context or loaded from a config value, using a different implementation for test runs. There would be two implementations of that class: one for production, which generates actual random numbers, and one for testing, which accepts in advance the numbers it will later produce. Then in the test you can supply the exact numbers that the class will feed to the code under test.
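The two-implementations idea might be sketched like this (all names are illustrative):

```python
import random

class ProductionRandom:
    """Production implementation: real random numbers."""
    def next_float(self):
        return random.random()

class FakeRandom:
    """Test implementation: replays numbers handed to it in advance."""
    def __init__(self, numbers):
        self._numbers = list(numbers)
    def next_float(self):
        return self._numbers.pop(0)

def pick_parent(population, rng):
    """Choose a parent index using the injected random source."""
    return int(rng.next_float() * len(population))

# In tests, the fake makes the 'random' choice fully predictable:
fake = FakeRandom([0.5, 0.0])
assert pick_parent(["a", "b", "c", "d"], fake) == 2
assert pick_parent(["a", "b", "c", "d"], fake) == 0
```

Production code receives a `ProductionRandom`; only the wiring differs between environments, so the logic under test is identical in both.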
You could write a redundant neural network to analyze the results from your algorithm and have the output ranked based on expected outcomes. :)
Break your method down as much as you can. Then you can also have a unit test around just the random part to check the range of values. You can even have the test run it a few times to see whether the result changes.
All of your functions should be completely deterministic. This means that none of the functions you are testing should generate the random number inside the function itself; you will want to pass that in as a parameter. That way, when your program is making decisions based on your random numbers, you can pass in representative numbers to test the expected output for each number. The only thing that shouldn't be deterministic is your actual random number generator, which you don't really need to worry too much about because you shouldn't be writing it yourself. You should be able to just assume it works as long as it's an established library.
That's for your unit tests. For your integration tests, if you are doing that, you might look into mocking your random number generation, replacing it with an algorithm that will return known numbers from 0..n for every random number that you need to generate.
I wrote a didactic C# genetic algorithm application using TDD:
http://code.google.com/p/evo-lisa-clone/
Let's take the simplest random result method in the application: PointGenetics.Create, which creates a random point, given the boundaries. For this method I used 5 tests, and none of them relies on a specific seed:
http://code.google.com/p/evo-lisa-clone/source/browse/trunk/EvoLisaClone/EvoLisaCloneTest/PointGeneticsTest.cs
The randomness test is simple: for a large boundary (many possibilities), two consecutive generated points should not be equal. The remaining tests check other constraints.
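A seed-free test in that style might look like the following sketch (Python rather than C#; `create_point` is a stand-in for something like PointGenetics.Create, not the actual evo-lisa-clone API):

```python
import random

def create_point(rng, max_x, max_y):
    """Create a random point within the given boundaries."""
    return (rng.randrange(max_x), rng.randrange(max_y))

rng = random.Random()

# Constraint tests: the point always lies inside the boundary.
for _ in range(100):
    x, y = create_point(rng, 200, 100)
    assert 0 <= x < 200 and 0 <= y < 100

# Randomness test: with a large boundary (many possibilities), two
# consecutively generated points should (almost certainly) differ.
assert create_point(rng, 10**6, 10**6) != create_point(rng, 10**6, 10**6)
```

The last assertion is probabilistic, but with a million-by-million boundary the chance of a false failure is about one in 10^12, which is negligible in practice.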
Well, the most testable part is the fitness function, which is where all your logic will be. This can in some cases be quite complex (you might be running all sorts of simulations based on the input parameters), so you want to be sure all that stuff works with a whole lot of unit tests, and this work can follow whatever methodology.
With regard to testing the GA parameters (mutation rate, cross-over strategy, whatever): if you're implementing that stuff yourself you can certainly test it (you can again have unit tests around the mutation logic etc.), but you won't be able to test the 'fine-tuning' of the GA.
In other words, you won't be able to test whether the GA actually performs, other than by the goodness of the solutions it finds.
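As a sketch of how deterministic the fitness function is to test, take the classic "match a target string" toy GA (the function below is illustrative, not from any particular GA):

```python
def fitness(candidate, target):
    """Fraction of positions where candidate matches target.

    Purely deterministic, so it is the easiest part of a GA to
    cover with ordinary example-based unit tests.
    """
    matches = sum(1 for c, t in zip(candidate, target) if c == t)
    return matches / len(target)

# No randomness involved, so plain asserts suffice:
assert fitness("hello", "hello") == 1.0
assert fitness("xxxxx", "hello") == 0.0
assert fitness("hexlo", "hello") == 0.8
```

However complex your real fitness function is, the same holds: it maps an input to a score with no randomness, so it can be tested exhaustively in isolation.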
A test that the algorithm gives the same result for the same input could help you, but sometimes you will make changes that alter the result-picking behavior of the algorithm.
I would put the most effort into tests that ensure the algorithm gives you a correct result. If the algorithm gives a correct result for a number of static seeds and random values, then it works, or at least has not been broken by the changes made.
Another opportunity with TDD is the possibility of evaluating the algorithm. If you can automatically check how good a result is, you could add tests showing that a change hasn't lowered the quality of your results or unreasonably increased the computation time.
If you want to test your algorithm with many base seeds, you may want two test suites: one that runs a quick check after every save to ensure that you haven't broken anything, and one that runs for a longer time for later evaluation.
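The quick-suite idea might be sketched like this (`run_ga` is a stand-in that does random search on a toy objective, just to show the shape of a seeded quality test):

```python
import random

def run_ga(seed, iterations):
    """Stand-in for a GA run: random search maximising -(x - 3)**2.

    Seeded, so each (seed, iterations) pair gives a repeatable score.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(iterations):
        x = rng.uniform(-10, 10)
        score = -(x - 3) ** 2
        if best is None or score > best:
            best = score
    return best

# Quick suite: a handful of seeds, few iterations, loose quality bar.
# Run after every save to catch gross breakage.
for seed in (0, 1, 2):
    assert run_ga(seed, iterations=500) > -1.0
```

A longer-running suite would use many more seeds and iterations and a tighter quality threshold, catching slow regressions that the quick suite misses.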
I would highly suggest looking into using mock objects for your unit test cases (http://en.wikipedia.org/wiki/Mock_object). You can use them to mock out the objects that make random guesses, so that you get the expected results instead.
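With Python's unittest.mock, for instance, scripting the "random guesses" might look like this (the `mutate` helper is hypothetical):

```python
from unittest.mock import Mock

def mutate(gene, rng):
    """Flip the bit at a position chosen by the injected random source."""
    i = rng.randrange(len(gene))
    return gene[:i] + [1 - gene[i]] + gene[i + 1:]

# A Mock stands in for the random source; side_effect scripts its answers.
rng = Mock()
rng.randrange = Mock(side_effect=[2, 0])

assert mutate([0, 0, 0, 0], rng) == [0, 0, 1, 0]   # "random" index was 2
assert mutate([1, 1, 1, 1], rng) == [0, 1, 1, 1]   # then 0
```

Because the mock returns a scripted sequence, the test knows exactly which position was "randomly" chosen and can assert on the mutated gene directly.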