How can I debug a Monte Carlo simulation faster?

Posted 2024-10-20 12:24:56


I'm writing a simulated annealing program, and having some trouble debugging it. Any advice would be welcome.

First of all, the output is not deterministic, so I've taken to running it a hundred times and looking at the mean and standard deviation.
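
For what it's worth, this is roughly the harness I mean, as a minimal Python sketch (anneal() is a hypothetical stand-in for the real solver, not my actual code):

    import random
    import statistics

    def anneal(rng: random.Random) -> float:
        # Hypothetical stand-in for the real simulated-annealing run;
        # it just returns a noisy score so the harness is runnable.
        return 42.0 + rng.gauss(0.0, 1.0)

    # Repeat the non-deterministic run and summarize the scores.
    rng = random.Random()
    scores = [anneal(rng) for _ in range(100)]
    print(f"mean  = {statistics.mean(scores):.3f}")
    print(f"stdev = {statistics.stdev(scores):.3f}")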

But then it takes ages and ages (> 30 minutes) to finish a single test case.

Usually, I'd try to cut the input down, but reducing the number of iterations directly decreases the accuracy of the result in ways which are not entirely predictable. For example, the cooling schedule is an exponential decay scaled to the number of iterations. Reducing the number of separate runs makes the output very unreliable (one of the bugs I'm trying to hunt down is an enormous variance between runs).
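
For concreteness, one common way to scale an exponential schedule to the iteration budget (an assumed form, not necessarily the exact one I'm using) looks like this:

    def temperature(k: int, n_iters: int, t_start: float, t_end: float) -> float:
        # Exponential decay from t_start at k=0 to t_end at k=n_iters,
        # so changing n_iters changes the step count but not the curve's shape.
        return t_start * (t_end / t_start) ** (k / n_iters)

    # e.g. cooling from T=100.0 to T=0.1 over 1000 iterations:
    print(temperature(0, 1000, 100.0, 0.1))     # 100.0
    print(temperature(1000, 1000, 100.0, 0.1))  # ≈ 0.1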

I know premature optimization is the root of all evil, and surely it must be premature to optimize before the program is even correct, but I'm seriously thinking of rewriting this in a faster language (Cython or C), knowing I'd have to port it back to Python in the end for submission.

So, is there any way to test a simulated annealing algorithm better than what I have now? Or should I just work on something else between tests?

Disclosure: this is an assignment for coursework, but I'm not asking you to help me with the actual simulation.

Comments (1)

給妳壹絲溫柔 2024-10-27 12:24:56


Here are a few debugging tricks I learned from implementing meta-heuristics (like Simulated Annealing and Tabu Search) in Drools Planner (Java, open source):

  1. It supports different environmentModes: DEBUG, REPRODUCIBLE (the default), and PRODUCTION. In DEBUG and REPRODUCIBLE mode, all code uses the same Random instance, and that instance's seed is fixed. Therefore, running a tabu search implementation twice gives exactly the same moves, steps, and resulting score. The simulated annealing implementation relies on the time gradient, so depending on the CPU at the time, there might be a slight difference. Note: this doesn't excuse you from doing statistical runs, but it does make a single run reproducible (see the Python sketch further below).
  2. Good, readable logging. Use the logging levels. Don't log too verbosely.
  3. Our build server (Hudson) keeps performance statistics.
  4. There is a Benchmarker tool which outputs diagrams to make it easier to see what the algorithm did (see the two figures below). So don't just look at the result, but also look at how it got there. It might have done great at first, but then got stuck in a local optimum later.

[Figure: Benchmarker summary]

[Figure: Benchmarker detail]

It's things like these that give you insight into what's actually going on in the algorithm.
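
To get the same single-run reproducibility in Python, here is a minimal sketch (anneal() is a hypothetical solver; the key is that every random decision goes through one seeded Random instance):

    import random

    def anneal(rng: random.Random) -> float:
        # Hypothetical solver: all stochastic choices (moves, acceptance
        # tests) must go through rng, never the global random module.
        return rng.random()

    # A fixed seed makes a single run exactly reproducible while debugging.
    print(anneal(random.Random(42)))
    print(anneal(random.Random(42)))  # prints the identical value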

Also, see my blog article on Simulated Annealing.
