Looking for ideas/references/keywords: adaptive parameter control for search algorithms (online learning)

Posted 2024-10-04 03:41:10


I'm looking for ideas/experiences/references/keywords regarding adaptive parameter control of search-algorithm parameters (online learning) in combinatorial optimization.

A bit more detail:

I have a framework which is responsible for optimizing a hard combinatorial-optimization problem. This is done with the help of some "small heuristics" which are used in an iterative manner (large-neighborhood search; ruin-and-recreate approach). Each of these "small heuristics" takes some external parameters which control the heuristic logic to some extent (at the moment: just random values; some kind of noise; to diversify the search).

Now I want a control framework that chooses these parameters in a convergence-improving way, as generally as possible, so that new heuristics can be added later without changing the parameter control.

There are at least two general decisions to make:

  • A: Choose the algorithm pair (one destroy and one rebuild algorithm) to use in the next iteration.
  • B: Choose the random parameters of the chosen algorithms.

The only feedback is the evaluation-function value of the newly found solution. That leads me to the topic of reinforcement learning. Is that the right direction?
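To make the setting concrete, here is a minimal sketch of the iteration loop (all names are hypothetical; the real framework differs). It shows that the only signal any parameter controller ever sees is the evaluation value:

```python
import random

def lns(initial, destroy_ops, rebuild_ops, evaluate, iterations=1000):
    """Minimal large-neighborhood-search skeleton (hypothetical names).
    The only feedback available to any parameter controller is the
    value of evaluate() on each candidate solution."""
    best = current = initial
    for _ in range(iterations):
        destroy = random.choice(destroy_ops)   # decision A, part 1
        rebuild = random.choice(rebuild_ops)   # decision A, part 2
        alpha = random.random()                # decision B: noise parameter
        candidate = rebuild(destroy(current, alpha), alpha)
        if evaluate(candidate) < evaluate(current):
            current = candidate                # accept only improvements here
        if evaluate(candidate) < evaluate(best):
            best = candidate
    return best
```

A smarter acceptance rule (simulated-annealing-style) and the adaptive choice of `destroy`/`rebuild`/`alpha` would plug into this loop in place of the `random.choice`/`random.random` calls.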

Not really learning-like behavior yet, but the simplistic ideas at the moment are:

  • A: Roulette-wheel selection according to some performance value collected during the iterations (the near past is valued more than older results).
    So if heuristic 1 found all the new global best solutions -> high probability of choosing this one.
  • B: No idea yet. Maybe it's possible to use some non-uniform random values in the range (0,1) while I collect some momentum of the changes.
    So if heuristic 1 last used alpha = 0.3 and found no new best solution, then used 0.6 and found a new best solution -> there is momentum towards 1
    -> the next random value is likely to be bigger than 0.3. Possible problem: oscillation!
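The recency-weighted roulette wheel from (A) can be sketched as an exponentially smoothed score per operator, roughly like the weight update in adaptive large neighborhood search (ALNS). The decay factor, reward values, and score floor below are made-up assumptions:

```python
import random

class RouletteSelector:
    """Adaptive operator selection: scores decay so the near past
    outweighs older observations (ALNS-style weight update).
    decay/reward constants are arbitrary assumptions."""
    def __init__(self, operators, decay=0.8, reward_best=10.0, reward_improve=2.0):
        self.scores = {op: 1.0 for op in operators}
        self.decay = decay
        self.reward_best = reward_best
        self.reward_improve = reward_improve

    def select(self):
        ops, weights = zip(*self.scores.items())
        return random.choices(ops, weights=weights)[0]

    def feedback(self, op, new_global_best=False, improved=False):
        reward = (self.reward_best if new_global_best
                  else self.reward_improve if improved else 0.0)
        # exponential smoothing: old evidence fades with each update;
        # the floor keeps every operator selectable (forced exploration)
        self.scores[op] = max(0.05, self.decay * self.scores[op]
                              + (1 - self.decay) * reward)
```

The score floor is one crude answer to the remark about changing parameter needs: an operator that is useless early but useful late never drops to zero probability.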

Things to remark:
- The parameters needed for good convergence of one specific algorithm can change dramatically -> maybe more diversifying operations are needed at the beginning and more intensifying operations at the end.
- There is a possibility of good synergistic effects in a specific pair of destroy/rebuild algorithms (sometimes called: coupled neighborhoods). How would one recognize something like that? Is that still in the reinforcement-learning area?
- The different algorithms are controlled by a different number of parameters (some taking 1, some taking 3).
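The momentum idea from (B) might look roughly like this (one parameter in (0,1) per heuristic; the step size is an arbitrary assumption). The clamping keeps the value a valid parameter, but this sketch does nothing against the oscillation problem mentioned above:

```python
import random

class MomentumParam:
    """Sketch of the momentum idea for decision B: drift in the
    direction of the last improving change, plus noise.
    The step size is a made-up constant."""
    def __init__(self, step=0.15):
        self.value = random.random()
        self.prev = self.value
        self.step = step

    def propose(self):
        # direction of the last accepted change (zero at the start),
        # plus uniform noise; clamp to stay a valid (0,1) parameter
        direction = self.value - self.prev
        candidate = self.value + direction + random.uniform(-self.step, self.step)
        return min(0.99, max(0.01, candidate))

    def feedback(self, proposed, improved):
        # only improving proposals update the momentum direction
        if improved:
            self.prev, self.value = self.value, proposed
```

Since each heuristic gets its own `MomentumParam` instances, the differing parameter counts (1 vs. 3) are just a list of these per heuristic.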

Any ideas, experiences, references (papers), keywords (ML topics)?
If there are ideas regarding decision (B) in an offline-learning manner, don't hesitate to mention those as well.

Thanks for all your input.

Sascha

Comments (2)

金橙橙 2024-10-11 03:41:10


You have a set of parameter variables which you use to control your set of algorithms. Selection of your algorithms is just another variable.

One approach you might like to consider is to evolve your 'parameter space' using a genetic algorithm. In short, GA uses an analogue of the processes of natural selection to successively breed ever better solutions.

You will need to develop an encoding scheme to represent your parameter space as a string, and then create a large population of candidate solutions as your starting generation. The genetic algorithm itself takes the fittest solutions in your set and then applies various genetic operators to them (mutation, reproduction etc.) to breed a better set which then become the next generation.

The most difficult part of this process is developing an appropriate fitness function: something to quantitatively measure the quality of a given parameter space. Your search problem may be too complex to measure for each candidate in the population, so you will need a proxy model function which might be as hard to develop as the ideal solution itself.

Without understanding more of what you've written it's hard to see whether this approach is viable or not. GA is usually well suited to multi-variable optimisation problems like this, but it's not a silver bullet. For a reference start with Wikipedia.
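A minimal real-coded GA over such an encoding could look like this (a sketch only; the fixed-length genome in [0,1], population size, mutation rate, and operator choices are all arbitrary assumptions, and the fitness callback is where the hard-to-build proxy model would go):

```python
import random

def evolve(fitness, genome_len, pop_size=30, generations=40,
           mutation_rate=0.1, elite=2):
    """Tiny real-coded genetic algorithm (a sketch, not a tuned library).
    Each genome is a vector in [0,1]^genome_len encoding the parameter
    space; fitness is maximized."""
    pop = [[random.random() for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_gen = pop[:elite]                            # elitism
        while len(next_gen) < pop_size:
            a, b = random.sample(pop[:pop_size // 2], 2)  # truncation selection
            cut = random.randrange(1, genome_len) if genome_len > 1 else 0
            child = a[:cut] + b[cut:]                     # one-point crossover
            child = [min(1.0, max(0.0, g + random.gauss(0, 0.1)))
                     if random.random() < mutation_rate else g
                     for g in child]                      # Gaussian mutation
            next_gen.append(child)
        pop = next_gen
    return max(pop, key=fitness)
```

Here each `fitness` call would mean running the search with the decoded parameters (or the proxy model), which is exactly the expensive part the answer warns about.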

风尘浪孓 2024-10-11 03:41:10


This sounds like hyper-heuristics, which is what you're trying to do. Try searching for that keyword.

In Drools Planner (open source, Java) I have support for tabu search and simulated annealing out of the box.
I haven't implemented the ruin-and-recreate approach (yet), but that should be easy, although I am not expecting better results. Challenge: prove me wrong, fork it, add it, and beat me in the examples.
Hyper-heuristics are on my TODO list.
