模拟退火和遗传算法有什么区别?

发布于 2024-09-30 09:32:58 字数 161 浏览 12 评论 0原文

模拟退火(使用 bean 搜索)和遗传算法在性能和用例方面有哪些相关差异?

我知道 SA 可以被认为是人口规模只有 1 的 GA,但我不知道两者之间的关键区别。

另外,我正在尝试考虑 SA 将优于 GA 或 GA 将优于 SA 的情况。只需一个简单的例子来帮助我理解就足够了。

What are the relevant differences, in terms of performance and use cases, between simulated annealing (with bean search) and genetic algorithms?

I know that SA can be thought as GA where the population size is only one, but I don't know the key difference between the two.

Also, I am trying to think of a situation where SA will outperform GA or GA will outperform SA. Just one simple example which will help me understand will be enough.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

掀纱窥君容 2024-10-07 09:32:58

严格来说,模拟退火(SA)和遗传算法这两个东西既不是算法,也不是它们的目的“数据挖掘”。

两者都是元启发式——在抽象尺度上比“算法”高出几个层次。换句话说,这两个术语都指的是高级隐喻——一个借用冶金学,另一个借用进化生物学。在元启发式分类法中,SA 是一种单状态方法,而 GA 是一种群体方法(与 PSO、ACO 等属于一个子类,通常称为受生物学启发的元启发式)。

这两种元启发式方法用于解决优化问题,特别是(但不限于)组合优化(又名约束满足编程)。组合优化是指通过从一组离散项中进行选择来进行优化,换句话说,不存在可最小化的连续函数。背包问题、旅行商问题、库存问题——都是组合优化问题。

与数据挖掘的联系是,许多(大多数?)监督机器学习(ML)算法的核心是优化问题的解决方案——(例如多层感知器和支持向量机)。

任何解决上限问题的解决方案技术,无论采用何种算法,都基本上由以下步骤组成(通常在递归循环中编码为单个块):

  1. 对特定于域的详细信息进行编码
    在成本函数中(它是
    逐步使值最小化
    从该函数返回
    构成 c/o 的“解决方案”
    问题);

  2. 评估成本函数传递
    在最初的“猜测”中(开始
    iteration);

  3. 基于从返回的值
    成本函数,生成后续的
    候选解(或超过
    一、取决于
    元启发式)到成本
    函数;

  4. 通过以下方式评估每个候选解决方案
    将其传递到参数集中
    成本函数;

  5. 重复步骤(iii)和(iv)直到
    某个收敛标准是
    满足或最大数量
    达到迭代次数。

元启发法针对上述步骤(iii);因此,SA 和 GA 的不同之处在于它们如何生成候选解决方案以通过成本函数进行评估。换句话说,这是了解这两种元启发法有何不同的地方。

通俗地说,针对组合优化解的算法的本质是如何处理从成本函数返回的值比当前最佳候选解更差的候选解(从成本函数返回最低值的函数)。优化算法处理此类候选解决方案的最简单方法是彻底拒绝它——这就是爬山算法的作用。但通过这样做,简单的爬山总是会错过与当前解决方案相隔一座山的更好的解决方案。换句话说,复杂的优化算法必须包括一种(暂时)接受比当前最佳解决方案更差(即上坡)的候选解决方案的技术,因为比当前解决方案更好的解决方案可能位于通过该更差解决方案的路径上解决方案。


那么SA和GA如何生成候选解呢?

SA的本质通常用更高成本的候选解决方案被接受的概率来表达(双括号内的整个表达式是一个指数:

p = e((-highCost - lowCost)/temperature)

或者在Python中:

p = pow(math.e, (-hiCost - loCost) / T)

“温度”项是一个变量,其值在优化的进展 - 因此,随着迭代次数的增加,SA 接受更差解决方案的概率会降低。

换句话说,当算法开始迭代时,T 非常大,正如您所看到的,这会导致算法移动。对于每个新创建的候选解决方案,无论是比当前最佳解决方案更好还是更差——即,随着迭代次数的增加(即,随着温度的冷却),它都会在解决方案空间中进行随机游走。 0 时,其行为与简单的爬山算法相同(即,仅接受优于当前最佳解决方案的解决方案)。

算法对解空间的搜索变得不那么宽松,直到 T = 算法有很大不同。一方面(这是一件大事),它生成的不是单个候选解决方案,而是整个“总体”。它的工作原理如下:GA 对群体的每个成员(候选解决方案)调用成本函数。然后,它按照成本函数返回的值(“最佳”具有最低值)对它们进行从最好到最差的排序。根据这些排名值(及其相应的候选解决方案)创建下一个群体。人口的新成员基本上是通过以下三种方式之一创建的。第一种通常被称为“精英主义”,在实践中通常指的是仅采用排名最高的候选解决方案并将其直接传递给下一代(不加修改)。种群新成员的另外两种方式通常被称为“突变”和“交叉”。突变通常涉及改变当前群体的候选解向量中的一个元素以在新群体中创建解向量,例如,[4,5,1,0,2]=>1。 [4,5,2,0,2]。交叉操作的结果就像向量可以有性别时会发生的情况一样,即一个新的子向量,其元素由来自两个父向量中的每一个的一些元素组成。

这些就是 GA 和 SA 之间的算法差异。性能上的差异又如何呢?

在实践中:(我的观察仅限于组合优化问题)GA 几乎总是击败 SA(从成本函数返回较低的“最佳”返回值,即接近解决方案空间的全局最小值的值),但更高计算成本。据我所知,教科书和技术出版物对分辨率的结论是相同的。

但事情是这样的:GA 本质上是可并行的;更重要的是,这样做很简单,因为组成每个群体的各个“搜索代理”不需要交换消息,即它们彼此独立地工作。 ,你可以获得更好的结果(更接近全局最小值)和更好的性能(执行速度)。

显然这意味着GA计算可以是分布式的,这意味着在实践中 SA在什么情况下可能优于GA?我认为一般情况是那些解决方案空间较小的优化问题,因此 SA 和 GA 的结果几乎相同,但执行上下文(例如,以批处理模式运行的数百个类似问题)有利于更快的算法(其中应始终为 SA)。

Well strictly speaking, these two things--simulated annealing (SA) and genetic algorithms are neither algorithms nor is their purpose 'data mining'.

Both are meta-heuristics--a couple of levels above 'algorithm' on the abstraction scale. In other words, both terms refer to high-level metaphors--one borrowed from metallurgy and the other from evolutionary biology. In the meta-heuristic taxonomy, SA is a single-state method and GA is a population method (in a sub-class along with PSO, ACO, et al, usually referred to as biologically-inspired meta-heuristics).

These two meta-heuristics are used to solve optimization problems, particularly (though not exclusively) in combinatorial optimization (aka constraint-satisfaction programming). Combinatorial optimization refers to optimization by selecting from among a set of discrete items--in other words, there is no continuous function to minimize. The knapsack problem, traveling salesman problem, cutting stock problem--are all combinatorial optimization problems.

The connection to data mining is that the core of many (most?) supervised Machine Learning (ML) algorithms is the solution of an optimization problem--(Multi-Layer Perceptron and Support Vector Machines for instance).

Any solution technique to solve cap problems, regardless of the algorithm, will consist essentially of these steps (which are typically coded as a single block within a recursive loop):

  1. encode the domain-specific details
    in a cost function (it's the
    step-wise minimization of the value
    returned from this function that
    constitutes a 'solution' to the c/o
    problem);

  2. evaluate the cost function passing
    in an initial 'guess' (to begin
    iteration);

  3. based on the value returned from the
    cost function, generate a subsequent
    candidate solution (or more than
    one, depending on the
    meta-heuristic) to the cost
    function;

  4. evaluate each candidate solution by
    passing it in an argument set, to
    the cost function;

  5. repeat steps (iii) and (iv) until
    either some convergence criterion is
    satisfied or a maximum number of
    iterations is reached.

Meta-heuristics are directed to step (iii) above; hence, SA and GA differ in how they generate candidate solutions for evaluation by the cost function. In other words, that's the place to look to understand how these two meta-heuristics differ.

Informally, the essence of an algorithm directed to solution of combinatorial optimization is how it handles a candidate solution whose value returned from the cost function is worse than the current best candidate solution (the one that returns the lowest value from the cost function). The simplest way for an optimization algorithm to handle such a candidate solution is to reject it outright--that's what the hill climbing algorithm does. But by doing this, simple hill climbing will always miss a better solution separated from the current solution by a hill. Put another way, a sophisticated optimization algorithm has to include a technique for (temporarily) accepting a candidate solution worse than (i.e., uphill from) the current best solution because an even better solution than the current one might lie along a path through that worse solution.


So how do SA and GA generate candidate solutions?

The essence of SA is usually expressed in terms of the probability that a higher-cost candidate solution will be accepted (the entire expression inside the double parenthesis is an exponent:

p = e((-highCost - lowCost)/temperature)

Or in python:

p = pow(math.e, (-hiCost - loCost) / T)

The 'temperature' term is a variable whose value decays during progress of the optimization--and therefore, the probability that SA will accept a worse solution decreases as iteration number increases.

Put another way, when the algorithm begins iterating, T is very large, which as you can see, causes the algorithm to move to every newly created candidate solution, whether better or worse than the current best solution--i.e., it is doing a random walk in the solution space. As iteration number increases (i.e., as the temperature cools) the algorithm's search of the solution space becomes less permissive, until at T = 0, the behavior is identical to a simple hill-climbing algorithm (i.e., only solutions better than the current best solution are accepted).

Genetic Algorithms are very different. For one thing--and this is a big thing--it generates not a single candidate solution but an entire 'population of them'. It works like this: GA calls the cost function on each member (candidate solution) of the population. It then ranks them, from best to worse, ordered by the value returned from the cost function ('best' has the lowest value). From these ranked values (and their corresponding candidate solutions) the next population is created. New members of the population are created in essentially one of three ways. The first is usually referred to as 'elitism' and in practice usually refers to just taking the highest ranked candidate solutions and passing them straight through--unmodified--to the next generation. The other two ways that new members of the population are usually referred to as 'mutation' and 'crossover'. Mutation usually involves a change in one element in a candidate solution vector from the current population to create a solution vector in the new population, e.g., [4, 5, 1, 0, 2] => [4, 5, 2, 0, 2]. The result of the crossover operation is like what would happen if vectors could have sex--i.e., a new child vector whose elements are comprised of some from each of two parents.

So those are the algorithmic differences between GA and SA. What about the differences in performance?

In practice: (my observations are limited to combinatorial optimization problems) GA nearly always beats SA (returns a lower 'best' return value from the cost function--ie, a value close to the solution space's global minimum), but at a higher computation cost. As far as i am aware, the textbooks and technical publications recite the same conclusion on resolution.

but here's the thing: GA is inherently parallelizable; what's more, it's trivial to do so because the individual "search agents" comprising each population do not need to exchange messages--ie, they work independently of each other. Obviously that means GA computation can be distributed, which means in practice, you can get much better results (closer to the global minimum) and better performance (execution speed).

In what circumstances might SA outperform GA? The general scenario i think would be those optimization problems having a small solution space so that the result from SA and GA are practically the same, yet the execution context (e.g., hundreds of similar problems run in batch mode) favors the faster algorithm (which should always be SA).

凑诗 2024-10-07 09:32:58

比较两者确实很困难,因为它们受到不同领域的启发。

遗传算法维护一组可能的解决方案,并在每一步选择可能的解决方案对,将它们组合(交叉),并应用一些随机变化(突变)。该算法基于“适者生存”的思想,其中选择过程是根据适应度标准完成的(通常在优化问题中,它只是使用当前解决方案评估的目标函数的值)。完成交叉是希望两个好的解决方案结合起来可以提供更好的解决方案。

另一方面,模拟退火仅跟踪可能解决方案空间中的一个解决方案,并且在每次迭代时根据一些概率(随着时间的推移而衰减)考虑是移动到相邻的解决方案还是停留在当前的解决方案中。这与启发式搜索(例如贪婪搜索)不同,因为它不会遇到局部最优问题,因为它可以摆脱所有相邻解决方案都比当前解决方案最差的情况。

It is really difficult to compare the two since they were inspired from different domains..

A Genetic Algorithm maintains a population of possible solutions, and at each step, selects pairs of possible solution, combines them (crossover), and applies some random changes (mutation). The algorithm is based the idea of "survival of the fittest" where the selection process is done according to a fitness criteria (usually in optimization problems it is simply the value of the objective function evaluated using the current solution). The crossover is done in hope that two good solutions, when combined, might give even better solution.

On the other hand, Simulated Annealing only tracks one solution in the space of possible solutions, and at each iteration considers whether to move to a neighboring solution or stay in the current one according to some probabilities (which decays over time). This is different from a heuristic search (say greedy search) in that it doesn't suffer from the problems of local optimum since it can get unstuck from cases where all neighboring solutions are worst the current one.

—━☆沉默づ 2024-10-07 09:32:58

我远不是这些算法的专家,但我会尽力提供帮助。

我认为两者之间最大的区别是 GA 中交叉的想法,因此任何比 SA 更适合 GA 的学习任务示例都取决于交叉在这种情况下的含义及其实现方式。

交叉的想法是,您可以有意义地组合两种解决方案以产生更好的解决方案。我认为只有当问题的解决方案以某种方式构建时这才有意义。例如,我可以想象,在多类分类中,采用两个(或多个)擅长对特定类进行分类的分类器,并通过投票将它们组合起来以形成更好的分类器。另一个例子可能是遗传编程,其中解决方案可以表示为树,但我发现它很难想出一个很好的例子来结合两个程序来创建一个更好的程序。

我认为很难为其中一个提出令人信服的案例,因为它们确实是非常相似的算法,也许是从非常不同的起点开发的。

I'm far from an expert on these algorithms, but I'll try and help out.

I think the biggest difference between the two is the idea of crossover in GA and so any example of a learning task that is better suited to GA than SA is going to hinge on what crossover means in that situation and how it is implemented.

The idea of crossover is that you can meaningfully combine two solutions to produce a better one. I think this only makes sense if the solutions to a problem are structured in some way. I could imagine, for example, in multi-class classification taking two (or many) classifiers that are good at classifying a particular class and combining them by voting to make a much better classifier. Another example might be Genetic Programming, where the solution can be expressed as a tree, but I find it hard to come up with a good example where you could combine two programs to create a better one.

I think it's difficult to come up with a compelling case for one over the other because they really are quite similar algorithms, perhaps having been developed from very different starting points.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文