
Newton’s method and variants


Recall Newton’s method for finding roots of a univariate function

\[x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}\]

When we are looking for a minimum, we are looking for the roots of the derivative, so

\[x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)}\]
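
For concreteness, here is a minimal sketch of this iteration used for minimization, i.e. Newton root finding applied to \(f'\); the helper name newton_min, the test function, and the starting point are illustrative choices, not part of the original notes.

def newton_min(fprime, fprime2, x, tol=1e-8, max_iter=50):
    """Minimize a univariate function by running Newton's root finding on its derivative."""
    for _ in range(max_iter):
        step = fprime(x) / fprime2(x)
        x = x - step
        if abs(step) < tol:
            break
    return x

# f(x) = x**4 - 3*x**3 + 2 has a local minimum at x = 9/4, where f'(x) = 4*x**3 - 9*x**2 = 0
newton_min(lambda x: 4*x**3 - 9*x**2, lambda x: 12*x**2 - 18*x, x=3.0)  # -> approximately 2.25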

Newton’s method can also be derived from a second-order Taylor series approximation

\[f(x+h) \approx f(x) + h f'(x) + \frac{h^2}{2}f''(x)\]

This quadratic model of \(f\) around \(x\) is minimized where its derivative with respect to \(h\) is 0,

\[0 = f'(x) + h f''(x)\]

and taking the step \(\Delta x = h\) that solves this, we get that the Newton step is

\[\Delta x = - \frac{f'(x)}{f''(x)}\]

Because the model is exact for a quadratic function, a single Newton step then lands on the minimum; for general \(f\) the step is iterated.

The multivariate analog replaces \(f'\) with the gradient \(\nabla f\) (the Jacobian of a scalar function, hence jac in SciPy) and \(f''\) with the Hessian \(H\), so the Newton step is

\[\Delta x = -H^{-1}(x) \nabla f(x)\]
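
As a quick sketch, a single multivariate Newton step for the Rosenbrock function can be computed with SciPy's rosen_der and rosen_hess; the starting point below is an arbitrary choice, and the linear system is solved rather than forming \(H^{-1}\) explicitly.

import numpy as np
from scipy.optimize import rosen_der, rosen_hess

x = np.array([4.0, -4.1])                           # arbitrary starting point
dx = -np.linalg.solve(rosen_hess(x), rosen_der(x))  # solve H dx = -grad f instead of inverting H
x_new = x + dx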

Second-order methods require the Hessian (either provided or approximated using finite differences) in order to apply the Newton step. For efficiency, the Hessian is not inverted directly; instead the linear system \(H \Delta x = -\nabla f\) is solved using methods such as conjugate gradient. An example of a second-order method in the optimize package is Newton-CG.
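
The examples below reuse some setup from earlier cells in these notes: a starting point x0, a reporter callback that records every iterate in the list ps, and a grid X, Y, Z for contour plots of the Rosenbrock function. A minimal reconstruction of that setup is sketched here; the particular starting point and grid limits are assumptions.

import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt
from scipy.optimize import rosen

x0 = np.array([4.0, -4.1])   # assumed starting point

def reporter(p):
    # record every intermediate point visited by the optimizer
    ps.append(p)

# contour grid for the Rosenbrock function (assumed plotting range)
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)
Z = rosen(np.vstack([X.ravel(), Y.ravel()])).reshape(X.shape)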

from scipy.optimize import rosen, rosen_der, rosen_hess
ps = [x0]
opt.minimize(rosen, x0, method='Newton-CG', jac=rosen_der, hess=rosen_hess, callback=reporter)
 status: 0
success: True
   njev: 63
   nfev: 38
    fun: 1.3642782750354208e-13
      x: array([ 1.,  1.])
message: 'Optimization terminated successfully.'
   nhev: 26
    jac: array([  1.2120e-04,  -6.0850e-05])
ps = np.array(ps)
plt.figure(figsize=(12,4))
plt.subplot(121)
plt.contour(X, Y, Z, np.arange(10)**5)
plt.plot(ps[:, 0], ps[:, 1], '-o')          # optimization path over the contours
plt.subplot(122)
plt.semilogy(range(len(ps)), rosen(ps.T));  # objective value at each recorded iterate

As calculating the Hessian is computationally expensive, first-order methods use only the first derivatives. Quasi-Newton methods build an approximation to the inverse Hessian from successive gradient evaluations. A well-known member of the quasi-Newton family is BFGS, named after the initials of its creators. As usual, the first derivatives can either be provided via the jac= argument or approximated by finite differences; a short sketch of passing jac= explicitly follows the BFGS example below.

ps = [x0]
opt.minimize(rosen, x0, method='BFGS', callback=reporter)
  status: 2
 success: False
    njev: 92
    nfev: 379
hess_inv: array([[ 0.5004,  1.0009],
      [ 1.0009,  2.0072]])
     fun: 1.2922663663359423e-12
       x: array([ 1.,  1.])
 message: 'Desired error not necessarily achieved due to precision loss.'
     jac: array([  5.1319e-05,  -2.1227e-05])
ps = np.array(ps)
plt.figure(figsize=(12,4))
plt.subplot(121)
plt.contour(X, Y, Z, np.arange(10)**5)
plt.plot(ps[:, 0], ps[:, 1], '-o')
plt.subplot(122)
plt.semilogy(range(len(ps)), rosen(ps.T));
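
As noted above, the gradient can also be supplied explicitly to BFGS instead of being estimated by finite differences; a minimal sketch reusing SciPy's rosen_der:

ps = [x0]
opt.minimize(rosen, x0, method='BFGS', jac=rosen_der, callback=reporter)

Supplying the analytic gradient avoids the extra function evaluations used for finite differencing, so the nfev count in the result is typically much smaller.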

Finally, there are some optimization algorithms not based on the Newton method, but on other heuristic search strategies that do not require any derivatives, only function evaluations. One well-known example is the Nelder-Mead simplex algorithm.

ps = [x0]
opt.minimize(rosen, x0, method='nelder-mead', callback=reporter)
 status: 0
   nfev: 162
success: True
    fun: 5.262756878429089e-10
      x: array([ 1.,  1.])
message: 'Optimization terminated successfully.'
    nit: 85
ps = np.array(ps)
plt.figure(figsize=(12,4))
plt.subplot(121)
plt.contour(X, Y, Z, np.arange(10)**5)
plt.plot(ps[:, 0], ps[:, 1], '-o')
plt.subplot(122)
plt.semilogy(range(len(ps)), rosen(ps.T));
