- Introduction to Python
- Getting started with Python and the IPython notebook
- Functions are first class objects
- Data science is OSEMN
- Working with text
- Preprocessing text data
- Working with structured data
- Using SQLite3
- Using HDF5
- Using numpy
- Using Pandas
- Computational problems in statistics
- Computer numbers and mathematics
- Algorithmic complexity
- Linear Algebra and Linear Systems
- Linear Algebra and Matrix Decompositions
- Change of Basis
- Optimization and Non-linear Methods
- Practical Optimization Routines
- Finding roots
- Optimization Primer
- Using scipy.optimize
- Gradient descent
- Newton’s method and variants
- Constrained optimization
- Curve fitting
- Finding parameters for ODE models
- Optimization of graph node placement
- Optimization of standard statistical models
- Fitting ODEs with the Levenberg–Marquardt algorithm
- 1D example
- 2D example
- Algorithms for Optimization and Root Finding for Multivariate Problems
- Expectation Maximization (EM) Algorithm
- Monte Carlo Methods
- Resampling methods
- Resampling
- Simulations
- Setting the random seed
- Sampling with and without replacement
- Calculation of Cook’s distance
- Permutation resampling
- Design of simulation experiments
- Example: Simulations to estimate power
- Check with R
- Estimating the CDF
- Estimating the PDF
- Kernel density estimation
- Multivariate kernel density estimation
- Markov Chain Monte Carlo (MCMC)
- Using PyMC2
- Using PyMC3
- Using PyStan
- C Crash Course
- Code Optimization
- Using C code in Python
- Using functions from various compiled languages in Python
- Julia and Python
- Converting Python Code to C for speed
- Optimization bake-off
- Writing Parallel Code
- Massively parallel programming with GPUs
- Writing CUDA in C
- Distributed computing for Big Data
- Hadoop MapReduce on AWS EMR with mrjob
- Spark on a local machine using 4 nodes
- Modules and Packaging
- Tour of the Jupyter (IPython3) notebook
- Polyglot programming
- What you should know and learn more about
- Wrapping R libraries with Rpy
Newton’s method and variants
Recall Newton’s method for finding roots of a univariate function
\[x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}\]
When we are looking for a minimum, we are looking for the roots of the derivative, so
\[x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)}\]
Newton’s method can also be seen as a Taylor series approximation
\[f(x+h) = f(x) + h f'(x) + \frac{h^2}{2}f''(x)\]
At the function minimum, the derivative of this approximation with respect to \(h\) is 0, so
\[0 = f'(x) + h f''(x)\]
and taking the Newton step to be \(\Delta x = h\), we get
\[\Delta x = - \frac{f'(x)}{f''(x)}\]
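As a concrete illustration, here is a minimal sketch of this one-dimensional iteration; the test function, derivatives, starting point and tolerance below are illustrative choices, not part of the original notes.

def newton_min_1d(df, d2f, x, tol=1e-8, maxiter=50):
    """Minimal sketch: iterate x <- x - f'(x)/f''(x) until the step is tiny."""
    for _ in range(maxiter):
        step = -df(x) / d2f(x)
        x += step
        if abs(step) < tol:
            break
    return x

# Illustrative example: f(x) = x**4 + x**2 - 2*x, so f'(x) = 4x^3 + 2x - 2 and f''(x) = 12x^2 + 2
print(newton_min_1d(lambda x: 4*x**3 + 2*x - 2, lambda x: 12*x**2 + 2, 1.0))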
The multivariate analog replaces \(f'\) with the gradient \(\nabla f\) and \(f''\) with the Hessian \(H\), so the Newton step is
\[\Delta x = -H^{-1}(x) \nabla f(x)\]
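As a sketch of how this step is computed in practice, the linear system \(H \Delta x = -\nabla f\) is solved at each iteration rather than forming \(H^{-1}\) explicitly; the loop below applies this to the Rosenbrock function using scipy's helpers (the starting point and iteration budget are arbitrary illustrative choices).

import numpy as np
from scipy.optimize import rosen, rosen_der, rosen_hess

x = np.array([-1.2, 1.0])    # classic Rosenbrock starting point, chosen for illustration
for i in range(50):
    # Newton step: solve H @ dx = -grad rather than inverting the Hessian
    dx = np.linalg.solve(rosen_hess(x), -rosen_der(x))
    x = x + dx
    if np.linalg.norm(dx) < 1e-10:
        break
print(i, x, rosen(x))        # converges to the minimum at (1, 1)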
Second-order methods require calculation of the Hessian (either provided or approximated using finite differences). For efficiency, the Hessian is not inverted directly; instead the Newton system is solved using a variety of methods such as conjugate gradient. An example of a second-order method in the optimize package is Newton-CG.
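The notebook cells below rely on definitions made earlier in the notes: the opt alias, the starting point x0, the reporter callback that records each iterate in ps, and the contour grid X, Y, Z over which the Rosenbrock function is plotted. A minimal sketch of that setup, with illustrative values rather than the originals, might look like:

import numpy as np
import scipy.optimize as opt
import matplotlib.pyplot as plt
from scipy.optimize import rosen

x0 = np.array([4.0, -4.1])   # illustrative starting point for the examples below

def reporter(p):
    """Callback passed to opt.minimize; records each iterate so the path can be plotted."""
    global ps
    ps.append(np.asarray(p).copy())

# Grid for drawing the Rosenbrock contours
xg = np.linspace(-5, 5, 100)
yg = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(xg, yg)
Z = rosen(np.vstack([X.ravel(), Y.ravel()])).reshape(X.shape)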
from scipy.optimize import rosen, rosen_der, rosen_hess

ps = [x0]
opt.minimize(rosen, x0, method='Newton-CG', jac=rosen_der, hess=rosen_hess, callback=reporter)
status: 0
success: True
njev: 63
nfev: 38
fun: 1.3642782750354208e-13
x: array([ 1.,  1.])
message: 'Optimization terminated successfully.'
nhev: 26
jac: array([ 1.2120e-04, -6.0850e-05])
ps = np.array(ps)
plt.figure(figsize=(12,4))
plt.subplot(121)
plt.contour(X, Y, Z, np.arange(10)**5)
plt.plot(ps[:, 0], ps[:, 1], '-o')
plt.subplot(122)
plt.semilogy(range(len(ps)), rosen(ps.T));
As calculating the Hessian is computationally expensive, first-order methods only use the first derivatives. Quasi-Newton methods use functions of the first derivatives to approximate the inverse Hessian. A well-known example of the quasi-Newton class of algorithms is BFGS, named after the initials of its creators. As usual, the first derivatives can either be provided via the jac= argument or approximated by finite difference methods.
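For reference, the standard form of the BFGS inverse-Hessian update (not spelled out in these notes): with \(s_k = x_{k+1} - x_k\), \(y_k = \nabla f(x_{k+1}) - \nabla f(x_k)\) and \(\rho_k = 1/(y_k^T s_k)\), the approximation is updated as
\[H_{k+1}^{-1} = (I - \rho_k s_k y_k^T)\, H_k^{-1}\, (I - \rho_k y_k s_k^T) + \rho_k s_k s_k^T\]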
ps = [x0]
opt.minimize(rosen, x0, method='BFGS', callback=reporter)
status: 2
success: False
njev: 92
nfev: 379
hess_inv: array([[ 0.5004,  1.0009], [ 1.0009,  2.0072]])
fun: 1.2922663663359423e-12
x: array([ 1.,  1.])
message: 'Desired error not necessarily achieved due to precision loss.'
jac: array([ 5.1319e-05, -2.1227e-05])
ps = np.array(ps)
plt.figure(figsize=(12,4))
plt.subplot(121)
plt.contour(X, Y, Z, np.arange(10)**5)
plt.plot(ps[:, 0], ps[:, 1], '-o')
plt.subplot(122)
plt.semilogy(range(len(ps)), rosen(ps.T));
Finally, there are some optimization algorithms not based on the Newton method, but on other heuristic search strategies that do not require any derivatives, only function evaluations. One well-known example is the Nelder-Mead simplex algorithm.
ps = [x0]
opt.minimize(rosen, x0, method='nelder-mead', callback=reporter)
status: 0
nfev: 162
success: True
fun: 5.262756878429089e-10
x: array([ 1.,  1.])
message: 'Optimization terminated successfully.'
nit: 85
ps = np.array(ps)
plt.figure(figsize=(12,4))
plt.subplot(121)
plt.contour(X, Y, Z, np.arange(10)**5)
plt.plot(ps[:, 0], ps[:, 1], '-o')
plt.subplot(122)
plt.semilogy(range(len(ps)), rosen(ps.T));