- Introduction to Python
- Getting started with Python and the IPython notebook
- Functions are first class objects
- Data science is OSEMN
- Working with text
- Preprocessing text data
- Working with structured data
- Using SQLite3
- Using HDF5
- Using numpy
- Using Pandas
- Computational problems in statistics
- Computer numbers and mathematics
- Algorithmic complexity
- Linear Algebra and Linear Systems
- Linear Algebra and Matrix Decompositions
- Change of Basis
- Optimization and Non-linear Methods
- Practical Optimization Routines
- Finding roots
- Optimization Primer
- Using scipy.optimize
- Gradient descent
- Newton’s method and variants
- Constrained optimization
- Curve fitting
- Finding parameters for ODE models
- Optimization of graph node placement
- Optimization of standard statistical models
- Fitting ODEs with the Levenberg–Marquardt algorithm
- 1D example
- 2D example
- Algorithms for Optimization and Root Finding for Multivariate Problems
- Expectation Maximization (EM) Algorithm
- Monte Carlo Methods
- Resampling methods
- Resampling
- Simulations
- Setting the random seed
- Sampling with and without replacement
- Calculation of Cook’s distance
- Permutation resampling
- Design of simulation experiments
- Example: Simulations to estimate power
- Check with R
- Estimating the CDF
- Estimating the PDF
- Kernel density estimation
- Multivariate kernel density estimation
- Markov Chain Monte Carlo (MCMC)
- Using PyMC2
- Using PyMC3
- Using PyStan
- C Crash Course
- Code Optimization
- Using C code in Python
- Using functions from various compiled languages in Python
- Julia and Python
- Converting Python Code to C for speed
- Optimization bake-off
- Writing Parallel Code
- Massively parallel programming with GPUs
- Writing CUDA in C
- Distributed computing for Big Data
- Hadoop MapReduce on AWS EMR with mrjob
- Spark on a local machine using 4 nodes
- Modules and Packaging
- Tour of the Jupyter (IPython3) notebook
- Polyglot programming
- What you should know and learn more about
- Wrapping R libraries with Rpy
Kernel density estimation
Kernel density estimation is a form of convolution, usually with a symmetric kernel (e.g. a Gaussian). The degree of smoothing is determined by a bandwidth parameter.
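In symbols, given data $y_1, \dots, y_n$, a kernel $K$, and a bandwidth $h$, the estimate at a point $x$ is the average of scaled kernels centered at each data point:

$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - y_i}{h}\right)$$

This is exactly what the kde function below computes.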
    import numpy as np
    import matplotlib.pyplot as plt

    def epanechnikov(u):
        """Epanechnikov kernel."""
        return np.where(np.abs(u) <= np.sqrt(5),
                        3/(4*np.sqrt(5)) * (1 - u*u/5.0),
                        0)
    def silverman(y):
        """Bandwidth from the heuristic suggested by Silverman:
        0.9 * min(standard deviation, interquartile range/1.34) * n**(-1/5)
        """
        n = len(y)
        iqr = np.subtract(*np.percentile(y, [75, 25]))
        h = 0.9*np.min([y.std(ddof=1), iqr/1.34])*n**-0.2
        return h
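Written out, Silverman's rule of thumb is

$$h = 0.9\,\min\!\left(\hat{\sigma},\ \frac{\mathrm{IQR}}{1.34}\right) n^{-1/5}$$

where $\hat{\sigma}$ is the sample standard deviation; dividing the interquartile range by 1.34 gives a robust estimate of $\sigma$ for roughly Gaussian data.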
    def kde(x, y, bandwidth=silverman, kernel=epanechnikov):
        """Returns the kernel density estimate.

        x: points at which to evaluate the estimate
        y: the data to be fitted
        bandwidth: a function that returns the smoothing parameter h
        kernel: a function that gives weights to neighboring data
        """
        h = bandwidth(y)
        return np.sum(kernel((x - y[:, None])/h)/h, axis=0)/len(y)
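The plots below evaluate the estimate on a data sample x that was defined earlier in the notebook and is not shown in this excerpt. As a stand-in so the snippets run on their own, a minimal sketch with an assumed bimodal mixture (not the original data) is:

    # Illustrative stand-in for the sample `x` used below; the original
    # notebook defines x earlier, so this mixture is only an assumption.
    np.random.seed(123)
    x = np.concatenate([np.random.normal(0, 1, 100),
                        np.random.normal(4, 1, 100)])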
    xs = np.linspace(-5, 8, 100)
    density = kde(xs, x)
    plt.plot(xs, density);
There are several kernel density estimation routines available in scipy, statsmodels and scikit-learn. Here we will use the scikit-learn and statsmodels routines as examples.
    import statsmodels.api as sm

    # fit a KDE with a Gaussian kernel ('gau', the default)
    dens = sm.nonparametric.KDEUnivariate(x)
    dens.fit(kernel='gau')
    plt.plot(xs, dens.evaluate(xs));
    from sklearn.neighbors import KernelDensity

    # scikit-learn expects an n x p matrix with p features
    x.shape = (len(x), 1)
    xs.shape = (len(xs), 1)

    kde = KernelDensity(kernel='epanechnikov', bandwidth=0.5).fit(x)
    dens = np.exp(kde.score_samples(xs))
    plt.plot(xs, dens);
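For comparison, scipy's routine is scipy.stats.gaussian_kde. A minimal sketch, assuming the same data: gaussian_kde expects a 1-D array (or a d x n matrix) rather than the column vectors used by scikit-learn, so the reshaped arrays are flattened first.

    from scipy.stats import gaussian_kde

    # gaussian_kde wants shape (n,) or (d, n), so undo the
    # column-vector reshape done for scikit-learn above
    dens_scipy = gaussian_kde(x.ravel())
    plt.plot(xs.ravel(), dens_scipy(xs.ravel()));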