- Introduction to Python
- Getting started with Python and the IPython notebook
- Functions are first class objects
- Data science is OSEMN
- Working with text
- Preprocessing text data
- Working with structured data
- Using SQLite3
- Using HDF5
- Using numpy
- Using Pandas
- Computational problems in statistics
- Computer numbers and mathematics
- Algorithmic complexity
- Linear Algebra and Linear Systems
- Linear Algebra and Matrix Decompositions
- Change of Basis
- Optimization and Non-linear Methods
- Practical Optimization Routines
- Finding roots
- Optimization Primer
- Using scipy.optimize
- Gradient descent
- Newton’s method and variants
- Constrained optimization
- Curve fitting
- Finding parameters for ODE models
- Optimization of graph node placement
- Optimization of standard statistical models
- Fitting ODEs with the Levenberg–Marquardt algorithm
- 1D example
- 2D example
- Algorithms for Optimization and Root Finding for Multivariate Problems
- Expectation Maximization (EM) Algorithm
- Monte Carlo Methods
- Resampling methods
- Resampling
- Simulations
- Setting the random seed
- Sampling with and without replacement
- Calculation of Cook’s distance
- Permutation resampling
- Design of simulation experiments
- Example: Simulations to estimate power
- Check with R
- Estimating the CDF
- Estimating the PDF
- Kernel density estimation
- Multivariate kernel density estimation
- Markov Chain Monte Carlo (MCMC)
- Using PyMC2
- Using PyMC3
- Using PyStan
- C Crash Course
- Code Optimization
- Using C code in Python
- Using functions from various compiled languages in Python
- Julia and Python
- Converting Python Code to C for speed
- Optimization bake-off
- Writing Parallel Code
- Massively parallel programming with GPUs
- Writing CUDA in C
- Distributed computing for Big Data
- Hadoop MapReduce on AWS EMR with mrjob
- Spark on a local machine using 4 nodes
- Modules and Packaging
- Tour of the Jupyter (IPython3) notebook
- Polyglot programming
- What you should know and learn more about
- Wrapping R libraries with Rpy
Broadcasting, row, column and matrix operations
# operations across rows, cols or entire matrix
print(xs.max())
print(xs.max(axis=0))  # max of each col
print(xs.max(axis=1))  # max of each row
137
[ 95 137 103 105 131 115]
[115 111  85   0 105 119 113 131 137  81]
# A functional rather than object-oriented approach also works
print(np.max(xs, axis=0))
print(np.max(xs, axis=1))
[ 95 137 103 105 131 115]
[115 111  85   0 105 119 113 131 137  81]
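The array `xs` used above is defined earlier in the notebook and is not shown in this section. As a rough sketch of what the `axis` argument does, the example below uses a hypothetical stand-in, `xs_demo`, assumed to be a 10x6 integer array like the one the printed results suggest.

```python
import numpy as np

# Hypothetical stand-in for xs: a 10x6 array of small non-negative integers.
rng = np.random.default_rng(0)
xs_demo = rng.integers(0, 150, size=(10, 6))

col_max = xs_demo.max(axis=0)   # collapses the 10 rows: one value per column
row_max = xs_demo.max(axis=1)   # collapses the 6 columns: one value per row

print(col_max.shape)  # (6,)
print(row_max.shape)  # (10,)

# The method form and the function form agree.
assert np.array_equal(col_max, np.max(xs_demo, axis=0))
# The overall maximum is the maximum of either set of partial maxima.
assert xs_demo.max() == col_max.max() == row_max.max()
```

The axis named in the call is the one that disappears from the result, which is why `axis=0` yields one value per column.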
# broadcasting
xs = np.arange(12).reshape(2,6)
print(xs, '\n')
print(xs * 10, '\n')

# broadcasting just works when doing column-wise operations
col_means = xs.mean(axis=0)
print(col_means, '\n')
print(xs + col_means, '\n')

# but needs a little more work for row-wise operations
row_means = xs.mean(axis=1)[:, np.newaxis]
print(row_means)
print(xs + row_means)
(array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]]), '\n')
(array([[  0,  10,  20,  30,  40,  50],
       [ 60,  70,  80,  90, 100, 110]]), '\n')
(array([ 3.,  4.,  5.,  6.,  7.,  8.]), '\n')
(array([[  3.,   5.,   7.,   9.,  11.,  13.],
       [  9.,  11.,  13.,  15.,  17.,  19.]]), '\n')
[[ 2.5]
 [ 8.5]]
[[  2.5   3.5   4.5   5.5   6.5   7.5]
 [ 14.5  15.5  16.5  17.5  18.5  19.5]]
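Row-wise operations need the extra axis because NumPy compares shapes starting from the trailing dimension. The sketch below walks through the shapes for the same 2x6 array. The names `row_means_col` and `row_means_keep` are introduced here only for illustration, and `keepdims=True` is offered as one possible alternative to `np.newaxis`.

```python
import numpy as np

xs = np.arange(12).reshape(2, 6)

col_means = xs.mean(axis=0)   # shape (6,)
row_means = xs.mean(axis=1)   # shape (2,)

# Shapes are compared from the trailing dimension backwards:
# (2, 6) against (6,) matches, so column means broadcast directly.
print((xs + col_means).shape)  # (2, 6)

# (2, 6) against (2,) does not match (6 != 2), so NumPy raises ValueError.
try:
    xs + row_means
except ValueError as err:
    print("broadcast failed:", err)

# Adding a trailing axis of length 1 gives shape (2, 1), which broadcasts.
row_means_col = row_means[:, np.newaxis]
# keepdims=True produces the same (2, 1) shape in one step.
row_means_keep = xs.mean(axis=1, keepdims=True)

assert row_means_col.shape == row_means_keep.shape == (2, 1)
assert np.array_equal(xs + row_means_col, xs + row_means_keep)
```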
# convert matrix to have zero mean and unit standard deviation using col summary statistics
print((xs - xs.mean(axis=0))/xs.std(axis=0))
[[-1. -1. -1. -1. -1. -1.]
 [ 1.  1.  1.  1.  1.  1.]]
# convert matrix to have zero mean and unit standard deviation using row summary statistics
print((xs - xs.mean(axis=1)[:, np.newaxis])/xs.std(axis=1)[:, np.newaxis])
[[-1.4639 -0.8783 -0.2928  0.2928  0.8783  1.4639]
 [-1.4639 -0.8783 -0.2928  0.2928  0.8783  1.4639]]
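The two standardizations above differ only in the axis used for the summary statistics. One way to express both with a single function is sketched below. The helper `standardize` is hypothetical (it is not part of NumPy or of the original notebook), and it uses the population standard deviation (`ddof=0`), which is what `xs.std` computes by default.

```python
import numpy as np

def standardize(x, axis):
    """Hypothetical helper: center and scale x along the given axis."""
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, keepdims=True)  # population std (ddof=0)
    return (x - mean) / std

xs = np.arange(12).reshape(2, 6)

z_cols = standardize(xs, axis=0)  # uses column means and standard deviations
z_rows = standardize(xs, axis=1)  # uses row means and standard deviations

# After standardizing, each column (or row) has mean 0 and std 1.
assert np.allclose(z_cols.mean(axis=0), 0) and np.allclose(z_cols.std(axis=0), 1)
assert np.allclose(z_rows.mean(axis=1), 0) and np.allclose(z_rows.std(axis=1), 1)
```

A column or row with zero variance would cause a division by zero here, so real data may need a guard for that case.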
# broadcasting for outer product
# e.g. create the 12x12 multiplication table
u = np.arange(1, 13)
u[:,None] * u[None,:]
array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12],
       [  2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22,  24],
       [  3,   6,   9,  12,  15,  18,  21,  24,  27,  30,  33,  36],
       [  4,   8,  12,  16,  20,  24,  28,  32,  36,  40,  44,  48],
       [  5,  10,  15,  20,  25,  30,  35,  40,  45,  50,  55,  60],
       [  6,  12,  18,  24,  30,  36,  42,  48,  54,  60,  66,  72],
       [  7,  14,  21,  28,  35,  42,  49,  56,  63,  70,  77,  84],
       [  8,  16,  24,  32,  40,  48,  56,  64,  72,  80,  88,  96],
       [  9,  18,  27,  36,  45,  54,  63,  72,  81,  90,  99, 108],
       [ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100, 110, 120],
       [ 11,  22,  33,  44,  55,  66,  77,  88,  99, 110, 121, 132],
       [ 12,  24,  36,  48,  60,  72,  84,  96, 108, 120, 132, 144]])
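The multiplication table works because the two indexed views have shapes (12, 1) and (1, 12), which broadcast to (12, 12). As a quick check, the sketch below compares the broadcast result with NumPy's dedicated outer-product functions. The variable name `table` is introduced here only for illustration.

```python
import numpy as np

u = np.arange(1, 13)

# u[:, None] has shape (12, 1) and u[None, :] has shape (1, 12);
# broadcasting stretches both to (12, 12) before multiplying.
table = u[:, None] * u[None, :]

assert table.shape == (12, 12)
# Same result as the dedicated outer-product functions.
assert np.array_equal(table, np.outer(u, u))
assert np.array_equal(table, np.multiply.outer(u, u))
# Entry [i, j] holds u[i] * u[j]; u[6] is 7 and u[7] is 8.
assert table[6, 7] == 7 * 8
```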
Calculate the pairwise distance matrix between the following points
- (0,0)
- (4,0)
- (4,3)
- (0,3)
def distance_matrix_py(pts):
    """Returns matrix of pairwise Euclidean distances. Pure Python version."""
    n = len(pts)
    p = len(pts[0])
    m = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            s = 0
            for k in range(p):
                s += (pts[i,k] - pts[j,k])**2
            m[i, j] = s**0.5
    return m
def distance_matrix_np(pts):
    """Returns matrix of pairwise Euclidean distances. Vectorized numpy version."""
    return np.sum((pts[None,:] - pts[:, None])**2, -1)**0.5
pts = np.array([(0,0), (4,0), (4,3), (0,3)])
distance_matrix_py(pts)
array([[ 0.,  4.,  5.,  3.],
       [ 4.,  0.,  3.,  5.],
       [ 5.,  3.,  0.,  4.],
       [ 3.,  5.,  4.,  0.]])
distance_matrix_np(pts)
array([[ 0.,  4.,  5.,  3.],
       [ 4.,  0.,  3.,  5.],
       [ 5.,  3.,  0.,  4.],
       [ 3.,  5.,  4.,  0.]])
# Broadcasting and vectorization are faster than looping
%timeit distance_matrix_py(pts)
%timeit distance_matrix_np(pts)
1000 loops, best of 3: 203 µs per loop
10000 loops, best of 3: 29.4 µs per loop
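The vectorized version packs several broadcasting steps into one expression. The sketch below unpacks it shape by shape for the same four points. The intermediate names `a`, `b`, `diff` and `dist` are introduced here only for illustration.

```python
import numpy as np

pts = np.array([(0, 0), (4, 0), (4, 3), (0, 3)])

a = pts[None, :]   # shape (1, 4, 2)
b = pts[:, None]   # shape (4, 1, 2)
diff = a - b       # shape (4, 4, 2): diff[i, j] = pts[j] - pts[i]

# Sum the squared coordinate differences over the last axis, then take the root.
dist = np.sqrt((diff ** 2).sum(axis=-1))

assert diff.shape == (4, 4, 2)
assert np.allclose(dist, dist.T)        # distances are symmetric
assert np.allclose(np.diag(dist), 0)    # each point is at distance 0 from itself
# Equivalent formulation using the vector norm along the last axis.
assert np.allclose(dist, np.linalg.norm(diff, axis=-1))
```

The intermediate `diff` array holds n x n x p values, so for large point sets the speed gain comes at the cost of extra memory.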