Example

Published 2025-02-25 23:43:38

From http://scipy-lectures.github.io/intro/numpy/exercises.html#data-statistics

The data in populations.txt describes the populations of hares and lynxes (and carrots) in northern Canada over 21 years (1900 to 1920):

Compute and print, based on the data in populations.txt...

The mean and std of the populations of each species for the years in the period.
The year in which each species had its largest population.
Which species has the largest population for each year. (Hint: argsort & fancy indexing of np.array(['H', 'L', 'C']))
The years in which any of the populations is above 50000. (Hint: comparisons and np.any)
The 2 years in which each species had its lowest populations. (Hint: argsort, fancy indexing)
Compare (plot) the change in hare population (see help(np.gradient)) with the number of lynxes, and check their correlation (see help(np.corrcoef)).

... all without for-loops.

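# imports used by the cells below
import os
import numpy as np
import matplotlib.pyplot as plt

# download the data locally (lines starting with "!" are IPython/Jupyter shell escapes)
if not os.path.exists('populations.txt'):
    ! wget http://scipy-lectures.github.io/_downloads/populations.txt

# peek at the file to see its structure
! head -n 6 populations.txt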

# year      hare    lynx    carrot
1900        30e3    4e3     48300
1901        47.2e3  6.1e3   48200
1902        70.2e3  9.8e3   41500
1903        77.4e3  35.2e3  38200
1904        36.3e3  59.4e3  40600

# load data into a numpy array
data = np.loadtxt('populations.txt').astype('int')
data[:5, :]

array([[ 1900, 30000,  4000, 48300],
       [ 1901, 47200,  6100, 48200],
       [ 1902, 70200,  9800, 41500],
       [ 1903, 77400, 35200, 38200],
       [ 1904, 36300, 59400, 40600]])
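
To try the loading step without downloading anything, here is a minimal sketch that feeds the six lines shown by head to np.loadtxt from an in-memory string. It assumes Python 3; the names sample_text and sample are made up for this illustration, and the array holds only the first five years, not the full data set.

```python
import io
import numpy as np

# The header and the five rows printed by `head` above, pasted in as text.
sample_text = """# year      hare    lynx    carrot
1900        30e3    4e3     48300
1901        47.2e3  6.1e3   48200
1902        70.2e3  9.8e3   41500
1903        77.4e3  35.2e3  38200
1904        36.3e3  59.4e3  40600
"""

# np.loadtxt treats lines starting with '#' as comments by default
# and parses values such as 47.2e3 as floats before the integer cast.
sample = np.loadtxt(io.StringIO(sample_text)).astype(int)

print(sample.shape)
print(sample[0])
```

The header line is skipped because '#' is the default comment character of np.loadtxt, which is why the file can be loaded without any extra arguments.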

# provide convenient named variables
populations = data[:, 1:]
year, hare, lynx, carrot = data.T

# The mean and std of the populations of each species for the years in the period
print("Mean (hare, lynx, carrot):", populations.mean(axis=0))
print("Std (hare, lynx, carrot):", populations.std(axis=0))

Mean (hare, lynx, carrot): [ 34080.9524  20166.6667  42400.    ]
Std (hare, lynx, carrot): [ 20897.9065  16254.5915   3322.5062]
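
To make the role of axis=0 and the default behaviour of std concrete, here is a small sketch on the five rows printed earlier (1900 to 1904 only, so the numbers differ from the full-data results above). It assumes Python 3, and the name sample_pops is hypothetical.

```python
import numpy as np

# Hare, lynx, carrot columns for 1900-1904, copied from the rows printed above.
sample_pops = np.array([[30000,  4000, 48300],
                        [47200,  6100, 48200],
                        [70200,  9800, 41500],
                        [77400, 35200, 38200],
                        [36300, 59400, 40600]])

# axis=0 reduces over rows (years), leaving one value per column (species).
means = sample_pops.mean(axis=0)

# std() divides by N by default (ddof=0); ddof=1 divides by N - 1 instead.
std_population = sample_pops.std(axis=0)
std_sample = sample_pops.std(axis=0, ddof=1)

print("Mean:", means)
print("Std (ddof=0):", std_population)
print("Std (ddof=1):", std_sample)
```

The Std line above uses the default ddof=0; whether that or ddof=1 is the better choice depends on whether the 21 years are treated as the whole population or as a sample.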

# Which year each species had the largest population.
print("Year with largest population (hare, lynx, carrot)",
      year[np.argmax(populations, axis=0)])

Year with largest population (hare, lynx, carrot) [1903 1904 1900]

# Which species has the largest population for each year.
species = ['hare', 'lynx', 'carrot']
list(zip(year, np.take(species, np.argmax(populations, axis=1))))

[(1900, 'carrot'),
 (1901, 'carrot'),
 (1902, 'hare'),
 (1903, 'hare'),
 (1904, 'lynx'),
 (1905, 'lynx'),
 (1906, 'carrot'),
 (1907, 'carrot'),
 (1908, 'carrot'),
 (1909, 'carrot'),
 (1910, 'carrot'),
 (1911, 'carrot'),
 (1912, 'hare'),
 (1913, 'hare'),
 (1914, 'hare'),
 (1915, 'lynx'),
 (1916, 'carrot'),
 (1917, 'carrot'),
 (1918, 'carrot'),
 (1919, 'carrot'),
 (1920, 'carrot')]
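
The exercise hint suggests argsort with fancy indexing of a label array rather than np.take. A minimal sketch of that variant on the five rows printed earlier, assuming Python 3 (sample_pops, labels, and winners are names invented for this illustration):

```python
import numpy as np

# Hare, lynx, carrot columns for 1900-1904, copied from the rows printed above.
sample_pops = np.array([[30000,  4000, 48300],
                        [47200,  6100, 48200],
                        [70200,  9800, 41500],
                        [77400, 35200, 38200],
                        [36300, 59400, 40600]])
labels = np.array(['H', 'L', 'C'])

# argsort along axis=1 orders the species within each year;
# the last column holds the index of the largest population.
largest_idx = np.argsort(sample_pops, axis=1)[:, -1]

# Indexing the label array with an integer array picks one label per year.
winners = labels[largest_idx]
print(winners)
```

When there are no ties, np.argmax(sample_pops, axis=1) yields the same indices with less work, which is presumably why the solution above uses it.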

# Which years any of the populations is above 50000
print(year[np.any(populations > 50000, axis=1)])

[1902 1903 1904 1912 1913 1914 1915]
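
The two steps inside that one-liner, the element-wise comparison and the row-wise np.any, can be seen separately in this sketch on the five rows printed earlier. It assumes Python 3; sample_years, sample_pops, and the mask names are hypothetical.

```python
import numpy as np

# Years and hare, lynx, carrot columns for 1900-1904, from the rows printed above.
sample_years = np.array([1900, 1901, 1902, 1903, 1904])
sample_pops = np.array([[30000,  4000, 48300],
                        [47200,  6100, 48200],
                        [70200,  9800, 41500],
                        [77400, 35200, 38200],
                        [36300, 59400, 40600]])

# Step 1: the comparison gives a boolean array with the same shape as the data.
above = sample_pops > 50000

# Step 2: np.any along axis=1 is True for a year if any species exceeds the threshold.
year_mask = np.any(above, axis=1)

# Step 3: a boolean mask selects the matching years.
print(sample_years[year_mask])
```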

# The top 2 years for each species when they had the lowest populations.
print(year[np.argsort(populations, axis=0)[:2]])

[[1917 1900 1916]
 [1916 1901 1903]]
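
The result is a 2 x 3 array: each column is a species (hare, lynx, carrot) and each row is a rank (lowest, then second lowest). A small sketch on the five rows printed earlier shows how that layout arises; it assumes Python 3, and the names are made up for this illustration.

```python
import numpy as np

# Years and hare, lynx, carrot columns for 1900-1904, from the rows printed above.
sample_years = np.array([1900, 1901, 1902, 1903, 1904])
sample_pops = np.array([[30000,  4000, 48300],
                        [47200,  6100, 48200],
                        [70200,  9800, 41500],
                        [77400, 35200, 38200],
                        [36300, 59400, 40600]])

# argsort along axis=0 returns, per species column, the row indices
# ordered from smallest to largest population.
order = np.argsort(sample_pops, axis=0)

# The first two rows of `order` are the two smallest entries per species;
# indexing the year array with them maps row indices to years.
lowest_two = sample_years[order[:2]]
print(lowest_two)
```

Within this five-year sample, the hare column reads 1900 then 1904, the lynx column 1900 then 1901, and the carrot column 1903 then 1904.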

# Compare the change in hare population with the number of lynxes, and check their correlation.
plt.plot(year, lynx, 'r-', year, np.gradient(hare), 'b--')
plt.legend(['lynx', 'grad(hare)'], loc='best')
print(np.corrcoef(lynx, np.gradient(hare)))

[[ 1.     -0.9179]
 [-0.9179  1.    ]]
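
To see what np.gradient contributes here, this sketch applies it to the five hare values printed earlier and then correlates the result with the matching lynx values. It assumes Python 3 and unit spacing between years (the default); hare_sample, lynx_sample, and hare_change are hypothetical names, and the correlation on five points is only illustrative, not the full-data figure above.

```python
import numpy as np

# Hare and lynx populations for 1900-1904, copied from the rows printed above.
hare_sample = np.array([30000, 47200, 70200, 77400, 36300])
lynx_sample = np.array([4000, 6100, 9800, 35200, 59400])

# np.gradient uses central differences in the interior, (f[i+1] - f[i-1]) / 2,
# and one-sided differences at the two ends, so the output keeps the input length.
hare_change = np.gradient(hare_sample)
print(hare_change)

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is the
# Pearson correlation between the two series.
r = np.corrcoef(lynx_sample, hare_change)[0, 1]
print(r)
```

A strongly negative off-diagonal value, as in the full-data result, means the hare population tends to be falling when lynx numbers are high.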
