How does numpy.histogram() work?

Posted on 2025-01-02 01:18:25

While reading up on numpy, I encountered the function numpy.histogram().

What is it for and how does it work? In the docs they mention bins: What are they?

Some googling led me to the definition of Histograms in general. I get that. But unfortunately I can't link this knowledge to the examples given in the docs.

Comments (3)

三五鸿雁 2025-01-09 01:18:25

A bin is a range that represents the width of a single bar of the histogram along the X-axis. You could also call this the interval. (Wikipedia defines them more formally as "disjoint categories".)

The NumPy histogram function doesn't draw the histogram; it counts how many input values fall within each bin, which in turn determines the area of each bar (not necessarily the height, if the bins aren't of equal width).

In this example:

 np.histogram([1, 2, 1], bins=[0, 1, 2, 3])

There are 3 bins, for values ranging from 0 to 1 (excl. 1), 1 to 2 (excl. 2) and 2 to 3 (incl. 3), respectively. NumPy defines these bins by a list of delimiters ([0, 1, 2, 3]) in this example. It also returns the bin edges in the results, since it can choose them automatically from the input if none are specified. If bins=5, for example, it will use 5 bins of equal width spread between the minimum input value and the maximum input value.

The input values are 1, 2 and 1. Therefore, bin "0 to 1" is empty, bin "1 to 2" contains two occurrences (the two 1 values), and bin "2 to 3" contains one occurrence (the 2). These results are in the first item in the returned tuple: array([0, 2, 1]).
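To check the edge rules described above, here is a minimal sketch (the variable names are illustrative, not taken from the docs). It puts one value on each edge of the same bins, then lets NumPy choose five equal-width bins on its own:

```python
import numpy as np

# One value on each edge: 3 sits on the rightmost edge, so it lands in the last bin.
counts, edges = np.histogram([0, 1, 2, 3], bins=[0, 1, 2, 3])
print(counts)  # [1 1 2]

# With an integer, NumPy spreads that many equal-width bins between min and max.
auto_counts, auto_edges = np.histogram([1, 2, 1], bins=5)
print(auto_counts)  # [2 0 0 0 1]
print(auto_edges)   # six edges running from 1.0 to 2.0 in steps of 0.2
```

Both 2 and 3 end up in the last bin of the first call, because only that bin includes its right edge.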

Since the bins here are of equal width, you can use the number of occurrences for the height of each bar. When drawn, you would have:

  • a bar of height 0 for range/bin [0,1) on the X-axis,
  • a bar of height 2 for range/bin [1,2),
  • a bar of height 1 for range/bin [2,3].

You can plot this directly with Matplotlib (its hist function also returns the bins and the values):

>>> import matplotlib.pyplot as plt
>>> plt.hist([1, 2, 1], bins=[0, 1, 2, 3])
(array([0, 2, 1]), array([0, 1, 2, 3]), <a list of 3 Patch objects>)
>>> plt.show()
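The remark about area versus height only matters once the bins have unequal widths. As a rough illustration, assuming a hypothetical pair of bins [0, 1) and [1, 3] that is not part of the original example, the density=True option of np.histogram rescales the counts so that each bar's area, not its height, reflects the share of the data:

```python
import numpy as np

data = [1, 2, 1]
edges = [0, 1, 3]  # hypothetical unequal bins: width 1, then width 2

counts, _ = np.histogram(data, bins=edges)
density, _ = np.histogram(data, bins=edges, density=True)
widths = np.diff(edges)  # width of each bin

print(counts)            # [0 3]
print(density)           # [0.  0.5]
print(density * widths)  # [0. 1.]  bar areas, which sum to 1
```

Here all three values fall in the wide bin, so its height is 0.5 while its area is 1.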

世界和平 2025-01-09 01:18:25

import numpy as np
hist, bin_edges = np.histogram([1, 1, 2, 2, 2, 2, 3], bins=range(5))

Below, hist indicates that there are 0 items in bin #0, 2 in bin #1, 4 in bin #2, and 1 in bin #3.

print(hist)
# [0 2 4 1]

bin_edges indicates that bin #0 is the interval [0,1), bin #1 is [1,2), ...,
bin #3 is [3,4] (the last bin includes the rightmost edge).

print(bin_edges)
# [0 1 2 3 4]

Play with the above code, change the input to np.histogram and see how it works.
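One way to convince yourself of the interval rules is to redo the counting by hand. The loop below is only a sketch of the idea, not how NumPy implements it internally; the names manual, lo, hi and in_bin are made up for this illustration:

```python
import numpy as np

data = np.array([1, 1, 2, 2, 2, 2, 3])
bin_edges = np.arange(5)  # same edges as range(5): 0, 1, 2, 3, 4

manual = []
last = len(bin_edges) - 2  # index of the final bin
for i in range(len(bin_edges) - 1):
    lo, hi = bin_edges[i], bin_edges[i + 1]
    if i == last:
        in_bin = (data >= lo) & (data <= hi)  # last bin is closed on the right
    else:
        in_bin = (data >= lo) & (data < hi)   # other bins are half-open
    manual.append(int(in_bin.sum()))

print(manual)  # [0, 2, 4, 1]
```

The result matches the hist array returned by np.histogram for the same input and edges.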


But a picture is worth a thousand words:

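import matplotlib.pyplot as plt
# align='edge' makes each bar start at its left bin edge instead of being centered on it
plt.bar(bin_edges[:-1], hist, width=1, align='edge')
plt.xlim(min(bin_edges), max(bin_edges))
plt.show()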

虐人心 2025-01-09 01:18:25

Another useful thing to do with numpy.histogram is to plot the output as the x and y coordinates on a line graph. For example:

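import numpy as np
import matplotlib.pyplot as plt

arr = np.random.randint(1, 51, 500)  # 500 integers from 1 to 50
# Edges 1..51 give each integer its own bin; np.arange(51) would merge 49 and 50 in the last bin
y, x = np.histogram(arr, bins=np.arange(1, 52))
fig, ax = plt.subplots()
ax.plot(x[:-1], y)
plt.show()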


This can be a useful way to visualize a histogram when you want finer granularity without bars everywhere. It is very useful with image histograms for identifying extreme pixel values.
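For the image case, a rough sketch might look like the following. It assumes an 8-bit grayscale image and uses random data as a stand-in, since no real image is given here; the names image and centers are illustrative:

```python
import numpy as np

# Synthetic stand-in for an 8-bit grayscale image (an assumption for illustration).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# 256 bins of width 1 over the range 0 to 256: one bin per possible pixel value.
counts, edges = np.histogram(image, bins=256, range=(0, 256))
centers = (edges[:-1] + edges[1:]) / 2  # x coordinates for a line plot

# Counts at the two ends of the range: pure-black and pure-white pixels.
print(counts[0], counts[-1])
```

Passing centers and counts to ax.plot would then draw the line version of the histogram, with spikes at either end indicating clipped pixel values.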
