Plot a histogram such that bar heights sum to 1 (probability)
I'd like to plot a normalized histogram from a vector using matplotlib. I tried the following:
plt.hist(myarray, normed=True)
as well as:
plt.hist(myarray, normed=1)
but neither option produces a y-axis from [0, 1] such that the bar heights of the histogram sum to 1.
If you want the sum of all bars to equal unity, weight each bin by the total number of values.
Note for Python 2.x: cast one of the operands of the division to float(), as otherwise you would end up with zeros due to integer division.
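
The weighting described above can be sketched as follows. This is a minimal illustration rather than the answerer's original code, which is not preserved here; the sample data, the bin count, and the variable names other than myarray are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data (assumption): any 1-D array of values works here.
rng = np.random.default_rng(0)
myarray = rng.normal(size=1000)

# One weight per sample, each equal to 1/N, so the weighted counts sum to 1.
# np.ones_like on a float array yields floats, which also avoids the
# Python 2 integer-division pitfall mentioned in the note.
weights = np.ones_like(myarray) / len(myarray)

fig, ax = plt.subplots()
heights, edges, _ = ax.hist(myarray, bins=20, weights=weights)
ax.set_ylabel("fraction of samples")

# The bar heights now add up to 1 (up to floating-point rounding).
print(heights.sum())
```

Because every sample contributes 1/N to its bin, each bar height is the fraction of samples that fall in that bin.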
It would be more helpful if you posted a more complete working (or in this case non-working) example.
I tried this myself, and it does indeed produce a bar-chart histogram with a y-axis that goes from [0, 1].
Further, as per the hist documentation (i.e. ax.hist? from ipython), I think the sum is fine too. Checking the integral over the bars after the commands above, I get a return value of 1.0 as expected.
Remember that normed=True doesn't mean that the sum of the values at each bar will be unity, but rather that the integral over the bars is unity. In my case np.sum(n) returned approx 7.2767.
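
The difference between "integral is unity" and "heights sum to unity" can be checked with a short sketch like the one below. It is not the answerer's original code: the data and bin count are assumptions, and it uses density=True, the keyword that replaced normed in newer matplotlib releases.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data (assumption).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)

fig, ax = plt.subplots()
# density=True normalizes so that the area under the bars is 1.
n, bins, patches = ax.hist(x, bins=50, density=True)

# Integral over the bars: height times width, summed over all bins.
area = np.sum(n * np.diff(bins))

# Plain sum of the heights: for equal-width bins this is 1 / bin_width,
# so it is generally not 1.
height_sum = np.sum(n)

print(area, height_sum)
```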
I know this answer is late considering the question dates from 2010, but I came across it while facing a similar problem myself. As already stated in the answer, normed=True means that the total area under the histogram is equal to 1, but the sum of the heights is not equal to 1. However, for convenience of physical interpretation, I wanted a histogram whose heights sum to 1.
I found a hint in the following question: Python: Histogram with area normalized to something other than 1
But I was not able to find a way of making the bars mimic the histtype="step" feature of hist(). This diverted me to: Matplotlib - Stepped histogram with already binned data
If the community finds it acceptable, I should like to put forth a solution which synthesises ideas from both of the above posts.
This has worked wonderfully for me, though in some cases I have noticed that the leftmost or rightmost "bar" of the histogram does not close down by touching the lowest point of the y-axis. In such a case, adding an element 0 at the beginning or the end of y achieved the necessary result.
Just thought I'd share my experience. Thank you.
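
A step-style outline whose heights sum to 1 could be sketched as below. This is one possible reading of the approach described above, not the answerer's original code; the data, bin count, and variable names are assumptions. It pads y with a zero at each end so the outline closes on the axis, as suggested in the answer.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=1000)

# Bin the data, then divide by the total count so the heights sum to 1.
counts, edges = np.histogram(X, bins=20)
frac = counts / counts.sum()

# Repeat every edge and every height twice so a plain line plot traces
# the step outline, and pad y with zeros so both ends drop to the axis.
x = np.repeat(edges, 2)
y = np.concatenate(([0.0], np.repeat(frac, 2), [0.0]))

fig, ax = plt.subplots()
ax.plot(x, y, linestyle="dashed", label="MyLabel")
ax.legend()
```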
Here is another simple solution using the np.histogram() method. You can check that the bar heights do indeed sum to 1.
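
A minimal sketch of the np.histogram() route, assuming equal-width bins and placeholder data (the original code for this answer is not preserved here): compute a density, multiply by the bin width to get per-bin probabilities, and draw them with a bar plot.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data (assumption).
rng = np.random.default_rng(0)
myarray = rng.random(100)

# density=True gives heights whose integral is 1; multiplying by the
# (constant) bin width turns them into probabilities per bin.
density, edges = np.histogram(myarray, bins=10, density=True)
bin_width = edges[1] - edges[0]
probs = density * bin_width

fig, ax = plt.subplots()
# align="edge" places each bar's left side on its left bin edge.
ax.bar(edges[:-1], probs, width=bin_width, align="edge")

# Check that the probabilities add up to 1 (up to rounding).
print(probs.sum())
```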
Use seaborn.histplot, or seaborn.displot with kind='hist', and specify stat='probability'.
data: pandas.DataFrame, numpy.ndarray, mapping, or sequence
seaborn is a high-level API for matplotlib
Tested in python 3.8.12, matplotlib 3.4.3, seaborn 0.11.2
Imports and Data
sns.histplot
sns.displot
As of matplotlib 3.0.2, normed=True is deprecated. To get the desired output I had to use a workaround. Trying to specify weights and density simultaneously as arguments to plt.hist() did not work for me. If anyone knows of a way to get that working without having access to the normed keyword argument, then please let me know in the comments and I will delete/modify this answer.
If you want bin centres then don't use bins[:-1], which are the bin edges; you need to choose a suitable scheme for how to calculate the centres (which may or may not be trivially derived).
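
One way to get bars that sum to 1 without normed, sketched under assumptions (placeholder data, 50 equal-width bins on [-3, 3]; this is not the answer's original code): pre-bin with np.histogram, then pass the left edges to hist() with the normalized counts as weights. The last line shows the midpoint scheme for bin centres, which is only one possible choice.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data and bin edges (assumptions).
rng = np.random.default_rng(0)
data = rng.normal(size=1000)
bins = np.linspace(-3.0, 3.0, 51)

# Count per bin, then normalize so the counts sum to 1.
# Values outside [-3, 3] are ignored by np.histogram.
counts, _ = np.histogram(data, bins=bins)
probs = counts / counts.sum()

fig, ax = plt.subplots()
# Each left edge falls into its own bin, so weighting it by that bin's
# probability reproduces the normalized histogram.
heights, _, _ = ax.hist(bins[:-1], bins=bins, weights=probs)

# Midpoints of consecutive edges, for equal-width (or any) bins.
centres = 0.5 * (bins[:-1] + bins[1:])
```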