How to normalize a histogram in MATLAB?
How do I normalize a histogram so that the area under the probability density function equals 1?

My answer to this is the same as my answer to your earlier question. For a probability density function, the integral over the entire space is 1. Dividing by the sum does not give the correct density; to get the right density, you must divide by the area. To illustrate the point, compare both normalizations against the true density. You can see for yourself which method agrees with the correct answer (the red curve).

Another method, more straightforward than method 2 (dividing by the area computed with `trapz`), is to divide the histogram by `sum(f * dx)`, which expresses the integral of the probability density function. Here `f` holds the bin counts and `dx` is the bin width.
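
The comparison described in this answer can be sketched as follows. This is a minimal illustration, not the answer's original code: the sample size, the seed, the bin count, and all variable names other than `f`, `x`, `g`, and `dx` are assumptions.

```matlab
% Sketch only: data, seed, and helper names are assumptions for illustration.
rng(0);                                  % fixed seed so the run is repeatable
data = randn(10000, 1);                  % samples from a standard normal
[f, x] = hist(data, 50);                 % f: bin counts, x: bin centers
dx = x(2) - x(1);                        % hist uses equally spaced bins
g = exp(-0.5 * x.^2) / sqrt(2 * pi);     % true standard normal density at x

pSum   = f / sum(f);                     % method 1: divide by the sum of counts
pTrapz = f / trapz(x, f);                % method 2: divide by the trapezoidal area
pArea  = f / sum(f * dx);                % method 3: divide by the rectangle area

normalized = {pSum, pTrapz, pArea};
names = {'f / sum(f)', 'f / trapz(x, f)', 'f / sum(f * dx)'};
figure;
for k = 1:3
    subplot(1, 3, k);
    bar(x, normalized{k});
    hold on;
    plot(x, g, 'r');                     % reference density (red curve)
    hold off;
    title(names{k});
end
```

With these settings the bars from methods 2 and 3 should follow the red curve, while the bars from method 1 sit lower by a factor of roughly `dx`, because their heights (not their areas) sum to 1.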

Since R2014b, MATLAB has had these normalization routines built into the `histogram` function (see the help file for the 6 routines this function offers). One example is the PDF normalization (the bin areas sum to 1), which can be plotted together with the corresponding PDF. An improvement that might very well be due to the success of the actual question and accepted answer!

EDIT: The use of `hist` and `histc` is not recommended now, and `histogram` should be used instead. Beware that none of the 6 ways of creating bins with this new function will produce the bins that `hist` and `histc` produce. There is a MATLAB script to update former code to fit the way `histogram` is called (bin edges instead of bin centers). By doing so, one can compare the pdf normalization methods of @abcd (`trapz` and `sum`) and MATLAB (`pdf`).

The 3 pdf normalization methods give nearly identical results (within the range of `eps`).

TEST: The maximum difference between the new PDF normalization and the former one is 5.5511e-17.
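
A minimal sketch of the built-in PDF normalization, checked against a manual normalization on the same bin edges. The data, seed, and variable names are assumptions for illustration, and this is not the test code from the answer, so the printed difference need not match the figure quoted above.

```matlab
% Sketch only: data, seed, and variable names are assumptions for illustration.
rng(0);
data = randn(10000, 1);

figure;
h = histogram(data, 'Normalization', 'pdf');     % bar heights are density estimates
hold on;
xs = linspace(min(data), max(data), 200);
plot(xs, exp(-0.5 * xs.^2) / sqrt(2 * pi), 'r', 'LineWidth', 1.5);  % true density
hold off;

% Manual normalization on the bin edges chosen by histogram
edges  = h.BinEdges;
widths = diff(edges);
counts = histcounts(data, edges);                % raw counts per bin
manualPdf = counts ./ (sum(counts) * widths);    % count / (total * bin width)

maxDiff = max(abs(manualPdf - h.Values));        % expected to be at rounding level
areaPdf = sum(h.Values .* widths);               % area under the bars
fprintf('max difference: %g, area: %.4f\n', maxDiff, areaPdf);
```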

`hist` can not only plot a histogram but also return the count of elements in each bin, so you can get that count, normalize it by dividing each bin by the total, and plot the result using `bar`. This can be done step by step or as a one-liner. See the MATLAB documentation for `hist` and `bar`.

Edit: This solution answers the question "How to have the sum of all bins equal to 1". This approximation is valid only if your bin size is small relative to the variance of your data. The sum used here corresponds to a simple quadrature formula; more complex ones can be used, like `trapz` as proposed by R. M.
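
A short sketch of the approach described here, assuming uniformly distributed example data (`Y` and the other variable names are illustrative, not taken from the answer):

```matlab
% Sketch only: the example data and variable names are assumptions.
rng(0);
Y = rand(1000, 1);                    % example data
[counts, centers] = hist(Y);          % 10 equally spaced bins by default
p = counts / sum(counts);             % fraction of samples in each bin

figure;
bar(centers, p);                      % bar heights now sum to 1

% The same idea as a one-liner (bins are numbered 1..10 on the x axis)
figure;
bar(hist(Y) ./ sum(hist(Y)));

% The heights sum to 1, but the area under the bars equals the bin width
dx = centers(2) - centers(1);
areaUnderBars = sum(p) * dx;
```

Note that `areaUnderBars` equals `dx` rather than 1, which is why this normalization gives bin probabilities rather than a density.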
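
The area of each individual bar is height*width. Since MATLAB chooses equidistant points for the bars, the width is `dx = x(2) - x(1)`, where `x` holds the bin centers returned by `[f, x] = hist(data)` and `f` holds the counts. Summing over all the individual bars, the total area comes out as `sum(f) * dx`. So the correctly scaled plot is obtained by dividing the counts by that area, that is, by plotting `f / (sum(f) * dx)` against `x`.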
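
The scaling by bar area can be sketched as below. The example data (a normal distribution with mean 2 and standard deviation 3), the seed, and the bin count are assumptions chosen only to show that the result does not depend on a standard normal input.

```matlab
% Sketch only: data, seed, and bin count are assumptions for illustration.
rng(1);
data = 2 + 3 * randn(5000, 1);        % example data
[f, x] = hist(data, 40);              % f: counts, x: equally spaced bin centers

dx = x(2) - x(1);                     % width of each bar
A = sum(f) * dx;                      % total area of the unscaled bars
density = f / A;                      % rescaled heights

figure;
bar(x, density, 1);                   % relative width 1 so adjacent bars touch

totalArea = sum(density) * dx;        % area under the rescaled bars
fprintf('area under the scaled histogram: %.4f\n', totalArea);
```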

The area of abcd's PDF is not one, which is impossible, as pointed out in many comments.

Assumptions made in many answers here:
The probability under the `pdf` should be 1. The normalization should be done as `Normalization` with `probability`, not as `Normalization` with `pdf`, in histogram() and hist().

Fig. 1: output of the hist() approach. Fig. 2: output of the histogram() approach.

The maximum amplitude differs between the two approaches, which suggests that there is a mistake in the hist() approach, because the histogram() approach uses the standard normalization. I assume the mistake in the hist() approach here is that it normalizes partially as `pdf`, not completely as `probability`.

Code with hist() [deprecated]
Some remarks:
`sum(f)/N` gives `1` if `Nbins` is set manually.
The pdf requires the width of the bin (`dx`) in the graph `g`.
Output is in Fig. 1.

Code with histogram()
Some remarks:
a) `sum(f)` is `1` if `Nbins` is adjusted with histogram()'s Normalization as probability; b) `sum(f)/N` is 1 if `Nbins` is set manually without normalization.
The pdf requires the width of the bin (`dx`) in the graph `g`.
Output is in Fig. 2, and the expected output is met: area 1.0000.

Matlab: 2016a
System: Linux Ubuntu 16.04 64 bit
Linux kernel 4.6
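
For reference, the two `Normalization` options discussed in this answer make different quantities equal to 1. The sketch below uses `histcounts`, which accepts the same option as `histogram()`; the data, seed, and bin count are assumptions, and this is not the code from the answer.

```matlab
% Sketch only: data, seed, and bin count are assumptions for illustration.
rng(2);
data = randn(10000, 1);
nbins = 50;

[pProb, edges] = histcounts(data, nbins, 'Normalization', 'probability');
pPdf = histcounts(data, edges, 'Normalization', 'pdf');   % same bins, pdf scaling
dx = diff(edges);                      % bin widths

sumProb  = sum(pProb);                 % 'probability': the heights sum to 1
areaProb = sum(pProb .* dx);           % their area is about one bin width, not 1
areaPdf  = sum(pPdf .* dx);            % 'pdf': the area under the bars is 1

fprintf('sum(prob) = %.4f, area(prob) = %.4f, area(pdf) = %.4f\n', ...
        sumProb, areaProb, areaPdf);
```

The two results are related by `pPdf = pProb ./ dx`, so a curve `g` drawn over the bars has to be scaled by the bin width when the `probability` option is used.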

For some distributions (Cauchy, I think), I have found that `trapz` will overestimate the area, so the pdf changes depending on the number of bins you select. In that case I use a different normalization instead.
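
One way to avoid `trapz` for heavy-tailed data is to divide by the total count times the mean bin width. The sketch below is an assumption about how that could look, not the answer's lost code; the Cauchy samples, the seed, and the bin count are illustrative.

```matlab
% Sketch only: sampling method, seed, and bin count are assumptions.
rng(3);
q = tan(pi * (rand(20000, 1) - 0.5));  % standard Cauchy samples via the inverse CDF
[N, h] = hist(q, 30000);               % very wide range, so most bins are empty

dx = mean(diff(h));                    % bin width (bins are equally spaced)
areaSum   = sum(N) * dx;               % rectangle area: every bin has full weight
areaTrapz = trapz(h, N);               % trapezoids: the two end bins get half weight

pdfSum = N / areaSum;                  % density estimate with unit rectangle area
endBinTerm = dx * (N(1) + N(end)) / 2; % the part of the area that trapz leaves out

figure;
plot(h, pdfSum, '+r');
xlim([-10 10]);                        % zoom in on the bulk of the distribution
```

For equally spaced bins the two area estimates differ by `endBinTerm`, so the choice between them matters most when the end bins are wide, as they are for data with extreme outliers.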

There is an excellent three-part guide to histogram adjustments in MATLAB (the original link is broken; an archive.org copy exists). The first part is on histogram stretching.