How to create a density plot
In R I can create the desired output by doing:
data = c(rep(1.5, 7), rep(2.5, 2), rep(3.5, 8),
         rep(4.5, 3), rep(5.5, 1), rep(6.5, 8))
plot(density(data, bw=0.5))
In Python (with matplotlib) the closest I got was a simple histogram:
import matplotlib.pyplot as plt
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
plt.hist(data, bins=6)
plt.show()
I also tried the normed=True parameter, but couldn't get anything other than trying to fit a Gaussian to the histogram.
My latest attempts were around scipy.stats and gaussian_kde, following examples on the web, but I've been unsuccessful so far.
Five years later, when I Google "how to create a kernel density plot using python", this thread still shows up at the top!
Today, a much easier way to do this is to use seaborn, a package that provides many convenient plotting functions and good style management.
Sven has shown how to use the class gaussian_kde from SciPy, but you will notice that it doesn't look quite like what you generated with R. This is because gaussian_kde tries to infer the bandwidth automatically. You can adjust the bandwidth by changing the function covariance_factor of the gaussian_kde class. First, here is what you get without changing that function:
However, if I use the following code:
I get
[Image: https://i.sstatic.net/kPXVJ.png]
which is pretty close to what you are getting from R. What have I done? gaussian_kde uses a changeable function, covariance_factor, to calculate its bandwidth. Before changing the function, the value returned by covariance_factor for this data was about 0.5. Lowering this lowered the bandwidth. I had to call _compute_covariance after changing that function so that all of the factors would be calculated correctly. It isn't an exact correspondence with the bw parameter from R, but hopefully it helps you get in the right direction.
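The code from this answer did not survive in this copy, so here is a hedged sketch of the same idea. Instead of overriding covariance_factor and calling the private _compute_covariance, it uses the public set_bandwidth method, which as far as I can tell has the same effect. The factor 0.25 is an assumption (the answer only says the value was lowered from about 0.5). The last part is my own addition: since SciPy's kernel width is the factor times the sample standard deviation, dividing 0.5 by that standard deviation should give a kernel of standard deviation 0.5, which is how R interprets bw.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8

kde = gaussian_kde(data)
# Default factor from Scott's rule: n ** (-1/5) for 1-D data, about 0.51 here.
default_factor = kde.covariance_factor()

# Assumed value, chosen only for illustration: anything below the default
# narrows the kernels and makes the curve less smooth.
kde.set_bandwidth(bw_method=0.25)

# Attempt to mimic R's bw=0.5: kernel sd = factor * sample sd (ddof=1).
r_like = gaussian_kde(data, bw_method=0.5 / np.std(data, ddof=1))

xs = np.linspace(0, 8, 200)
fig, ax = plt.subplots()
ax.plot(xs, kde(xs), label="factor = 0.25")
ax.plot(xs, r_like(xs), label="kernel sd = 0.5")
ax.legend()

# plt.show()  # uncomment to display the figure interactively
```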
Option 1:
Use pandas dataframe plot (built on top of matplotlib):
Option 2:
Use distplot of seaborn:
Maybe try something like:
You can easily replace gaussian_kde() by a different kernel density estimate.
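The snippet this answer refers to is not preserved, so this is a minimal sketch of plain gaussian_kde usage with its automatic bandwidth. The grid from 0 to 8 is my own choice, picked to cover the data with some margin.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8

# The bandwidth is chosen automatically (Scott's rule by default).
density = gaussian_kde(data)

# Evaluate the estimated density on a grid and draw it as a line.
xs = np.linspace(0, 8, 200)
fig, ax = plt.subplots()
ax.plot(xs, density(xs))

# plt.show()  # uncomment to display the figure interactively
```

With the default bandwidth the curve comes out smoother than the R plot from the question, which is what the bandwidth discussion in the other answer addresses.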
The density plot can also be created by using matplotlib:
The function plt.hist(data) returns the y and x values necessary for the density plot (see the documentation https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.hist.html).
As a result, the following code creates a density plot by using the matplotlib library:
This code returns the following density plot
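The code and image for this answer are missing here, so the following is only a sketch of the approach described. Plotting at bin centers is my own choice and the original may have used the bin edges directly. Note that this draws a line through normalized histogram heights, so it is not a smoothed kernel estimate.

```python
import numpy as np
import matplotlib.pyplot as plt

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8

# hist returns (heights, bin_edges, patches); density=True normalizes the
# heights so that the bar areas sum to 1.
fig_hist, ax_hist = plt.subplots()
heights, edges, _ = ax_hist.hist(data, bins=6, density=True)
plt.close(fig_hist)  # the histogram figure itself is not needed

# Draw the heights as a line at the bin centers.
centers = 0.5 * (edges[:-1] + edges[1:])
fig, ax = plt.subplots()
ax.plot(centers, heights)

# plt.show()  # uncomment to display the figure interactively
```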
You can do something like:
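Whatever code followed here is not preserved, so this is one possibility and not necessarily what the answer's author posted. It computes a Gaussian kernel density estimate by hand with NumPy, treating the bandwidth as the kernel's standard deviation, which is how R's bw is defined. The helper name gaussian_kde_1d is hypothetical, and the 0 to 8 grid assumes R's default of extending three bandwidths beyond the data range.

```python
import numpy as np
import matplotlib.pyplot as plt


def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian kernels (sd = bandwidth) centered on each sample."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized distance from every grid point to every sample.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)


data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8

# Grid from min - 3*bw to max + 3*bw, i.e. 0 to 8 for this data.
xs = np.linspace(0, 8, 400)
ys = gaussian_kde_1d(data, xs, bandwidth=0.5)

fig, ax = plt.subplots()
ax.plot(xs, ys)

# plt.show()  # uncomment to display the figure interactively
```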