How to create a density plot

Posted 2024-10-01 20:32:36

In R I can create the desired output by doing:

data = c(rep(1.5, 7), rep(2.5, 2), rep(3.5, 8),
         rep(4.5, 3), rep(5.5, 1), rep(6.5, 8))
plot(density(data, bw=0.5))

[Figure: Density plot in R]

In python (with matplotlib) the closest I got was with a simple histogram:

import matplotlib.pyplot as plt
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
plt.hist(data, bins=6)
plt.show()

[Figure: Histogram in matplotlib]

I also tried the normed=True parameter (density=True in current matplotlib), but couldn't get any further than trying to fit a Gaussian to the histogram.

My latest attempts were around scipy.stats and gaussian_kde, following examples on the web, but I've been unsuccessful so far.


命比纸薄 2024-10-08 20:32:36

Five years later, when I Google "how to create a kernel density plot using python", this thread still shows up at the top!

Today, a much easier way to do this is to use seaborn, a package that provides many convenient plotting functions and good style management.

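import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
sns.set_style('whitegrid')
# seaborn 0.11 and later: bw_method replaces the deprecated bw argument
sns.kdeplot(x=np.array(data), bw_method=0.5)
plt.show()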


以为你会在 2024-10-08 20:32:36

Sven has shown how to use the gaussian_kde class from SciPy, but you will notice that the result doesn't look quite like what you generated with R. This is because gaussian_kde tries to infer the bandwidth automatically. You can adjust the bandwidth by replacing the covariance_factor function of the gaussian_kde instance. First, here is what you get without changing that function:

[Figure: density plot with the automatically inferred bandwidth]

However, if I use the following code:

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
density = gaussian_kde(data)
xs = np.linspace(0,8,200)
density.covariance_factor = lambda : .25
density._compute_covariance()
plt.plot(xs,density(xs))
plt.show()

I get this plot (https://i.sstatic.net/kPXVJ.png), which is pretty close to what you are getting from R. What have I done? gaussian_kde uses a replaceable function, covariance_factor, to calculate its bandwidth. Before changing the function, the value returned by covariance_factor for this data was about .5. Lowering it lowered the bandwidth. I had to call _compute_covariance after changing that function so that all of the factors would be recalculated correctly. It isn't an exact correspondence with the bw parameter from R, but hopefully it helps you get in the right direction.
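For a closer match to R's bw argument without calling a private method, here is a minimal sketch. It assumes a SciPy version whose gaussian_kde accepts a scalar bw_method; the helper name kde_with_r_bandwidth is hypothetical. For one-dimensional data, SciPy's Gaussian kernel has a standard deviation equal to the factor times the sample standard deviation (ddof=1), while R's bw is the kernel standard deviation itself, so dividing one by the other gives the factor.

```python
import numpy as np
from scipy.stats import gaussian_kde

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8


def kde_with_r_bandwidth(values, bw):
    """Hypothetical helper: Gaussian KDE whose kernel standard deviation is bw."""
    values = np.asarray(values, dtype=float)
    # SciPy multiplies the sample standard deviation (ddof=1) by a factor,
    # so convert the absolute bandwidth into that factor.
    factor = bw / values.std(ddof=1)
    return gaussian_kde(values, bw_method=factor)


density = kde_with_r_bandwidth(data, bw=0.5)
xs = np.linspace(0, 8, 200)
ys = density(xs)  # density values evaluated on the grid
```

Passing xs and ys to plt.plot, as in the code above, draws the curve.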

橙味迷妹 2024-10-08 20:32:36

Option 1:

Use pandas dataframe plot (built on top of matplotlib):

import pandas as pd
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
pd.DataFrame(data).plot(kind='density') # or pd.Series()


Option 2:

Use kdeplot of seaborn (distplot(data, hist=False) was the older way to draw this curve, but distplot has been deprecated since seaborn 0.11):

import seaborn as sns
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
sns.kdeplot(x=data)

呆头 2024-10-08 20:32:36

Maybe try something like:

import matplotlib.pyplot as plt
import numpy
from scipy import stats
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
density = stats.gaussian_kde(data)  # gaussian_kde is exposed directly in scipy.stats
x = numpy.arange(0., 8, .1)
plt.plot(x, density(x))
plt.show()

You can easily replace gaussian_kde() with a different kernel density estimator.
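As an illustration of swapping in a different estimator, here is a minimal hand-written sketch in NumPy. The function names (gaussian_kernel, epanechnikov_kernel, kernel_density) are hypothetical and not part of SciPy. It evaluates the textbook fixed-bandwidth formula directly, with the kernel passed in as an argument.

```python
import numpy as np

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8


def gaussian_kernel(u):
    """Standard normal density."""
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)


def epanechnikov_kernel(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)


def kernel_density(values, grid, bandwidth, kernel=gaussian_kernel):
    """Evaluate (1 / (n * h)) * sum_i K((x - x_i) / h) at every grid point x."""
    values = np.asarray(values, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Pairwise scaled distances, shape (len(grid), len(values)).
    u = (grid[:, None] - values[None, :]) / bandwidth
    return kernel(u).sum(axis=1) / (len(values) * bandwidth)


xs = np.linspace(0, 8, 801)
gauss = kernel_density(data, xs, bandwidth=0.5)
epan = kernel_density(data, xs, bandwidth=0.5, kernel=epanechnikov_kernel)
```

With the Gaussian kernel, bandwidth plays the role of R's bw (the kernel standard deviation); for other kernels it is simply the scale h in the formula.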

一个人练习一个人 2024-10-08 20:32:36

A density plot can also be created with matplotlib alone. The function plt.hist(data) returns the bin heights and the bin edges (see the documentation https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.hist.html), and with density=True the heights are normalized so that the histogram integrates to 1. The following code therefore plots the normalized heights as a line, which gives a binned density estimate rather than a smooth kernel density estimate:

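import matplotlib.pyplot as plt
dat=[-1,2,1,4,-5,3,6,1,2,1,2,5,6,5,6,2,2,2]
heights, edges, _ = plt.hist(dat, density=True)  # normalized bin heights and bin edges
plt.close()  # discard the histogram figure itself
centers = (edges[:-1] + edges[1:]) / 2  # plot each height at the middle of its bin
plt.figure()
plt.plot(centers, heights)
plt.show()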

This code produces the density plot as a line.
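If only the numbers are needed, the throwaway figure can be avoided. This is a minimal sketch, assuming NumPy is available; bins=10 is written out explicitly to match the default bin count of plt.hist, and the variable names are illustrative.

```python
import numpy as np

dat = [-1, 2, 1, 4, -5, 3, 6, 1, 2, 1, 2, 5, 6, 5, 6, 2, 2, 2]

# density=True rescales the counts so that the bar areas sum to 1.
heights, edges = np.histogram(dat, bins=10, density=True)
centers = (edges[:-1] + edges[1:]) / 2  # one x value per bin
widths = np.diff(edges)                 # bin widths, useful for checking the area
```

The pair centers, heights can then be handed to plt.plot in place of the values returned by plt.hist.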

感情废物 2024-10-08 20:32:36

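You can do something like the following, which draws a normalized histogram of normally distributed samples and overlays the exact normal density in red:

import numpy as np
import matplotlib.pyplot as plt
s = np.random.normal(2, 3, 1000)  # 1000 samples with mean 2 and standard deviation 3
count, bins, ignored = plt.hist(s, 30, density=True)
plt.plot(bins, 1/(3 * np.sqrt(2 * np.pi)) * np.exp( - (bins - 2)**2 / (2 * 3**2) ),
         linewidth=2, color='r')
plt.show()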
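When the mean and standard deviation are not known in advance, they can be estimated from the samples instead of being hard-coded. This is a minimal sketch, assuming SciPy is available and that a normal distribution is a reasonable model for the data; the seed is arbitrary and only there for reproducibility.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
s = rng.normal(2, 3, 1000)

# Maximum-likelihood estimates of the mean and standard deviation.
mu, sigma = norm.fit(s)

xs = np.linspace(s.min(), s.max(), 200)
fitted = norm.pdf(xs, loc=mu, scale=sigma)  # curve to overlay on the histogram
```

Plotting xs against fitted on top of plt.hist(s, 30, density=True) gives the same kind of picture as the code above, with the parameters taken from the data.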