如何在 Python 中绘制概率密度函数 (PDF) 图？

发布于 2025-01-10 18:05:06 字数 945 浏览 6 评论 0原文

我想问一下如何用Python绘制概率密度函数（PDF）图。

这是我的代码。

import numpy as np
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
import scipy.stats as stats

。

x = np.random.normal(50, 3, 1000)
source = {"Genotype": ["CV1"]*1000, "AGW": x}
df=pd.DataFrame(source)
df

我生成了一个数据框。然后，我尝试绘制 PDF 图表。

df["AGW"].sort_values()
df_mean = np.mean(df["AGW"])
df_std = np.std(df["AGW"])
pdf = stats.norm.pdf(df["AGW"], df_mean, df_std)

plt.plot(df["AGW"], pdf)

我获得了上图。我做错了什么？您能否告诉我如何绘制概率密度函数（PDF）图，也称为正态分布图。

你能让我知道我需要使用哪些代码（或库）来绘制PDF图表吗？

总是非常感谢！

原文

I'd like to ask how to draw the Probability Density Function (PDF) plot in Python.

This is my codes.

import numpy as np
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
import scipy.stats as stats

x = np.random.normal(50, 3, 1000)
source = {"Genotype": ["CV1"]*1000, "AGW": x}
df=pd.DataFrame(source)
df

I generated a data frame. Then, I tried to draw a PDF graph.

df["AGW"].sort_values()
df_mean = np.mean(df["AGW"])
df_std = np.std(df["AGW"])
pdf = stats.norm.pdf(df["AGW"], df_mean, df_std)

plt.plot(df["AGW"], pdf)

I obtained above graph. What I did wrong? Could you let me how to draw the Probability Density Function (PDF) Plot which is also known as normal distribution graph.

Could you let me know which codes (or library) I need to use to draw the PDF graph?

Always many thanks!!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

灯角 2025-01-17 18:05:06

您只需要对值进行排序（而不是真正检查编辑之后的内容）

pdf = stats.norm.pdf(df["AGW"].sort_values(), df_mean, df_std)

plt.plot(df["AGW"].sort_values(), pdf)

，它就会起作用。

df["AGW"].sort_values() 行不会更改 df。也许您的意思是df.sort_values(by=['AGW'], inplace=True)。
在这种情况下，完整的代码将是：

import numpy as np
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
import scipy.stats as stats

x = np.random.normal(50, 3, 1000)
source = {"Genotype": ["CV1"]*1000, "AGW": x}
df=pd.DataFrame(source)

df.sort_values(by=['AGW'], inplace=True)
df_mean = np.mean(df["AGW"])
df_std = np.std(df["AGW"])
pdf = stats.norm.pdf(df["AGW"], df_mean, df_std)

plt.plot(df["AGW"], pdf)

给出：

编辑：

我认为这里我们已经有了分布（x 是正态分布），所以我们不需要生成 x 的 pdf。由于 pdf 的用途是这样的：

mu = 50
variance = 3
sigma = math.sqrt(variance)
x = np.linspace(mu - 5*sigma, mu + 5*sigma, 1000)
plt.plot(x, stats.norm.pdf(x, mu, sigma))
plt.show()

这里我们不需要从 x 点生成分布，我们只需要绘制我们已有的分布的密度。
所以你可以使用这个：

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
 
x = np.random.normal(50, 3, 1000)  #Generating Data
source = {"Genotype": ["CV1"]*1000, "AGW": x}
df=pd.DataFrame(source) #Converting to pandas DataFrame
df.plot(kind = 'density'); # or df["AGW"].plot(kind = 'density');

这给出：

如果需要，您可以使用其他软件包，例如seaborn：

import seaborn as sns
plt.figure(figsize = (5,5))
sns.kdeplot(df["AGW"] , bw = 0.5 , fill = True)
plt.show()

或者这样：

import seaborn as sns
sns.set_style("whitegrid")  # Setting style(Optional)
plt.figure(figsize = (10,5)) #Specify the size of figure
sns.distplot(x = df["AGW"]   ,  bins = 10 , kde = True , color = 'teal'
            , kde_kws=dict(linewidth = 4 , color = 'black')) #kde for normal distribution
plt.show()

检查此文章了解更多信息。

You just need to sort the values (not really check what's after edit)

pdf = stats.norm.pdf(df["AGW"].sort_values(), df_mean, df_std)

plt.plot(df["AGW"].sort_values(), pdf)

And it will work.

The line df["AGW"].sort_values() doesn't change df. Maybe you meant df.sort_values(by=['AGW'], inplace=True).
In that case the full code will be :

import numpy as np
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
import scipy.stats as stats

x = np.random.normal(50, 3, 1000)
source = {"Genotype": ["CV1"]*1000, "AGW": x}
df=pd.DataFrame(source)

df.sort_values(by=['AGW'], inplace=True)
df_mean = np.mean(df["AGW"])
df_std = np.std(df["AGW"])
pdf = stats.norm.pdf(df["AGW"], df_mean, df_std)

plt.plot(df["AGW"], pdf)

Which gives :

Edit :

I think here we already have the distribution (x is normally distributed) so we dont need to generate the pdf of x. As the use of the pdf is for something like this :

mu = 50
variance = 3
sigma = math.sqrt(variance)
x = np.linspace(mu - 5*sigma, mu + 5*sigma, 1000)
plt.plot(x, stats.norm.pdf(x, mu, sigma))
plt.show()

Here we dont need to generate the distribution from x points, we only need to plot the density of the distribution we already have .
So you might use this :

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
 
x = np.random.normal(50, 3, 1000)  #Generating Data
source = {"Genotype": ["CV1"]*1000, "AGW": x}
df=pd.DataFrame(source) #Converting to pandas DataFrame
df.plot(kind = 'density'); # or df["AGW"].plot(kind = 'density');

Which gives :

You might use other packages if you want, like seaborn :

import seaborn as sns
plt.figure(figsize = (5,5))
sns.kdeplot(df["AGW"] , bw = 0.5 , fill = True)
plt.show()

Or this :

import seaborn as sns
sns.set_style("whitegrid")  # Setting style(Optional)
plt.figure(figsize = (10,5)) #Specify the size of figure
sns.distplot(x = df["AGW"]   ,  bins = 10 , kde = True , color = 'teal'
            , kde_kws=dict(linewidth = 4 , color = 'black')) #kde for normal distribution
plt.show()

Check this article for more.

回复收藏 0 原文

~没有更多了~