Python: histogram with a log x-axis does not match the histogram of log-transformed values
<img src="https://i.sstatic.net/m7kjc.png" alt="linear axis but logged values">
Below is a minimal example of something that puzzled me.
First we will create a random sample of values that follow a log-normal distribution, along with the associated probability density function, and then plot a histogram with a log x-axis.
import numpy as np
import matplotlib.pyplot as plt

mu, sigma = 0.5, 0.25  # mean and standard deviation of the underlying normal distribution
s = np.random.lognormal(mu, sigma, 10000)

# Histogram of the raw samples, normalized to a density
count, bins, ignored = plt.hist(s, 100, density=True, align='mid')

# Log-normal PDF evaluated across the range of the bins
x = np.linspace(min(bins), max(bins), 10000)
pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))
       / (x * sigma * np.sqrt(2 * np.pi)))
plt.plot(x, pdf, linewidth=2, color='r')
plt.xscale('log')
plt.axis('tight')
plt.show()
Now we will log the values of the distribution and plot them without a log x-axis:
plt.hist(np.log(s), 100, density=True, align='mid')
plt.plot(np.log(x), pdf, linewidth=2, color='r')
plt.axis('tight')
plt.show()
So my question is: why does the PDF match the histogram bin values when I use a log axis, but not when I log the values and then plot the histogram?
Comments (1)
You need to correct for your log transform; to fix this, I'd just change the line of code that plots the PDF.
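The intuition behind this is that log-transforming the data causes the bins to get "wider" as x increases: bins of equal width in log(x) span ever larger ranges of x. Specifically, you need to look at the derivative of your transform. Since d(log x)/dx = 1/x, the density of log(x) is the original PDF multiplied by x.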
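The snippet from the answer is not reproduced here, so the following is only a minimal sketch of one way to apply the correction, assuming the standard change-of-variables rule: if Y = log(X), then f_Y(y) = f_X(e^y) * e^y. The helper names `lognormal_pdf`, `log_space_pdf` and `plot_logged_histogram` are made up for illustration and are not from the original post.

```python
import numpy as np

mu, sigma = 0.5, 0.25  # parameters of the underlying normal distribution


def lognormal_pdf(x, mu, sigma):
    """Density of X ~ LogNormal(mu, sigma), same formula as in the question."""
    return (np.exp(-(np.log(x) - mu) ** 2 / (2 * sigma ** 2))
            / (x * sigma * np.sqrt(2 * np.pi)))


def log_space_pdf(y, mu, sigma):
    """Density of Y = log(X): f_X(e^y) times the Jacobian dx/dy = e^y."""
    x = np.exp(y)
    return lognormal_pdf(x, mu, sigma) * x


def plot_logged_histogram(n=10000, bins=100, seed=0):
    """Histogram of log(s) on a linear axis with the corrected curve on top."""
    # Imported here so the density functions above need only NumPy.
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(seed)
    log_s = np.log(rng.lognormal(mu, sigma, n))
    y = np.linspace(log_s.min(), log_s.max(), 1000)
    plt.hist(log_s, bins, density=True, align='mid')
    plt.plot(y, log_space_pdf(y, mu, sigma), linewidth=2, color='r')
    plt.axis('tight')
    plt.show()
```

Calling `plot_logged_histogram()` draws the second plot with the curve following the bars. Under the same assumption, the equivalent one-line change in the question's code would be `plt.plot(np.log(x), pdf * x, linewidth=2, color='r')`. Note that the corrected density is just the normal density with mean `mu` and standard deviation `sigma`, which is what the logarithm of a log-normal variable should follow.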